Saturday, October 30, 2021

[SOLVED] What is the meaning of ^ and $ in Apache HTTPD RewriteRule?

Issue

I have successfully added the following code to my Apache HTTPD configuration:

# Force www.
RewriteCond %{HTTP_HOST} !^www\.
RewriteRule ^(.*)$ https://www.%{HTTP_HOST}/$1 [R=301,L]
# Force https (SSL)
RewriteCond %{HTTPS} off
RewriteRule (.*) https://%{HTTP_HOST}%{REQUEST_URI} [R=301,L]

Although it works as expected, I have a theoretical question:

Why are there ^ and $ in the 3rd line (enforcing "www.") but not in the 6th line (enforcing "https")?

Sincerely, Dovid.


Solution

Both of your regex patterns, ^(.*)$ and (.*), behave the same here: .* is greedy and already matches the whole string, so the anchors add nothing. However, you don't need either of them. It is less error-prone to skip .* altogether and use the %{REQUEST_URI} variable, which always holds the full URL path (whereas what .* captures can be a relative path, e.g. in .htaccess). So I suggest changing your rules to this:

# Force www.
RewriteCond %{HTTP_HOST} !^www\. [NC]
RewriteRule ^ https://www.%{HTTP_HOST}%{REQUEST_URI} [R=301,L,NE]

# Force https (SSL)
RewriteCond %{HTTPS} off
RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [R=301,L,NE]
  • The NE (noescape) flag stops mod_rewrite from escaping the redirect URL. It is useful when your original URI has special characters such as # or (, ), [, ], etc.
  • ^ in the RewriteRule patterns above consumes nothing and matches every request: ^ means the start position of the string, and every string has one.
  • Both rules can be combined into a single rule, but it will look a bit more complicated.
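
The claims about ^(.*)$, (.*) and a bare ^ can be checked outside Apache. The following is only a sketch: it uses Python's re module as a stand-in for the PCRE engine that mod_rewrite uses, and the sample paths are made up for illustration.

```python
import re

# Hypothetical sample paths, roughly as a RewriteRule pattern would see them
# in .htaccess (per-directory) context, i.e. without the leading slash.
paths = ["", "index.html", "blog/2021/post.html", "a b/c(1).html"]

anchored = re.compile(r"^(.*)$")
unanchored = re.compile(r"(.*)")
caret_only = re.compile(r"^")


def first_capture(pattern, path):
    """Return capture group 1 (what $1 would hold) for an unanchored search."""
    return pattern.search(path).group(1)


# Pair up what each pattern captures for every sample path.
results = [(first_capture(anchored, p), first_capture(unanchored, p)) for p in paths]

# A bare ^ should find a match in every string, including the empty one.
caret_matches = [caret_only.search(p) is not None for p in paths]

for path, (with_anchors, without_anchors) in zip(paths, results):
    print(repr(path), "->", repr(with_anchors), repr(without_anchors))
```

For each sample path both patterns capture the full string, and ^ matches every one of them, which is why ^ is enough as a "match anything" pattern when the substitution takes the path from %{REQUEST_URI} instead of $1.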

Here is the combined rule:

RewriteCond %{HTTP_HOST} !^www\. [NC,OR]
RewriteCond %{HTTPS} !on
RewriteCond %{HTTP_HOST} ^(?:www\.)?(.+)$ [NC]
RewriteRule ^ https://www.%1%{REQUEST_URI} [R=301,L,NE]

Here is the explanation of this rule:

  • RewriteCond %{HTTP_HOST} !^www\. [NC,OR]: matches if HTTP_HOST doesn't start with www.
  • [NC,OR]: NC makes the match case-insensitive; OR joins this condition to the next one with a logical OR (instead of the default AND).
  • RewriteCond %{HTTPS} !on: matches if HTTPS is not turned on.
  • RewriteCond %{HTTP_HOST} ^(?:www\.)?(.+)$ [NC]: This condition always matches, since www. is optional here. It is used to capture the part of HTTP_HOST after any leading www. with the (.+) pattern in capture group #1 (to be back-referenced as %1 later). Note that (?:..) is a non-capturing group.
  • RewriteRule ^ https://www.%1%{REQUEST_URI} [R=301,L,NE]: ^ always matches. This rule redirects to https://www.%1%{REQUEST_URI} with a 301 status code, adding https:// and www. to %1. %1 is the back-reference to capture group #1 from the last RewriteCond, as mentioned above.
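
To make the logic of the combined rule easier to follow, here is a rough model of it in Python. This is a sketch rather than how mod_rewrite is implemented: redirect_target is a hypothetical helper, its parameters stand in for %{HTTP_HOST}, %{HTTPS} and %{REQUEST_URI}, and details such as a port in the Host header are ignored.

```python
import re

# Mirrors: RewriteCond %{HTTP_HOST} !^www\. [NC]
WWW_RE = re.compile(r"^www\.", re.IGNORECASE)
# Mirrors: RewriteCond %{HTTP_HOST} ^(?:www\.)?(.+)$ [NC]
HOST_RE = re.compile(r"^(?:www\.)?(.+)$", re.IGNORECASE)


def redirect_target(host, https_on, request_uri):
    """Return the URL the combined rule would redirect to, or None.

    Hypothetical helper: host, https_on and request_uri stand in for
    %{HTTP_HOST}, %{HTTPS} and %{REQUEST_URI}.
    """
    needs_www = WWW_RE.search(host) is None   # first condition
    needs_https = not https_on                # second condition, joined by OR
    if not (needs_www or needs_https):
        return None  # conditions fail, so the rule is skipped
    bare_host = HOST_RE.search(host).group(1)  # plays the role of %1
    return f"https://www.{bare_host}{request_uri}"


print(redirect_target("example.com", False, "/docs/page.html"))
print(redirect_target("www.example.com", False, "/"))
print(redirect_target("example.com", True, "/x"))
print(redirect_target("WWW.Example.com", True, "/x"))
```

The first three calls each yield an https://www.example.com/... URL, while the last one returns None because the host already starts with www. (case-insensitively) and HTTPS is on, so no redirect is needed.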


Answered By - anubhava