URL Rewriting Tutorial

Why should you consider using URL rewriting?

You have built a website from HTML pages, CSS stylesheets, and powerful scripts. Everything looks great except the ugly URLs that link to your scripts.

Consider this URL for example:

http://mysite/catalog.php?category=computers&prodid=18

Wouldn't it be better to have it written as:

http://mysite/catalog/computers/18

The second URL is clearly more logical and easier to read. It also hides the internal details of your script: visitors cannot tell which scripting language you are using, and a casual attacker is less tempted to tamper with the script arguments in an attempt to break it.

The second URL is also search engine friendly: it can be indexed without any problems since it contains no ? and no query string. The first URL, on the contrary, may not be indexed correctly: some search engines would only index http://mysite/catalog.php, and some would ignore it altogether on the grounds that a dynamically generated page is not worth indexing.

Fortunately, the URL rewriting feature introduced in Abyss Web Server 2.4 can help you have such clean URLs. This feature has many other applications, but this tutorial focuses only on basic URL rewriting.

The goal

Our goal is to make the server understand that a URL of the form

http://mysite/catalog/computers/XY

is equivalent to

http://mysite/catalog.php?category=computers&prodid=XY

where X and Y are digits (0, 1, ..., or 9).

How to do so?

  • Open Abyss Web Server's console. In the Hosts table, press Configure in the row corresponding to the host to which you want to add the URL rewriting rule.
  • Select URL Rewriting
  • Press Add in the URL Rewriting Rules table
  • Enter /catalog/computers/([0-9][0-9]) in Virtual Path Regular Expression
  • Set If this rule matches to Perform an internal redirection
  • Set Redirect to to /catalog.php?category=computers&prodid=$1
  • Leave all the other options unchanged: their default values are fine for this example
  • Press OK
  • Press Restart to apply the modifications

Explanations

Congratulations, you have declared your first URL rewriting rule. Here is how it works: whenever Abyss Web Server receives a request, it tests the request's virtual path against the URL rewriting rules.

When a visitor types http://mysite/catalog/computers/18 into their browser, Abyss Web Server receives a request for the virtual path /catalog/computers/18.

The server then tests this virtual path against the URL rewriting rule that we have declared. Since the virtual path matches the declared Virtual Path Regular Expression, the rule applies.

/catalog/computers/18 matches /catalog/computers/([0-9][0-9]). This regular expression contains a single group enclosed between ( and ), and the text it matches is available through the backreference $1. When matched against our example's virtual path, the value of $1 is 18.

This backreference will be used to build the redirection virtual path. Remember that Redirect to is set to:

/catalog.php?category=computers&prodid=$1

$1 will be substituted with its value and the server will perform an internal redirection to:

/catalog.php?category=computers&prodid=18

Note that the redirection in this case is internal, which means that it is transparent to the visitor and occurs only inside the server. The URL displayed in the visitor's browser remains http://mysite/catalog/computers/18.
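The matching and substitution steps described in this section can be modelled outside the server. The following is only a minimal Python sketch, not Abyss Web Server's actual implementation: the function name rewrite is hypothetical, and the sketch assumes that the rule must match the whole virtual path, as the tutorial implies.

```python
import re

# The two values entered in the URL rewriting dialog.
RULE_PATTERN = re.compile(r"/catalog/computers/([0-9][0-9])")
REDIRECT_TO = "/catalog.php?category=computers&prodid=$1"


def rewrite(virtual_path):
    """Return the internal redirection target, or None if the rule does not match.

    Hypothetical helper: it only imitates the behavior described in the tutorial.
    """
    # Assumption: the expression has to match the entire virtual path.
    match = RULE_PATTERN.fullmatch(virtual_path)
    if match is None:
        return None
    # Substitute the backreference $1 with the text captured by the group.
    return REDIRECT_TO.replace("$1", match.group(1))


print(rewrite("/catalog/computers/18"))
# prints /catalog.php?category=computers&prodid=18
```

The visitor never sees the returned path: in the server, it is only used internally to locate and run the script.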

Enhancing the URL rewriting rule

If a visitor browses to http://mysite/catalog/computers/3 or http://mysite/catalog/computers/2312, they will receive a 404 Not Found error. These URLs are not caught by our URL rewriting rule because their virtual paths do not match the regular expression we defined: it only matches paths that start with /catalog/computers/ and are followed by exactly two digits.

We could of course add several variations of the URL rewriting rule: one with a single digit, one with 3 digits, one with 4, and so on. But that is not practical. The best solution is to rewrite the regular expression and broaden its scope. The new regular expression will be:

/catalog/computers/([0-9]+)

The + character means "one or more of the preceding element", so the expression matches any path that starts with /catalog/computers/ and is followed by one or more digits.
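To see the difference between the two expressions, they can be compared side by side. This is a small Python sketch rather than anything the server runs: the helper name captured is hypothetical, and whole-path matching is assumed as in the rest of the tutorial.

```python
import re

two_digits = re.compile(r"/catalog/computers/([0-9][0-9])")
one_or_more = re.compile(r"/catalog/computers/([0-9]+)")


def captured(pattern, virtual_path):
    """Return the value of $1 if the whole path matches, otherwise None."""
    match = pattern.fullmatch(virtual_path)
    return match.group(1) if match else None


for path in ("/catalog/computers/3",
             "/catalog/computers/18",
             "/catalog/computers/2312"):
    # Show what each expression captures for the same virtual path.
    print(path, captured(two_digits, path), captured(one_or_more, path))
```

Only the path ending in 18 is matched by both expressions. The paths ending in 3 and 2312 are matched by the broadened expression alone, and a path with no digits at all after the final / is matched by neither.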

Enhancing the rule even further

This URL rewriting feature is interesting and useful, but what if we want to extend it to other kinds of items in the catalog and have URLs such as http://mysite/catalog/keyboards/61 and http://mysite/catalog/harddisks/54?

Having a URL rewriting rule for each kind of item is again not the best solution. Instead, we extend the Virtual Path Regular Expression once more:

/catalog/(.+)/([0-9]+)

The dot . is a special character that matches any character, so .+ matches any sequence of 1 or more characters. The new regular expression therefore matches any path starting with /catalog/ followed by one or more characters, followed by /, followed by 1 or more digits.

With this new regular expression we have 2 backreferences: $1 will contain the match of (.+) and $2 will contain the match of ([0-9]+).

Redirect to should also be updated to reflect the changes in the backreferences:

/catalog.php?category=$1&prodid=$2
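The generalized rule can be tried out in the same way. Again, this is a hedged Python sketch of the behavior described above, not the server's code: the function name rewrite_catalog is hypothetical, and whole-path matching is assumed.

```python
import re

PATTERN = re.compile(r"/catalog/(.+)/([0-9]+)")
REDIRECT_TO = "/catalog.php?category=$1&prodid=$2"


def rewrite_catalog(virtual_path):
    """Return the redirection target for a catalog path, or None if there is no match."""
    match = PATTERN.fullmatch(virtual_path)
    if match is None:
        return None
    # Replace every $N in the template with the text captured by group N,
    # in a single pass so that captured text is never substituted again.
    return re.sub(r"\$([0-9])",
                  lambda ref: match.group(int(ref.group(1))),
                  REDIRECT_TO)


print(rewrite_catalog("/catalog/keyboards/61"))
print(rewrite_catalog("/catalog/harddisks/54"))
print(rewrite_catalog("/catalog/a/b/12"))
```

The last call shows a consequence of using the dot: since . also matches /, the path /catalog/a/b/12 is accepted and $1 becomes a/b. If category names should never contain a slash, a narrower group such as ([^/]+) is one possible alternative.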

Conclusion

This example shows the basics of URL rewriting. You can achieve much more by learning the syntax of regular expressions (see the Regular Expressions Syntax Basics article and the detailed PCRE reference) and by exploring the numerous configuration options available in the URL rewriting dialog.
