Phillip Pearson - web + electronics notes

tech notes and web hackery from a new zealander who was vaguely useful on the web back in 2002 (see: python community server, the blogging ecosystem, the new zealand coffee review, the internet topic exchange).

2008-6-9

Discovery: you *can* break out of a set of RewriteRules

I'm not sure if this is a new thing in Apache 2.0, or if it's been there all along, as I've never seen anyone use this technique, and I've been looking for something like it for years.

Normally mod_rewrite runs through the set of rules, and if none of them match, it passes the URI on to Apache. If any of them match, it applies the changes, then runs through the rules again. It keeps going until it runs through the rules without any matches. This is often exactly what you don't want. For example, let's say you want dispatcher.php to handle all URIs:

RewriteRule (.*) dispatcher.php/$1

This rewrites foo/bar to dispatcher.php/foo/bar, which then gets rewritten again to dispatcher.php/dispatcher.php/foo/bar, ad nauseam. The solution here is to add a RewriteCond line preventing anything starting with dispatcher.php/ from being rewritten:

RewriteCond %{REQUEST_URI} !^/dispatcher\.php/
RewriteRule (.*) dispatcher.php/$1
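
To see why the unguarded rule runs away while the guarded one settles after a single rewrite, here's a toy model in Python. It is only a sketch of the looping behaviour described above, not how mod_rewrite is actually implemented; the function name run_rules and the cap of 10 passes are made up for the illustration.

```python
import re

MAX_PASSES = 10  # safety cap for this toy model only


def run_rules(path, guarded):
    """Apply the single dispatcher rule repeatedly and record each result."""
    history = [path]
    for _ in range(MAX_PASSES):
        # Models the RewriteCond guard: skip anything already dispatched.
        if guarded and path.startswith("dispatcher.php/"):
            break  # the rule is skipped, so this pass changes nothing
        # Models: RewriteRule (.*) dispatcher.php/$1
        match = re.match(r"(.*)", path)
        path = "dispatcher.php/" + match.group(1)
        history.append(path)
    return history


guarded_history = run_rules("foo/bar", guarded=True)
unguarded_history = run_rules("foo/bar", guarded=False)

for step in guarded_history:
    print("guarded:  ", step)
for step in unguarded_history[:3]:
    print("unguarded:", step)
```

With the guard, the history stops after one rewrite. Without it, every pass prepends another dispatcher.php/ until the cap is hit.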

This is fine if you run everything through a dispatcher script, but often you'll want to serve static files directly. A common solution is to change the RewriteCond to only rewrite URIs that don't match files in the DocumentRoot:

RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule (.*) dispatcher.php/$1
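
As a rough illustration of what that file-exists test decides, here's a small Python sketch that builds a throwaway directory standing in for the DocumentRoot. The file names (style.css, conf/db.ini) and the helper is_rewritten are invented for the example, and the real check inside Apache has more to it than a single isfile call.

```python
import os
import tempfile


def is_rewritten(docroot, uri):
    """True if the request would be handed to the dispatcher script,
    i.e. no regular file with that name exists under the docroot."""
    return not os.path.isfile(os.path.join(docroot, uri))


with tempfile.TemporaryDirectory() as docroot:
    # Hypothetical layout: one stylesheet and one config file under the docroot.
    os.makedirs(os.path.join(docroot, "conf"))
    for name in ("style.css", "conf/db.ini"):
        with open(os.path.join(docroot, name), "w") as handle:
            handle.write("placeholder")

    results = {
        uri: is_rewritten(docroot, uri)
        for uri in ("style.css", "foo/bar", "conf/db.ini")
    }

for uri, rewritten in results.items():
    print(uri, "-> dispatcher" if rewritten else "-> served directly")
```

Note that conf/db.ini is served directly just like the stylesheet: the test only asks whether a file exists, not whether it's one you meant to publish.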

Then, some of the time you'll realise you've got a whole pile of security holes in your system due to how you've organised your directory tree. Instead of storing all your library files outside the DocumentRoot (i.e. if /var/www/mysite/htdocs is the DocumentRoot, your libraries would go in /var/www/mysite/lib, your templates in /var/www/mysite/templates, your database password etc. in /var/www/mysite/conf), you've got everything underneath the DocumentRoot, and people can access undesirable files directly. In that case the !-f condition won't do; instead, you want something that checks that the URL matches one of a set of 'safe' patterns, and rejects everything else.

Here's something that will work (excepting any typos):

# First, add a rule for every valid rewritten URI, pointing nowhere, with [L]
RewriteRule ^([a-z]+\.php)$ - [L]
RewriteRule ^images/ - [L]
RewriteRule ^style/ - [L]
RewriteRule ^script/ - [L]

# Now your rewrites
RewriteRule ^my/([^/]+)$ my.php?action=$1 [L]
RewriteRule ^stuff/([^/]+)$ stuff.php?action=$1 [L]

# And forbid anything else
RewriteRule .* - [F]

The trick here is that the [L] flag normally only terminates mod_rewrite's current pass through the rules, but if a RewriteRule doesn't make any changes and also uses [L], mod_rewrite stops completely and passes the rewritten URL straight back to Apache. Here are some example URLs:

- foobar.php → matches the first rule, passed through as-is.

- images/foo.jpg → matches the second rule, passed through as-is.

- my/profile → matches the fifth rule, rewritten to my.php?action=profile. Then on the second pass through the rules, it matches the first rule and is passed through.

- etc/passwd → matches the final rule, resulting in a 403 response.
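
The walkthrough above can be checked with a toy Python model of the rule set. This is a sketch under some assumptions: it only models the looping behaviour described in this post (repeat passes until a pass leaves the path unchanged), it knows only the [L] and [F] flags, and the names RULES, one_pass and resolve are made up here rather than taken from mod_rewrite.

```python
import re

# (pattern, substitution, flag); "-" means "leave the URL unchanged".
RULES = [
    (r"^([a-z]+\.php)$", "-", "L"),
    (r"^images/", "-", "L"),
    (r"^style/", "-", "L"),
    (r"^script/", "-", "L"),
    (r"^my/([^/]+)$", r"my.php?action=\1", "L"),
    (r"^stuff/([^/]+)$", r"stuff.php?action=\1", "L"),
    (r".*", "-", "F"),
]


def one_pass(path):
    """Run one pass over RULES; return (new_path, query, status)."""
    for pattern, subst, flag in RULES:
        match = re.search(pattern, path)
        if not match:
            continue
        if flag == "F":
            return path, None, 403
        if subst == "-":
            return path, None, None  # matched, unchanged, [L] ends the pass
        new_path, _, query = match.expand(subst).partition("?")
        return new_path, query, None
    return path, None, None


def resolve(path, max_passes=10):
    """Repeat passes until the path stops changing; return (result, passes)."""
    query = None
    for passes in range(1, max_passes + 1):
        new_path, new_query, status = one_pass(path)
        if status == 403:
            return "403 Forbidden", passes
        if new_query is not None:
            query = new_query
        if new_path == path:
            # Nothing changed on this pass, so processing stops here.
            final = path if query is None else f"{path}?{query}"
            return final, passes
        path = new_path
    raise RuntimeError("rules never settled")


for url in ("foobar.php", "images/foo.jpg", "my/profile", "etc/passwd"):
    print(url, "->", resolve(url))
```

In this model my/profile takes two passes (rewritten by the fifth rule, then caught by the first), while the other three URLs are settled on the first pass.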
