Case-insensitive mapping with mod_rewrite’s RewriteMap

Sometimes when you need to manage a massive pile of URL redirections — for instance, when you’re playing snatch-the-tablecloth with your web platform — it’s handy to mash them all together with mod_rewrite’s RewriteMap feature.

I hit a frustrating stumbling block with a recent project, however: What happens if you need your RewriteMap key to be case-insensitive? This is incredibly likely if you’re migrating away from an undisciplined Windows-based web platform.

It was immediately obvious that I’d have to match against consistent casing, but not at all obvious how this would work in the context of mod_rewrite. After some trial and error, here’s what I came up with:

RewriteMap lowercase int:tolower
RewriteMap urls dbm:/srv/www.example.com/urls.db
RewriteCond %{REQUEST_URI} ^(/.*\.html)$
RewriteCond ${lowercase:%1} ^(/.*\.html)$
RewriteCond ${urls:%1|NOT_FOUND} !NOT_FOUND
RewriteRule ^(/.*\.html)$ http://%{HTTP_HOST}${urls:%1}? [R=permanent,L]

Here’s how it works:

  1. First we set up the built-in lowercase translation rewrite map (int:tolower)… handy!
  2. Then we define our great big URL map, which is just a list of lowercase URLs (yes, they’re lowercased in the file) and their destinations.
  3. Make sure we’re only mucking around with .html URLs, but also grab the REQUEST_URI for use further along (see the parentheses in the second parameter). Note that this pattern is itself case-sensitive, so a request ending in .HTML won’t get past this line.
  4. We use Apache’s handy internal lowercase map to convert the previous line’s match to lowercase, and we grab the whole thing again… but this time, what we’re grabbing is the output of the lowercase map.
  5. Tricky little hack to make sure our URL is represented in the redirection map. If there’s no match, the default value returned is NOT_FOUND… and we only continue to the next line if the map does not return NOT_FOUND.
  6. Finally, permanently redirect to the destination URL, looked up with the lowercased request path captured in step 4 (%1). The trailing ? drops the original query string from the redirect.
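
For step 2, here’s a rough sketch of what the map source might look like. I’m assuming the DBM file is built from a plain-text file with Apache’s httxt2dbm utility; the urls.txt path and both entries are made up for illustration, not taken from the real project.

```apache
# Hypothetical text source (urls.txt): one "key value" pair per line.
# Keys are stored in lowercase because the lookup key is always lowercased.
# Values are paths, since the RewriteRule prepends http://%{HTTP_HOST} itself.
#
#   /about/company.html    /company/
#   /products/index.html   /products/
#
# One way to convert it to DBM (run from a shell, not part of the config):
#
#   httxt2dbm -i /srv/www.example.com/urls.txt -o /srv/www.example.com/urls.db
#
RewriteMap urls dbm:/srv/www.example.com/urls.db
```

Depending on the DBM type your build uses, the output may be split across more than one file, so check what actually lands on disk before pointing RewriteMap at it.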

Rock and roll.

Comments

  1. You can avoid both the NOT_FOUND hack and a second map lookup (it will be cached, but a regexp match/capture might still be faster than a lookup; not sure, though) with something like this:
    RewriteCond ${urls:%1} ^(.+)$
    RewriteRule ^(.*)$ http://%{HTTP_HOST}%1? [R=permanent,L]

    This works because a key that isn’t in the map returns an empty string, which (.+) won’t match. You can also change the second match to use (.*) instead of (/.*\.html), since you already know it will match (it matched in the first condition). Or, if you can have URLs ending with .HTML, use (.*) in the first condition and (/.*\.html) in the second.

  2. So, what exactly would the rewrite map look like in this case (I use text)? I want to be able to take anything coming in to the web site, say uwmedicine.washington.edu, whether it’s a file or a directory, and match it with a destination.
    Would it be this [key[space]value]?
    /patientcare/ http://uwmedicine.washington.edu/patient-care/

    So if a user typed in /PatientCare/, it would find a match based on /patientcare/.
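
As a sketch of an answer to the question above: a txt: map is one key and one value per line, separated by whitespace, so the key[space]value guess looks right as long as the keys are lowercased. The example below assumes server or virtual host context and a made-up map path, and it borrows the capture approach from the first comment. Since the value here is a full URL, the rule uses it as the whole target instead of prepending http://%{HTTP_HOST}.

```apache
# Assumed map file /srv/www.example.com/urls.txt (hypothetical path),
# one "key value" pair per line, keys lowercased:
#
#   /patientcare/  http://uwmedicine.washington.edu/patient-care/

RewriteMap lowercase int:tolower
RewriteMap urls txt:/srv/www.example.com/urls.txt

# Lowercase the requested path and capture the result as %1.
RewriteCond ${lowercase:%{REQUEST_URI}} ^(.+)$
# Look up the lowercased path. An unmapped key yields an empty string,
# so ^(.+)$ only matches (and re-captures %1) when an entry exists.
RewriteCond ${urls:%1} ^(.+)$
# The map value is already a full URL, so it is the whole redirect target.
# The trailing ? drops the original query string.
RewriteRule ^ %1? [R=permanent,L]
```

With this, /PatientCare/ and /PATIENTCARE/ both look up /patientcare/. A request for /PatientCare without the trailing slash would not match that key, so it would need its own entry.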