Add, Remove, or Change File Extensions with .htaccess

Category: Blog • Posted by Jeff Starr • Post Date:

A reader recently asked how to add or remove the .html file extension from various URLs. The solution, of course, is found in Apache’s miraculous rewrite module, mod_rewrite. With a few well-written RewriteRule directives in the .htaccess file of your site’s root or target directory, modifying file extensions in URIs is relatively simple. This tutorial explains how it's done.

Adding File Extensions

Strictly for the copy-&-paste enthusiasts, here is the complete code example for adding file extensions via .htaccess:

# add file extensions
RewriteEngine on
RewriteRule ^business/$ /business.html [R=301,L]
RewriteRule ^pleasure/$ /pleasure.htm  [R=301,L]
RewriteRule ^content/$  /content.php   [R=301,L]

Looks easy enough? Perhaps, but let’s have a closer look, just for kicks..

The first line switches on Apache's rewrite engine. Then, at the beginning of each subsequent line, we summon the powerful RewriteRule directive, which instructs the server to compare the requested URL path against the first string and, on a match, send the request to the address given in the second string. Finally, we conclude each line with the flags [R=301,L]. R=301 returns a 301 status code, thereby informing search engines and other clients that the address change is permanent, and L tells mod_rewrite to stop processing further rules for that request. Note that the target files need to exist on the server in order for this to work (i.e., non-existent resources end up as 404 errors).

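Digging a bit deeper into the target and pattern strings, we see that we have specified three directories: business, pleasure, and content. (For reference, Apache's own documentation calls the first string the Pattern and the second string the Substitution.) The caret symbol (^) and dollar sign ($) wrapping each directory name simply indicate the beginning and end of the string, respectively, so each rule matches that exact path and nothing else. In an .htaccess file, the path is matched without its leading slash, which is why the target strings begin with the directory name. After the RewriteRule has matched one of the target directories, the URL is rewritten according to the specified pattern string. For each of the directories in the example, the pattern string is simply the directory name with the desired file type added to the end.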
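
To see what the anchors actually buy us, here is a small sketch that compares the anchored rule from the example with a slightly looser variant. The optional trailing slash (written as /?) is an assumption added purely for illustration; it is not part of the example above.

```apache
RewriteEngine on

# Anchored on both ends: matches the path "business/" only,
# not "business" (no slash) and not "my-business/"
RewriteRule ^business/$ /business.html [R=301,L]

# Illustrative variant (an assumption, not from the example above):
# the "?" makes the trailing slash optional, so both "pleasure"
# and "pleasure/" match
RewriteRule ^pleasure/?$ /pleasure.htm [R=301,L]
```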

Customizing and using this code is straightforward. Using as many lines as necessary, specify all directories for which you would like to add a file extension. Then, in the pattern string, specify the file name that you would like to use, along with the associated file extension. To rewrite the directory name as a file within another directory, simply change /pattern.php to /keyword/pattern.php or something crazy like /greedy/seo/keyword/pattern.php. It’s entirely up to you.
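
As a rough sketch of that last idea, the rules below send two of the example directories to files inside other directories. The destination paths are hypothetical and only mirror the placeholders mentioned above, so swap in whatever actually exists on your server.

```apache
RewriteEngine on

# Hypothetical destination: requests for /content/ are redirected
# to a file that lives inside another directory
RewriteRule ^content/$ /keyword/content.php [R=301,L]

# Same idea, nested a few levels deeper
RewriteRule ^business/$ /greedy/seo/keyword/business.html [R=301,L]
```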

Removing File Extensions

Again, for all the copy-&-paste hounds out there, here is the complete code example for removing file extensions via .htaccess:

# remove file extensions
RewriteEngine on
RewriteRule ^(.*)\.html$ http://domain.tld/$1 [R=301,L]
RewriteRule ^(.*)\.htm$  http://domain.tld/$1 [R=301,L]
RewriteRule ^(.*)\.php$  http://domain.tld/$1 [R=301,L]

Now that we have seen a generalized example, let’s break it down..

First, we fire up Apache’s mod_rewrite. Then, we invoke the magical powers of the RewriteRule directive and proceed to declare our target string. In the example, we are targeting three different file types: .html, .htm, and .php. Each of these file types appears after a wildcard group, (.*), which matches any file name with such an extension and captures the name for later use. Finally, we specify the beginning and end of the target string with a caret symbol (^) and dollar sign ($), respectively. At this point, if you are customizing the code for your own use, replace the listed extension(s) with the one(s) you would like to have removed. As before, note that the resources at the rewritten addresses (here, the extension-free URLs) need to exist on the server in order for this technique to work.
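
If you are removing several extensions at once, the separate rules can probably be folded into one by listing the extensions as alternatives. This is only a sketch of that variation, using the same placeholder domain as the example:

```apache
RewriteEngine on

# One rule for all three extensions: (html|htm|php) matches any one of them.
# The dot is escaped (\.) so that it matches a literal dot only.
# $1 is the file name; $2 (the extension) is simply not used.
RewriteRule ^(.*)\.(html|htm|php)$ http://domain.tld/$1 [R=301,L]
```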

Now that we have defined our target strings, we want to specify their respective rewrite patterns. In the example, we assume that the target files are located in the site’s root directory (i.e., http://domain.tld/). Given that we want to remove the extensions of the target files and do not want to change their represented location, we simply append the matched file name to our specified domain using the backreference $1. This variable holds only the portion of the requested path that was captured by the parenthesized group, (.*). Thus, everything before the extension is captured, and the request is redirected to that same path, minus the extension, on our target domain.

The final portion in each of our rewrite directives ensures that our rewriting is returning SEO-friendly 301 status codes. By returning a 301 code for each rewrite, we are effectively telling search engines, browsers, and other clients that the address change is permanent. Passing such information to the search engines ensures that your pages retain the value of their inbound links. 301, baby.. 301.
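
One thing worth keeping in mind is that something still has to answer the extension-free URL once the redirect lands there. A common approach is to pair the external redirect with an internal rewrite that quietly maps the clean URL back to the real file. The sketch below handles .html only and makes two assumptions: the .htaccess file lives in the document root, and the REDIRECT_STATUS guard is enough to keep the two rules from chasing each other. Treat it as a starting point, not a drop-in solution.

```apache
RewriteEngine on

# 1. Redirect direct requests for /page.html to /page.
#    REDIRECT_STATUS is empty on the client's original request and is
#    set after an internal rewrite, so this guard avoids a redirect loop.
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteRule ^(.*)\.html$ /$1 [R=301,L]

# 2. Internally map /page back to page.html, but only if that file exists
#    and the request is not for a real directory. There is no R flag here,
#    so the address shown in the browser stays extension-free.
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{DOCUMENT_ROOT}/$1.html -f
RewriteRule ^(.+)$ /$1.html [L]
```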

Changing File Extensions

Last but not least, let's quickly look at a similar technique for changing file extensions. Here is the punchline:

# change file extensions
RewriteEngine on
RewriteRule ^(.*)\.html$ http://domain.tld/$1.axe [R=301,L]
RewriteRule ^(.*)\.htm$  http://domain.tld/$1.biz [R=301,L]
RewriteRule ^(.*)\.php$  http://domain.tld/$1.yay [R=301,L]

The logic used here essentially is the same as the previous two techniques (adding and removing extensions), so I won't go through it again. The only real difference is the addition of the desired file type on each of the target paths. So with this code in place, the following redirects will happen (in order):

  • Each request for a .html file is redirected to same-name .axe file
  • Each request for a .htm file is redirected to same-name .biz file
  • Each request for a .php file is redirected to same-name .yay file

Again, the logic behind these directives is explained in either of the previous two techniques, so you can check ’em out for more details on how these rewrites operate. And of course, remember to test thoroughly and make good backups of your files before making any changes; that way if anything unexpected happens, you can roll back to the previous working set of files. So yeah.. it's good times redirecting stuff with .htaccess ;)
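
On the testing front, browsers tend to cache 301 responses, which can make a botched redirect hard to undo on your own machine. As an optional precaution, you could start with a temporary 302 and switch to 301 once everything behaves as expected. A minimal sketch using the first rule from the example:

```apache
RewriteEngine on

# Temporary redirect while testing; change R=302 to R=301 once the
# redirects are confirmed to work as intended
RewriteRule ^(.*)\.html$ http://domain.tld/$1.axe [R=302,L]
```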

Alternate Technique

Here is an alternate technique that a reader suggested to add .html file extensions:

# add .html file extensions
<IfModule mod_rewrite.c>
	RewriteEngine On
	RewriteCond %{REQUEST_FILENAME} !-f
	RewriteCond %{REQUEST_FILENAME} !-d
	RewriteCond %{REQUEST_FILENAME}.html -f
	# leading slash assumes this .htaccess file sits in the document root
	RewriteRule ^(.+)$ /$1.html [R=301,L]
</IfModule>

I haven't tested this personally, but it looks legit on the face of it. Here's how the logic works:

  1. Check that the requested file does not exist
  2. Check that the requested directory does not exist
  3. Check that the .html version of the requested file does exist
  4. If all three of these conditions are met, then redirect all requests to their .html targets

The cool thing about this method is that it first checks to see if the target file exists before rewriting the URL. Something to maybe integrate into one of the previous techniques for more robust request handling. Remember always to keep healthy backups and test thoroughly before going live with anything.
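
For instance, the same kind of existence check could be bolted onto the first rule from the "Adding File Extensions" example. This is an untested sketch that assumes the .htaccess file and business.html both live in the document root:

```apache
RewriteEngine on

# Redirect /business/ only if business.html really exists in the
# document root; otherwise the rule is skipped and no redirect happens
RewriteCond %{DOCUMENT_ROOT}/business.html -f
RewriteRule ^business/$ /business.html [R=301,L]
```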