Pretty URLs with basic mod_rewrite and powerful options in PHP

01-26-2010, 09:37 AM
Warning: This tutorial is a brief explanation of some very advanced concepts. It is not possible to do this without a solid understanding of general operations in PHP and some idea about how servers work in general. Use this at your own risk and always test first, hopefully on an extra server.
It is, however, designed to be easier than the (more) advanced methods typical of .htaccess, though it does not eliminate all complexity.

I have created a summary in a later post (06-17-2010, at the end of this thread).
Read the whole tutorial for a complete understanding, but all of the crucial information is there.
Also, it contains the finalized version, rather than the work-in-progress found throughout some of the full tutorial.

I have been searching google for a simple answer to some of this for days and finally decided to go about this a simplified way that fits more into my area of knowledge: skip past the .htaccess/mod_rewrite and into php for the complex issues!
I thought it might help some others, or at least provide a brain teaser.

mod_rewrite is used to get rid of ugly links and make them pretty without compromising the structure of your website.
/index.php?var=value&var1=value1 is ugly.
/value/value1 is a lot nicer.

However, mod_rewrite is very complex. Obviously if you can figure out all of the details it is powerful and worth looking into-- but if you have trouble (like I have), then this is a very simple way to still get a lot of its power.

First, you need to understand the very basics of mod_rewrite. This tutorial will not cover that. In short, google "mod rewrite" and find out how to turn it on. Once it is enabled on your server, you can control all of this complexity with .htaccess files.

A .htaccess file in the root (main) directory of your site will affect the entire site. If it is inside a subdirectory, it will only affect that directory and its subdirectories. You can decide which part(s) of your site you want this to apply to.

Now, since we want to keep things simple, the idea is to redirect this to a php page that can handle everything there, rather than in the harder to work with htaccess.

For this, we will do a simple trick: send every single url to a single php page to decide what to do with it.

Here is an example .htaccess file inside the directory /test/:

RewriteEngine On
RewriteBase /test
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php?/$1 [L]

Now EVERY request within the folder "/test" that does not match an existing file or directory will be "forwarded" to the index.php page.

This page can be very dynamic, so don't worry about your content being limited to a single page.

On /test/index.php, you can use this code:
<?php echo $_SERVER['QUERY_STRING']; ?>

The QUERY_STRING is the part of the URL that comes after the question mark:
http://example.com/test/index.php?var=1 ---> "var=1"
http://example.com/test/index.php?/var/1 ---> "/var/1"

Now, remember, this goes through a hidden internal rewrite rather than a visible redirect, so the browser keeps showing the pretty URL that was requested, not the index.php form behind it.
This is much cleaner.

Everything is running through that single page, though, and what do you do now?

$_SERVER['QUERY_STRING'] will hold the information of the "file" they wanted to access after the /test/ directory.

You will need to use string functions, or regular expressions if you want to deal with that (it can be confusing if you don't know what you're doing).

You can do some very basic parsing of that string and then use PHP includes to reorganize your site.

In other words, now you have all of the information available in PHP through the Query String and you can use it to handle internal redirection instead of .htaccess. It is completely customizable to your needs.
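
As a rough sketch of that "very basic parsing", assuming the first .htaccess above (so the query string looks like "/var/1"), you could split the string into segments and look the first one up in a whitelist. The function names and the pages/ file names are made up for illustration; they are not part of the setup described here.

```php
<?php
// Split a query string such as "/news/2010/" into clean segments.
function split_segments($queryString) {
    $segments = array();
    foreach (explode('/', $queryString) as $part) {
        // Skip empty parts (leading, trailing, or doubled slashes).
        if ($part !== '') {
            $segments[] = $part;
        }
    }
    return $segments;
}

// Map the first segment to a file to include, using a whitelist so that
// a visitor can never make the script include an arbitrary file.
// The file names here are hypothetical.
function choose_page($queryString) {
    $pages = array(
        'home' => 'pages/home.php',
        'news' => 'pages/news.php',
    );
    $segments = split_segments($queryString);
    $first = isset($segments[0]) ? $segments[0] : 'home';
    return isset($pages[$first]) ? $pages[$first] : null;
}

// Possible use in index.php (commented out so the sketch runs on its own):
// $page = choose_page($_SERVER['QUERY_STRING']);
// if ($page === null) { header('HTTP/1.0 404 Not Found'); exit('Not found'); }
// include $page;
```

The whitelist is the important design choice: the URL only selects among files you have listed, so odd input such as "/../etc" simply resolves to nothing.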

This is meant just to get you started, but it's how I'm proceeding on a current project. mod_rewrite is messy and difficult-- something for the future-- and for now this will get the job done.

1. If you DO NOT want a subdirectory, but instead your whole site, place that .htaccess file in the root (main) directory, then remove the line with "RewriteBase".

2. If you do make it site-wide (1), then you can actually create a php page-server for the entire site-- separate pages, separate directories and all served by include from a single hub page that decides what to do. Very powerful, though also complex.

3. Be Careful! Get variables, among other things, may be changed in unexpected ways. You can (though it is also messy and may get confusing) actually override the default values of $_GET, $_REQUEST, $_SERVER['QUERY_STRING'] (those globals that hold the value of variables from the URL). Again, Be Careful here!

4. This is in no way suggesting that it entirely replace mod_rewrite or a normal filesystem. However, it is a very simple way to approach the situation given that other options are difficult to implement without experience with some of the more advanced concepts behind them.

5. Important: to the end user, your site is actually being served from the URL that they use. Because of this, they will frequently be in another directory. Just like all options where the URL changes, you cannot use relative links: use absolute links, in the format "http://example.com/test.htm" or "/test.htm".
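
For note 5, one way to avoid typing absolute links by hand is a tiny helper that always builds a root-relative link. This is only a sketch; site_link() and its default base of "/test" are assumptions for this example, not something PHP provides.

```php
<?php
// Build a root-relative link ("/test/css/site.css") from a path inside the
// rewritten directory. $base is the directory that holds the .htaccess file.
function site_link($path, $base = '/test') {
    return rtrim($base, '/') . '/' . ltrim($path, '/');
}

// Possible use in a template:
// echo '<a href="' . htmlspecialchars(site_link('about')) . '">About</a>';
```

Because the result always starts with a slash, it resolves the same way no matter how deep the pretty URL the visitor is currently on.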

01-27-2010, 12:50 AM
The code above does work, but it also strips out all of the original get variables.
Because of that, here's a stronger approach:

New .htaccess:
RewriteEngine On
RewriteBase /test
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php/$1?%{QUERY_STRING} [L]

New methods:
$_GET (and other forms) will now have the ORIGINAL get variables again.
$_SERVER['PATH_INFO'] must be used to determine the page to serve.

URL: http://example.com/test/a/b/c.htm?var=1
(mod_rewrite sends it to /test/index.php for processing)
On index.php:
$_GET holds the normal values: array('var'=>'1')
$_SERVER['PATH_INFO']: /a/b/c.htm

1. Parse $_SERVER['PATH_INFO'] to decide what to do.
2. $_GET can be used normally-- it is not affected at all.
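
To make those two steps concrete, here is a small sketch that breaks PATH_INFO into its parts and keeps the GET values alongside. parse_request() is a hypothetical name, and the shape of the returned array is just one possible choice.

```php
<?php
// Break PATH_INFO into its parts and pair it with the untouched GET values.
function parse_request($pathInfo, array $get) {
    $trimmed = trim($pathInfo, '/');
    $segments = ($trimmed === '') ? array() : explode('/', $trimmed);
    return array(
        'segments' => $segments,                                    // e.g. array('a', 'b', 'c.htm')
        'last'     => $segments ? $segments[count($segments) - 1] : '', // e.g. 'c.htm'
        'get'      => $get,                                         // e.g. array('var' => '1')
    );
}

// Possible use in index.php (commented out so the sketch runs on its own):
// $path = isset($_SERVER['PATH_INFO']) ? $_SERVER['PATH_INFO'] : '';
// $request = parse_request($path, $_GET);
```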

01-27-2010, 01:49 AM
(Sorry this explanation is getting so messy. I'm finding more things as I test.)

Another important update:

The only problem with the setup above is that existing files are accessed directly.
--If you WANT existing files to be accessed directly, use the code above and ignore this.--
If you want ONLY dynamically served content-- you want the server to ignore existing files and run through the parser anyway, then here's how it works:

These lines tell the parser to NOT redirect if the file actually exists.

RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
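The problem, though, is that if you simply remove those, an internal server error (500) is generated: the rewritten request for index.php matches the rule again, so the rewrite loops until Apache gives up.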

So, we must work around this somehow.

Considering it should be ok to access index.php, we can use this:

RewriteEngine On
RewriteBase /test
RewriteCond %{REQUEST_URI} !^/test/index\.php$
RewriteCond %{REQUEST_URI} !^/test/index\.php/
RewriteRule ^(.*)$ index.php/$1?%{QUERY_STRING} [L]

So now what you end up with is:
Unless the requested URL is exactly http://example.com/test/index.php, or begins with http://example.com/test/index.php/, the rewrite process will occur.

Now, everything is redirected to the index.php page for parsing (or goes there directly), and every other page in the directory is ignored.

Now you can serve anything you want using the index.php file to get it from any of the other files. This also creates a level of security: none of the files in that directory can be accessed directly (but of course can be with an include).

Finally, there is a slight inconsistency in this:
$_SERVER['PATH_INFO'] will be '/example' both for a rewritten request such as http://example.com/test/example and for a direct request to http://example.com/test/index.php/example.
Though the second form is very unlikely to ever occur, things can get a bit confusing.
Also, a direct request to http://example.com/test/index.php will not give $_SERVER['PATH_INFO'] the expected value of /index.php

To solve this, work it out in the PHP. At the top of index.php, add the following:

if (strpos($_SERVER['REQUEST_URI'],'/test/index.php')===0) {
    if (strpos($_SERVER['PATH_INFO'],'/index.php')!==0) {
        $_SERVER['PATH_INFO'] = '/index.php'.$_SERVER['PATH_INFO'];
    }
}
//now $_SERVER['PATH_INFO'] will work as expected.
(While the inner if statement is not really needed, this will then continue to function and not double /index.php if you rewrote the .htaccess somehow later. That would confuse things. Better to be safe this way.)

04-16-2010, 12:10 AM
In using this on another site, I just found one last issue:
$_SERVER['PATH_INFO'] will not be set if there is no data (at least on some servers).
In other words, add this line to the top of the page:
if (!isset($_SERVER['PATH_INFO'])) { $_SERVER['PATH_INFO'] = ''; }

06-17-2010, 01:36 AM
Here's a summary of everything above. For a complete understanding, read all of that, but the basics are here.

This tutorial is designed to show you how to create a single PHP page (and .htaccess file) that will make an entire directory (or site) dynamically served by PHP.

This is like normal mod_rewrite methods, except that the control is given entirely to PHP.

First, create a .htaccess file with the following information:
RewriteEngine On
RewriteBase /test
RewriteCond %{REQUEST_URI} !^/test/index\.php$
RewriteCond %{REQUEST_URI} !^/test/index\.php/
RewriteRule ^(.*)$ index.php/$1?%{QUERY_STRING} [L]
Note that "test" is the name of the directory (as in yoursite.com/test) where you want this to work. If you want it to apply to the whole site, use RewriteBase / and remove the subdirectory from the URIs as well.

Now create a file called "index.php" in that directory:

<?php
if (!isset($_SERVER['PATH_INFO'])) { $_SERVER['PATH_INFO'] = ''; }
if (strpos($_SERVER['REQUEST_URI'],'/test/index.php')===0) {
    if (strpos($_SERVER['PATH_INFO'],'/index.php')!==0) {
        $_SERVER['PATH_INFO'] = '/index.php'.$_SERVER['PATH_INFO'];
    }
}
//now $_SERVER['PATH_INFO'] will work as expected.

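$_SERVER['PATH_INFO'] will contain the originally requested path (like /folder/file.ext), and you can do whatever you'd like with it. Any query string (like ?var=value) is not part of it; it stays available in $_GET as usual.

As a very simple example of what this can do, you can use the following:

$file = $_SERVER['PATH_INFO']; //get what was sent
$pos = strpos($file,'?');
if ($pos!==false) { $file = substr($file,0,$pos); } //remove any query string ["get" variables], in case one is present
if (strpos($file,'..')!==false) { exit('Big Security Threat!'); } //don't allow higher level directories!!
if ($file==='/index.php' || !is_file(__DIR__.$file)) { exit('Not found'); } //never include this page itself or a missing file
include(__DIR__.$file); //include the file, relative to this directory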

That will serve files normally*. In other words, now this effectively does nothing: just as if you typed anything into the URL without all of this setup.
So that's just an example of how you'd be able to create a "php fileserver".

The example above is not helpful (since it doesn't do anything new), but it's just a clear example of how you'd approach using this method.

(*Note: this will work fine for any sort of text file. It will NOT work for images, audio, video, or other files that require specific headers to be sent. At least I don't think it will, but it might depend on the server.)

For some more practical examples, consider the following:

Replace normal "get" variables with pretty URLs
Request URI: /home
Use PHP to translate that to: $_GET['page'] = 'home';
Now, you can use that just like:
Fake Request URI: index.php?page=home

How: 1. strip the initial slash; 2. make sure it doesn't contain any extra characters (ignore everything after a slash or after a question mark); 3. the variable should be clean now; 4. Just use normal practices for safety (for example, don't allow ../ at the beginning of it).
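
Those four steps could look roughly like this. page_from_path() is a made-up name, and the set of allowed characters (letters, digits, underscore, hyphen) is an assumption you should adjust to your own page names.

```php
<?php
// Turn a requested path such as "/home" into a safe page name,
// or return null when nothing usable is left.
function page_from_path($path) {
    $name = ltrim($path, '/');              // 1. strip the initial slash
    $parts = preg_split('#[/?]#', $name);   // 2. ignore everything after a slash or question mark
    $name = $parts[0];
    // 3. and 4. accept only a plain word, which also rejects "." and ".."
    if (!preg_match('/^[A-Za-z0-9_-]+\z/', $name)) {
        return null;
    }
    return $name;
}

// Possible use in index.php (commented out so the sketch runs on its own):
// $page = page_from_path($_SERVER['PATH_INFO']);
// $_GET['page'] = ($page === null) ? 'home' : $page;
```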

Hide file extensions:
Request URI: /page
Use PHP to translate that to 'page.php'
Now, it is like:
Fake Request URI: /page.php

How: 1. use normal security precautions and allow only a single word (valid filename-- AZaz09_ [and more if you want]); 2. include($that.'.php');
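
A minimal sketch of those two steps follows, assuming the real scripts live in a directory you pass in. script_for_path() is a hypothetical helper, and it refuses "index" so that the hub page cannot include itself.

```php
<?php
// Resolve "/page" to "<dir>/page.php", or return null if the name is not a
// single plain word, is the hub page itself, or no such file exists.
function script_for_path($path, $dir) {
    $name = trim($path, '/');
    if (!preg_match('/^[A-Za-z0-9_]+\z/', $name) || $name === 'index') {
        return null;
    }
    $file = rtrim($dir, '/') . '/' . $name . '.php';
    return is_file($file) ? $file : null;
}

// Possible use in index.php (commented out so the sketch runs on its own):
// $script = script_for_path($_SERVER['PATH_INFO'], __DIR__);
// if ($script === null) { header('HTTP/1.0 404 Not Found'); exit('Not found'); }
// include $script;
```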

There are many possibilities with this, and I've been using it a lot recently to create dynamic websites. You can do anything you want with it and you're only limited by your knowledge of PHP and working with strings.

The one weakness of this is that if you want to serve images or other files that require special headers you will need to work this out in PHP (it's possible), or you might want to use a real mod_rewrite method that wouldn't have this problem (since PHP must use include() to serve the pages).
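
If you do want to work it out in PHP, the usual idea is to send a Content-Type header and then output the raw file with readfile() instead of include(). The sketch below is only a starting point: the function names are made up, and the list of types is a small sample rather than a complete table.

```php
<?php
// Pick a Content-Type from the file extension. Only whitelisted types are
// served; anything else returns null.
function content_type_for($file) {
    $types = array(
        'png'  => 'image/png',
        'jpg'  => 'image/jpeg',
        'jpeg' => 'image/jpeg',
        'gif'  => 'image/gif',
        'css'  => 'text/css',
    );
    $ext = strtolower(pathinfo($file, PATHINFO_EXTENSION));
    return isset($types[$ext]) ? $types[$ext] : null;
}

// Send a static file with the right headers instead of include()-ing it.
// Returns false when the file is missing or its type is not allowed.
function send_static_file($file) {
    $type = content_type_for($file);
    if ($type === null || !is_file($file)) {
        return false;
    }
    header('Content-Type: ' . $type);
    header('Content-Length: ' . filesize($file));
    readfile($file);
    return true;
}
```

Apply the same path checks as for includes (no "..", nothing outside the directory) before passing a name to send_static_file().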