On today’s Internet, database driven or dynamic sites are very popular. Unfortunately the easiest way to pass information between your pages is with a query string. In case you don’t know what a query string is, it's a string of information tacked onto the end of a URL after a question mark.
So, what’s the problem with that? Well, most search engines (with a few exceptions - namely Google) will not index any pages that have a question mark or other character (like an ampersand or equals sign) in the URL. So all of those popular dynamic sites out there aren’t being indexed - and what good is a site if no one can find it?
The solution? Search engine friendly URLs. There are a few popular ways to pass information to your pages without the use of a query string, so that search engines will still index those individual pages. I'll cover 3 of these techniques in this article. All 3 work in PHP with Apache on Linux (and while they may work in other scenarios, I cannot confirm that they do).
Method 1: PATH_INFO
Implementation:
If you look above this article on the address bar, you’ll see a URL like this:
http://www.webmasterbase.com/article.php/999/12. SitePoint actually uses the PATH_INFO method to create their dynamic pages.
Apache has a "look back" feature that scans backwards down the URL if it doesn’t find what it's looking for. In this case there is no directory or file called "12", so it looks for "999". But it find that there's not a directory or file called "999" either, so Apache continues to look down the URL and sees "article.php". This file does exist, so Apache calls up that script. Apache also has a global variable called $PATH_INFO that is created on every HTTP request. What this variable contains is the script that's being called, and everything to the right of that information in the URL. So in the example we've been using, $PATH_INFO will contain article.php/999/12.
So, you wonder, how do I query my database using article.php/999/12? First you have to split this into variables you can use. And you can do that using PHP’s explode function:
Code:
$var_array = explode("/",$PATH_INFO);
Once you do that, you’ll have the following information:
Code:
$var_array[0] = "article.php"
$var_array[1] = 999
$var_array[2] = 12
So you can rename $var_array[1] as $article and $var_array[2] as $page_num and query your database.
Drawback:
There was previously one major drawback to this method. Google, and perhaps other search engines, would not index pages set up in this manner, as they interpreted the URL as being malformed. I contacted a Software Developer at Google and made them aware of the problem and I am happy to announce that it is now fixed.
There is the potential that other search engines may ignore pages set up in this manner. While I don't know of any, I can't be certain that none do. If you do decide to use this method, be sure to monitor your server logs for spiders to ensure that your site is being indexed as it should.