Practical eCommerce

Search Engine Optimization: Avoid Complex URLs

Dynamic URLs problematic to search engines

By: Stephan Spencer

Search engines tend to have problems fully indexing dynamic websites (in other words, sites that are hooked up to a database of content).

The sites that give search engines the most trouble are those with overly complex URL structures: numerous variables in the URL (marked by many ampersands and equals signs), along with session IDs, user IDs and referral tracking codes. Matt Cutts, Senior Engineer for Google, said at Search Engine Strategies in San Jose this past August that you are safe if the number of variables in your URL is one or two, unless one of those variables is named id (or something else resembling a session ID), in which case all bets are off.
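As a rough illustration of that rule of thumb, the following Python sketch counts the variables in a URL's query string and flags any that look like a session ID. The threshold of two comes from the remark above; the list of ID-like names and the example.com URLs are assumptions made for this example, not anything a search engine has published.

```python
from urllib.parse import urlsplit, parse_qsl

# Assumed list of parameter names that resemble session or user IDs.
ID_LIKE_NAMES = {"id", "sid", "sessionid", "session_id", "phpsessid", "jsessionid", "userid"}

def assess_url(url, max_params=2):
    """Return (is_probably_safe, parameter_names) for a URL's query string."""
    query = urlsplit(url).query
    names = [name.lower() for name, _ in parse_qsl(query, keep_blank_values=True)]
    too_many = len(names) > max_params
    id_like = any(name in ID_LIKE_NAMES for name in names)
    return (not too_many and not id_like), names

# Two variables, neither resembling an ID: within the rule of thumb.
print(assess_url("http://www.example.com/product.php?category=5&item=42"))
# Two variables, but one is named "id": all bets are off.
print(assess_url("http://www.example.com/product.php?item=42&id=8f3a"))
```

A check like this could be run over a list of your own site's URLs to find the ones most worth cleaning up first.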

Overly complex URLs are unfriendly to users, who might want to copy the URL and paste it into an email to a friend, or add a link on their own website to a page deep within your site. They are also unfriendly to search engine spiders, because they are a tip-off that the page is dynamically generated and could lead to what is called a spider trap.

A spider trap exists when a search engine spider keeps following links to URLs that appear to be different from URLs it has already explored, even though they lead to the same content.

Imagine, for example, a search engine spider coming to the site and getting assigned a session ID, which is then embedded in the URLs of all the links on the page. The next time a spider comes to this page, it gets a brand new session ID because your web server can’t detect that it is the same spider that came a few minutes ago. The result is numerous copies of the exact same page getting indexed, which is a bad outcome for the search engine and for its users because of all the duplicated content.
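The session ID scenario can be mimicked in a few lines of Python. This is only a sketch: the sessionid parameter name, the example.com host and the serve_link helper are hypothetical stand-ins for whatever your own platform does.

```python
import itertools
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

SESSION_PARAMS = {"sessionid"}  # hypothetical session parameter name
_counter = itertools.count(1)

def serve_link(path):
    """Mimic a server that stamps a fresh session ID on every visit."""
    return f"http://www.example.com{path}?sessionid={next(_counter)}"

def strip_session(url):
    """Remove session parameters so repeat visits map to one URL."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k.lower() not in SESSION_PARAMS]
    return urlunsplit(parts._replace(query=urlencode(kept)))

seen_raw = set()    # what a spider sees when session IDs are in the URL
seen_clean = set()  # what it would see with session IDs removed
for _ in range(5):
    url = serve_link("/widgets.html")
    seen_raw.add(url)
    seen_clean.add(strip_session(url))

print(len(seen_raw))    # 5 apparently different URLs
print(len(seen_clean))  # 1 actual page
```

Five visits produce five URLs that look distinct to the spider, yet all of them are the same page once the session ID is taken out.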

The worst kind of spider trap presents the spider with an infinite variety of URLs for the same limited set of pages. Each search engine has its own tolerance level as to how many variables in the URL are acceptable. The idea, however, is to eliminate all signs of the dynamic nature of your pages from the URL: remove all question marks, ampersands, equals signs, cgi-bin, user IDs, and session IDs from the URLs to make the pages far more palatable to the spiders.

Not only does a clean, simple URL eliminate the potential problems that you could have with getting that page indexed, but, as a bonus, you're also more likely to garner more "deep links" from other sites (i.e. links directly into one of your pages deep within your site) because the URL looks user-friendly, stable, and easy to copy-and-paste (into a web browser, email message, or web page editor).

The best approach is to replace all dynamic-looking links with search engine friendly ones. Don’t be tempted to take a shortcut and simply create a site map page with links to the search engine friendly URLs while leaving all the remaining links across your site intact. The URLs that you haven’t fixed will not enhance the link gain of the pages with the friendly URLs. You want to maximize your link gain by having as few variations of each URL as possible. Variations in the URLs dilute link gain because not all links are voting for the same page: some vote for a version of the page at one URL, and others vote for versions of the same page at other URLs.
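The dilution effect is easy to see with a small tally. In the Python sketch below, the five inbound links are invented for illustration; they all lead to the same product page, but through four different URLs.

```python
from collections import Counter

# Hypothetical inbound links, all pointing at the same product page.
inbound_links = [
    "http://www.example.com/product.php?item=42",
    "http://www.example.com/product.php?item=42&ref=newsletter",
    "http://www.example.com/product.php?ref=partner&item=42",
    "http://www.example.com/widgets/blue-widget",
    "http://www.example.com/product.php?item=42",
]

votes_by_url = Counter(inbound_links)
total_votes = sum(votes_by_url.values())
best_url, best_votes = votes_by_url.most_common(1)[0]

print(len(votes_by_url))  # 4 URL variations share the votes
print(best_votes)         # the strongest variation gets only 2 of them
print(total_votes)        # a single consistent URL would have received all 5
```

With one consistent URL, every one of those links would count toward the same page.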

Assuming you have a dynamic site that is not yet search engine friendly as far as the URLs are concerned, but you would like to make it so, you have three options:

One is to rewrite the URLs using a URL rewriting server module, like mod_rewrite (for Apache) or ISAPI_Rewrite (for IIS Server). Ask your server administrator for information about these modules.

A second option is to recode your ecommerce platform so that it does not pass information through "query strings" but instead uses the "path info" environment variable. In other words, you would recode your scripts to look for variables embedded within the directory names or the file name instead of the "query string." This tends to be quite a bit more complicated to implement.

The third option is to use a third party hosted proxy serving solution – in other words, an Application Service Provider.
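To make the first option more concrete, here is the idea behind URL rewriting, sketched in Python rather than in the rule syntax of mod_rewrite or ISAPI_Rewrite. The /products/ path layout and the product.php script name are assumptions for the example; your server module's documentation describes how to express such a rule in its own configuration.

```python
import re

# Hypothetical rule: the friendly path /products/<category>/<item>
# is mapped internally to the dynamic script the platform already uses.
RULES = [
    (re.compile(r"^/products/(\d+)/(\d+)/?$"), r"/product.php?category=\1&item=\2"),
]

def rewrite(path):
    """Return the internal URL for a friendly path, or the path unchanged."""
    for pattern, target in RULES:
        if pattern.match(path):
            return pattern.sub(target, path)
    return path

print(rewrite("/products/5/42"))   # /product.php?category=5&item=42
print(rewrite("/about-us.html"))   # /about-us.html
```

Visitors and spiders only ever see the friendly path; the translation to the query string happens on the server before your existing scripts run.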

The first option is usually the preferable one, assuming you have the IT resources to implement it on your server and your server supports the technology required for URL rewriting. The second option is good if you can't do URL rewriting but have programming resources available along with access to the source code of your ecommerce platform.
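For the second option, the recoding amounts to reading variables out of the path instead of the query string. A minimal Python sketch follows, assuming a hypothetical layout in which names and values alternate, as in /category/5/item/42; in a CGI script the input would typically come from the PATH_INFO environment variable.

```python
def params_from_path_info(path_info):
    """Turn '/category/5/item/42' into {'category': '5', 'item': '42'}."""
    segments = [s for s in path_info.split("/") if s]
    if len(segments) % 2 != 0:
        raise ValueError("expected alternating name/value segments")
    # Even positions are names, odd positions are their values.
    return dict(zip(segments[0::2], segments[1::2]))

# In a CGI script: path_info = os.environ.get("PATH_INFO", "")
print(params_from_path_info("/category/5/item/42"))
# {'category': '5', 'item': '42'}
```

The script would then use these values exactly as it previously used the ones from the query string, which is why access to the platform's source code is needed.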

But if these two options are not feasible for whatever reason, you could use a third party solution that automatically corrects the URLs for you. This is particularly useful if you are caught in the middle of a code freeze, such as during the holiday season.

You may wonder where the new Google Sitemaps program fits in here. Until Google Sitemaps supports the ability for you to convey which variations of URLs point to the same content, it's an incomplete solution, because it will fail to aggregate the PageRank across all those URL variations. You may have five versions of a product page already indexed in Google, and Sitemaps could just exacerbate the problem by getting a sixth version indexed, rather than collapsing the five versions into one version with much higher PageRank.

No matter which approach you take, making your URLs search engine friendly will pay dividends.

Published on Sunday, January 01, 2006

Copyright 2007 Confluence Distribution, Inc. and Practical eCommerce.
All Rights Reserved.
