

The $10,000 Robots.txt File

By: Aaron Wall
2007-07-20

I recently changed one of my robots.txt files to prune duplicate content pages, so that more of the internal PageRank would flow to the higher quality, better earning pages. In the process of doing that, I forgot that one of the most well linked to pages on the site had a URL similar to those of the noisy pages. About a week ago the site's search traffic halved (right after Google was unable to crawl and index the powerful URL). I fixed the error pretty quickly, but the site now has hundreds of pages stuck in Google's supplemental index, and I am out about $10,000 in profit for that one line of code!

Both Google and Yahoo support wildcards, but you really have to be careful when changing a robots.txt file because a line like this

Disallow: /*page

also blocks a file like this from being indexed in Google

beauty-pageants.php

Unless you think of that in advance, it is easy to make a mistake.
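To see why the pattern above catches more than intended, here is a minimal Python sketch of wildcard matching. It assumes the Google-style rules, where `*` matches any run of characters, a trailing `$` anchors the end of the URL, and everything else is treated as a prefix. The function name `rule_matches` and the sample paths other than beauty-pageants.php are made up for illustration; this is not how any search engine actually implements its parser.

```python
import re

def rule_matches(pattern: str, path: str) -> bool:
    """Return True if a robots.txt path pattern matches a URL path.

    Assumption: '*' matches any run of characters, a trailing '$'
    anchors the end of the URL, and the rest is a prefix match.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each wildcard.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match anchors at the start of the path, which gives prefix matching.
    return re.match(regex, path) is not None

# Hypothetical paths: the first is the kind of noisy page being pruned,
# the second is the well linked page caught by accident.
for path in ["/page2.php", "/beauty-pageants.php", "/about.php"]:
    print(path, rule_matches("/*page", path))
```

Running a check like this against a full list of your URLs before uploading a new robots.txt file would flag the beauty-pageants.php collision ahead of time.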

If you are trying to prune duplicate content for Google but are fine with it ranking in other search engines, you may want to make those directives specific to Googlebot. If you make a directive for a specific robot, that bot will ignore your general robots directives in favor of following the more specific directives you created for it.
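The practical consequence of that rule is easy to miss, so here is a rough Python sketch of how a bot might pick which group of directives to follow. The `select_group` helper, the substring matching, and the sample rules are all assumptions made for illustration; real crawlers each have their own matching logic.

```python
def select_group(groups: dict, bot_name: str) -> list:
    """Pick the directive group a bot would follow.

    Assumption: a bot uses the longest User-agent name found in its own
    name, and falls back to the '*' group only when nothing else matches.
    """
    bot = bot_name.lower()
    matches = [ua for ua in groups if ua != "*" and ua.lower() in bot]
    if matches:
        return groups[max(matches, key=len)]
    return groups.get("*", [])

# Hypothetical robots.txt contents, keyed by User-agent line.
groups = {
    "*": ["Disallow: /cgi-bin/"],
    "Googlebot": ["Disallow: /*page"],
}

print(select_group(groups, "Googlebot"))  # only the Googlebot group
print(select_group(groups, "Slurp"))      # falls back to the general group
```

In this sketch Googlebot never sees the general /cgi-bin/ rule, so any general directive you still want Googlebot to obey has to be repeated inside its own group.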

Google's webmaster guidelines and Yahoo!'s Search Blog both offer tips on how to format your robots.txt file.

Google also offers a free robots.txt test tool, which allows you to see how robots will respond to your robots.txt file, notifying you of any files that are blocked.

You can use Xenu's Link Sleuth to generate a list of URLs from your site. Upload that URL list to the Google robots.txt test tool (currently in 5,000 character chunks, an arbitrary limit I am sure they will eventually lift).
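Splitting a long URL list by hand is tedious, so here is a small Python sketch that does it. It assumes the test tool takes one URL per line and that the 5,000 character limit counts the line breaks; both are guesses, and the `chunk_urls` name and the example.com URLs are made up for illustration.

```python
def chunk_urls(urls, limit=5000):
    """Group URLs into newline-joined chunks of at most `limit` characters.

    A single URL longer than `limit` still gets a chunk of its own,
    which will exceed the limit.
    """
    chunks, current, size = [], [], 0
    for url in urls:
        # Every URL after the first in a chunk costs one extra newline.
        needed = len(url) + (1 if current else 0)
        if current and size + needed > limit:
            chunks.append("\n".join(current))
            current, size = [], 0
            needed = len(url)
        current.append(url)
        size += needed
    if current:
        chunks.append("\n".join(current))
    return chunks

# Hypothetical crawl export: one URL per entry.
urls = [f"http://example.com/page-{i}.php" for i in range(1000)]
chunks = chunk_urls(urls)
print(len(chunks), "chunks, longest is", max(len(c) for c in chunks), "characters")
```

Each chunk can then be pasted into the tool in turn.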

Inside the webmaster console, Google will also show you which pages are currently blocked by your robots.txt file, and let you view when Google tried to crawl each page and noticed it was blocked. Google also shows you which pages return 404 errors, which might be a good way to see if you have any internal broken links or external links pointing at pages that no longer exist.



About the Author:
Aaron Wall is the author of SEO Book, a dynamic website offering marketing tips and coverage of the search space, free SEO videos, and free SEO tools. He is a regular conference speaker, partner in Clientside SEM, and publishes dozens of independent websites.

