

Make Your Website Robot-Friendly


By: Barry Welford
2007-04-23

In November 2006, the major search engines agreed, for once, on a common Sitemap standard. Sitemaps.org sets out the rules for sitemap files that all of them would follow.

If you use a program such as GSiteCrawler, you can produce a full listing of all the web pages on your website in an XML file; the standard name for this file is sitemap.xml. The search engines prefer a gzipped version of this file, usually named sitemap.xml.gz. The GSiteCrawler program produces both versions. Although even Microsoft's MSN/Live subscribed to this standard, it has not yet indicated how it wishes to implement it. The other majors have been more helpful.
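If you would rather see what such a file contains than rely on a tool, here is a minimal Python sketch that writes both versions. It assumes you already have a list of page URLs; the function names `build_sitemap` and `write_sitemap` are made up for this illustration, and only the required `loc` element of each entry is emitted. It is not how GSiteCrawler itself works.

```python
import gzip
import xml.etree.ElementTree as ET

# Namespace published at sitemaps.org for version 0.9 of the protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return sitemap XML (as bytes) listing the given page URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as & in the URL text.
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


def write_sitemap(urls, path="sitemap.xml"):
    """Write the plain file at `path` and a gzipped copy at `path` + '.gz'."""
    xml_bytes = build_sitemap(urls)
    with open(path, "wb") as plain_file:
        plain_file.write(xml_bytes)
    with gzip.open(path + ".gz", "wb") as gz_file:
        gz_file.write(xml_bytes)
    return xml_bytes


if __name__ == "__main__":
    # Placeholder URLs; replace them with the pages of your own site.
    pages = ["http://www.yoursite.com/", "http://www.yoursite.com/about.html"]
    print(build_sitemap(pages).decode("utf-8"))
```

Calling `write_sitemap(pages)` would leave sitemap.xml and sitemap.xml.gz in the current directory, ready to upload to the root of your domain.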

A good way to start is via Google's Webmaster Tools website. Once you have uploaded your sitemap file to your domain, you can submit it to Google. An advantage of this approach is that Google will, in due course, evaluate the sitemap file and report any errors in it.

The real news came last week, when Google, Yahoo! and Ask indicated that another way to inform them of the sitemap file is to include its precise URL in the robots.txt file. Every domain should have a robots.txt file, even if it is an empty file. Search engine robots (or spiders) will sometimes visit a domain and check only the robots.txt file. This confirms that the domain is live. Without such a file, an error (typically a 404 Not Found) is recorded in your server logs. You can now add an additional line anywhere in the file, say at the bottom, that reads as follows:
Sitemap: http://www.yoursite.com/sitemap.xml.gz

The robots.txt file is normally checked often by search engine spiders, so adding this line should get the new sitemap file picked up quickly. Ask, Google and Yahoo! all support this robots.txt approach.
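As a rough illustration of the robots.txt change, the Python sketch below appends the Sitemap line to existing robots.txt content only when it is not already there, then reads it back the way a crawler might. The helper names `add_sitemap_directive` and `listed_sitemaps` are invented for this example, and `site_maps()` requires Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser


def add_sitemap_directive(robots_text, sitemap_url):
    """Return robots.txt content with a Sitemap line for sitemap_url.

    The text is returned unchanged if that exact sitemap is already listed.
    """
    for line in robots_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "sitemap" and value.strip() == sitemap_url:
            return robots_text
    # Make sure the new directive starts on its own line.
    if robots_text and not robots_text.endswith("\n"):
        robots_text += "\n"
    return robots_text + "Sitemap: " + sitemap_url + "\n"


def listed_sitemaps(robots_text):
    """Return the sitemap URLs a standard-library parser finds in the text."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser.site_maps() or []


if __name__ == "__main__":
    existing = "User-agent: *\nDisallow:"
    updated = add_sitemap_directive(
        existing, "http://www.yoursite.com/sitemap.xml.gz"
    )
    print(updated)
    print(listed_sitemaps(updated))
```

Because the function leaves the text alone when the line is already present, it is safe to run it every time you regenerate the sitemap.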

If you have just uploaded a sitemap file and want to be sure that it is picked up as soon as possible, you can ping the search engines directly. The following URLs are the appropriate way to do this.

Ask:
http://submissions.ask.com/ping?sitemap=http%3A%2F%2Fwww.yoursite.com%2Fsitemap.xml.gz

Google:
http://www.google.com/webmasters/sitemaps/ping?sitemap=http%3A%2F%2Fwww.yoursite.com%2Fsitemap.xml.gz

Yahoo:
http://search.yahooapis.com/SiteExplorerService/V1/ping?sitemap=http%3A%2F%2Fwww.yoursite.com%2Fsitemap.xml.gz
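The ping URLs above all follow the same pattern: the endpoint, then `?sitemap=`, then the sitemap address with its colon and slashes percent-encoded. A small Python sketch of that assembly follows. The dictionary `PING_ENDPOINTS` and the function `ping_url` are names chosen for this example; the sketch only builds the URLs and does not send any request or check whether an endpoint responds.

```python
from urllib.parse import quote

# Ping endpoints as listed in this article.
PING_ENDPOINTS = {
    "Ask": "http://submissions.ask.com/ping",
    "Google": "http://www.google.com/webmasters/sitemaps/ping",
    "Yahoo": "http://search.yahooapis.com/SiteExplorerService/V1/ping",
}


def ping_url(endpoint, sitemap_url):
    """Build a ping URL with the sitemap address percent-encoded."""
    # safe="" makes quote() encode ':' and '/' as %3A and %2F as well.
    return endpoint + "?sitemap=" + quote(sitemap_url, safe="")


if __name__ == "__main__":
    sitemap = "http://www.yoursite.com/sitemap.xml.gz"
    for name, endpoint in PING_ENDPOINTS.items():
        print(name + ": " + ping_url(endpoint, sitemap))
```

Encoding the whole address with `safe=""` avoids the kind of slip where a literal colon is left in front of `%3A`.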

This should provide all the information you need on the sitemap file and how to alert the search engine robots that you have one. If there are additional points, hopefully someone will add them in the comments.

About the Author:
Barry Welford, President of SMM Internet Marketing Consultants, works with business owners and senior management on Internet Marketing strategy and action plans to grow their companies. He is a moderator at the Cre8asite Forums and writes on current issues on the Internet and on the Mobile Web in three blogs: BPWrap, StayGoLinks and The Other Bloke's Blog.