

Google Is Still Working On The Content Duplication Issue

By: Navneet Kaushal
2009-10-07

It seems Google is working hard on the content duplication issue. After introducing a new parameter handling tool for duplicate content problems, Google is now talking about reunifying duplicate content in the latest post on its official Webmaster blog.

Site owners have always found it difficult to handle duplicate content on their sites. Websites grow with time: new features get added, changes are introduced and removed, and content is edited, added and removed. After some time, many websites collect systematic cruft in the form of multiple URLs that show the same content. Duplicate content of this kind is not a problem in itself, but it makes it harder for search engines to crawl and index the pages. In addition, the PageRank and related information that comes from incoming links can get diffused across pages that have not been recognized as duplicates, which can make your preferred version of a page rank lower in Google. So, here are a few steps to deal with duplicate content on your site:


Recognizing Duplicate Content on the Website


The first step is to recognize the duplicate content on your website. Take a unique text snippet from a page and search for it in Google using a site: query, which limits the results to pages on your own website. If the search returns multiple results for the same content, you have duplicate content.
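As a rough sketch, the query described above can be assembled with a few lines of Python. The helper names `build_site_query` and `build_search_url`, the domain `example.com` and the search URL format are illustrative assumptions rather than anything from Google's post.

```python
from urllib.parse import quote_plus


def build_site_query(domain: str, snippet: str) -> str:
    """Return a query that looks for an exact snippet on a single domain."""
    # Collapse whitespace and drop embedded quotes so the phrase stays intact.
    cleaned = " ".join(snippet.split()).replace('"', "")
    # The site: operator limits results to the domain; the quotes ask for
    # an exact-phrase match.
    return f'site:{domain} "{cleaned}"'


def build_search_url(domain: str, snippet: str) -> str:
    """Return a search URL for the query (URL format is an assumption)."""
    return "https://www.google.com/search?q=" + quote_plus(
        build_site_query(domain, snippet)
    )
```

Pasting the resulting query into the search box and counting the results is usually enough; the URL helper is only a convenience for checking many snippets by hand.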



Determining Preferred URLs


Before you fix the duplicate content issue, decide on your preferred URL structure: for each piece of content, determine which URL you would prefer to use.


Being Consistent Within the Website


After choosing the preferred URLs, make sure they are used in every possible location within the site, including your Sitemap file.
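One way to check the Sitemap for consistency is sketched below. It assumes a standard XML Sitemap and a preferred host name of your choosing; the function name `non_preferred_sitemap_urls` is hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

# Namespace used by the standard Sitemap protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def non_preferred_sitemap_urls(sitemap_xml: str, preferred_host: str) -> list[str]:
    """List Sitemap URLs whose host differs from the preferred host."""
    root = ET.fromstring(sitemap_xml)
    mismatches = []
    for loc in root.iter(f"{SITEMAP_NS}loc"):
        url = (loc.text or "").strip()
        if urlsplit(url).hostname != preferred_host:
            mismatches.append(url)
    return mismatches
```

Any URL this returns is a candidate for replacement with its preferred equivalent.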


Applying 301 Permanent Redirects


Where possible, redirect duplicate URLs to the preferred URLs with a 301 response code, as this helps both visitors and search engines find your preferred URLs. If the website is available on several domain names, pick one and 301 redirect the other domains to it, but make sure each request is forwarded to the matching page and not just to the root of the domain. If you support both www and non-www host names, choose one, set it as the preferred domain in Webmaster Tools and redirect the other to it.
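The page-for-page forwarding described above might look like the following sketch. The host names are placeholders and `redirect_target` is a hypothetical helper; how the redirect is actually issued depends on your web server or framework.

```python
from urllib.parse import urlsplit, urlunsplit

PREFERRED_HOST = "www.example.com"  # assumption: the host name you chose
# Hypothetical alternate hosts that serve the same content.
ALTERNATE_HOSTS = {"example.com", "example.net", "www.example.net"}


def redirect_target(url: str):
    """Return the preferred URL to 301 redirect to, or None if none is needed."""
    parts = urlsplit(url)
    if parts.hostname not in ALTERNATE_HOSTS:
        return None
    # Keep the path and query so the visitor lands on the matching page,
    # not on the root of the preferred domain.
    return urlunsplit(
        (parts.scheme, PREFERRED_HOST, parts.path, parts.query, parts.fragment)
    )
```

When the function returns a URL, the server would answer with status 301 and put that URL in the Location header.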



Implementing the rel="canonical" Link Element Where Possible


On pages where 301 redirects are not possible, the rel="canonical" link element gives search engines a better understanding of the site and of your preferred URLs. Major search engines like Yahoo!, Bing and Ask.com also support this link element.
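For illustration, the element can be generated as below and placed in the head section of each duplicate page. The function name `canonical_link` and the example URL are assumptions.

```python
from html import escape


def canonical_link(preferred_url: str) -> str:
    """Build a rel="canonical" link element pointing at the preferred URL."""
    # escape() with quote=True makes the URL safe inside a quoted attribute,
    # for example turning & into &amp;.
    return f'<link rel="canonical" href="{escape(preferred_url, quote=True)}" />'
```

Every duplicate of a page should carry the same element, pointing at the single preferred URL chosen for that content.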


Using the URL Parameter Handling Tool in Google Webmaster Tools


If some or all of the duplicate content on your website comes from URLs with query parameters, this tool will be a great help. It lets site owners tell Google which parameters in the site's URLs are important and which are irrelevant.


Robots.txt File


You can disallow crawling of duplicate content with a robots.txt file, but Google now recommends that site owners not block access to duplicate content with robots.txt or other methods. Instead, it asks them to use the rel="canonical" link element, 301 redirects and the URL parameter handling tool. If you block access to duplicate content entirely, search engines have to treat those URLs as separate, unique pages, because they cannot tell that the pages are just different URLs for the same content. It is therefore better to let these URLs be crawled and to mark them clearly as duplicates through one of the methods recommended above. Once you allow Google to crawl these URLs, Googlebot will learn to identify duplicate content by looking at the URL, so it should largely avoid unnecessary recrawls. In the few cases where duplicate content still leads to too much crawling of the website, you can adjust the crawl rate setting in Webmaster Tools.
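For duplicates that come from query parameters, it can help to see which URLs collapse together once the irrelevant parameters are ignored. The sketch below does that offline; the parameter names in `IGNORED_PARAMS` are hypothetical, and the code is unrelated to how Google's tool works internally.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameters that do not change the page content.
IGNORED_PARAMS = {"sessionid", "sort", "ref"}


def normalize(url: str) -> str:
    """Drop ignored parameters and sort the rest so equivalent URLs match."""
    parts = urlsplit(url)
    kept = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in IGNORED_PARAMS
    )
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def group_duplicates(urls):
    """Map each normalized URL to the original URLs that reduce to it."""
    groups = defaultdict(list)
    for url in urls:
        groups[normalize(url)].append(url)
    # Only groups with more than one member indicate duplicate URLs.
    return {key: members for key, members in groups.items() if len(members) > 1}
```

Each group that comes back is a set of URLs showing the same content, which suggests the ignored parameters are the ones to report as irrelevant.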







About the Author:
Nav is the founder and CEO of PageTraffic, a premier search engine company known for its assured SEO service, web design and development, copywriting and full time SEO professionals.

Navneet has wide experience in natural search engine optimization, internet marketing and PPC campaigns. He is a prolific writer and his articles can be found in the "Best Articles" section of many websites and article banks. As a search engine analyst, he has over 9 years of experience, and he applies that knowledge here.

