

Google Process For Detecting Near Duplicate Content!

By: Navneet Kaushal
2008-02-28

For many webmasters, duplicate content has always been a persistent issue. Google has already obtained a patent on duplicate content detection. Now it has gone even further with another patent application, developed by Monika H. Henzinger.

This new Google patent application explores how duplicate and near-duplicate content might be detected at different web addresses. It also draws on several different existing methods for detecting near-duplicate content. With the growing popularity of blogs and RSS syndication, the amount of duplicated content has increased manyfold, and search engine companies, including Google, are doing their best to cut down on it.

The patent application cites a number of documents on the Web that explore duplicate and near-duplicate content, including a process developed by Moses Charikar, a Princeton professor who is listed as the inventor of a Google patent granted early last year. That patent, Methods and apparatus for estimating similarity, discusses ways to detect similar content on the Web.
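To give a feel for what estimating similarity can look like in practice, here is a minimal Python sketch of a fingerprint in the spirit of Charikar's random-projection idea. It is an illustration only, not the method claimed in either patent; the function names, the whitespace tokenizer, the use of MD5 and the 64-bit width are all assumptions made for this example.

```python
import hashlib

BITS = 64  # fingerprint width; an assumption chosen for this sketch


def simhash(text: str) -> int:
    """Build a 64-bit fingerprint; texts sharing many tokens tend to differ in few bits."""
    counts = [0] * BITS
    for token in text.lower().split():
        # Hash each token to a stable 64-bit integer (first 8 bytes of its MD5 digest).
        h = int.from_bytes(hashlib.md5(token.encode("utf-8")).digest()[:8], "big")
        for i in range(BITS):
            # Each token votes +1 or -1 on every bit position.
            counts[i] += 1 if (h >> i) & 1 else -1
    fingerprint = 0
    for i in range(BITS):
        # A bit is set only when the positive votes outnumber the negative ones.
        if counts[i] > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")
```

Two pages would then be compared with hamming_distance(simhash(page_a), simhash(page_b)). A small distance suggests similar content, and the cutoff is something you would have to tune for your own data.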

Dr. Henzinger tests and explores the approaches described in those documents. The tests showed differences in how effective the approaches were, but the patent application's conclusion was that neither of the algorithms worked well for finding near-duplicate pairs on the same Website, though both achieved high precision for near-duplicate pairs on different Websites.
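Precision here means the share of returned pairs that really are near-duplicates. As a rough sketch of how that figure could be split into same-site and different-site pairs, consider the Python below. The function names and the example.com-style URLs are hypothetical, and the toy data is made up for illustration; it is not taken from the patent application's tests.

```python
from urllib.parse import urlparse


def same_site(url_a: str, url_b: str) -> bool:
    """Treat two URLs as the same site when their host names match."""
    return urlparse(url_a).netloc.lower() == urlparse(url_b).netloc.lower()


def precision_by_site(returned_pairs, correct_pairs):
    """Return precision separately for same-site and different-site pairs."""
    # Store judged pairs as frozensets so (a, b) and (b, a) count as the same pair.
    correct = {frozenset(p) for p in correct_pairs}
    counts = {"same_site": [0, 0], "different_site": [0, 0]}  # [hits, total]
    for a, b in returned_pairs:
        key = "same_site" if same_site(a, b) else "different_site"
        counts[key][1] += 1
        if frozenset((a, b)) in correct:
            counts[key][0] += 1
    # Precision is undefined (None) for a group with no returned pairs.
    return {key: (hits / total if total else None)
            for key, (hits, total) in counts.items()}


# Made-up pairs returned by some detector, and the ones a human judged correct.
returned = [
    ("http://a.example/1", "http://a.example/2"),
    ("http://a.example/1", "http://a.example/3"),
    ("http://a.example/1", "http://b.example/1"),
    ("http://a.example/2", "http://b.example/2"),
]
judged_correct = [
    ("http://a.example/2", "http://a.example/1"),
    ("http://a.example/1", "http://b.example/1"),
    ("http://a.example/2", "http://b.example/2"),
]
result = precision_by_site(returned, judged_correct)
```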

Of the techniques it describes, the patent application concluded: "These near-duplicate detection techniques performed well, particularly when analyzing Web pages from the same Website. These techniques did so without sacrificing much in the number of returned correct pairs."

Though we cannot say that this new patent will bring duplication and near duplication fully to an end, things are going to get better. As they say, something is better than nothing.



About the Author:
Nav is the founder and CEO of PageTraffic, a premier search engine company known for its assured SEO service, web design and development, copywriting and full-time SEO professionals.

Navneet has wide experience in natural search engine optimization, internet marketing and PPC campaigns. He is a prolific writer and his articles can be found in the "Best Articles" section of many websites and article banks. As a search engine analyst, he has over 9 years of experience, and he applies that knowledge here.

