PDFs In Google Search Results

By: Benj Arriola
2008-08-14

Opening the first document from the sample SERP above in Adobe Acrobat and clicking Document Properties shows the following:

PDF Document Properties showing Metadata

Does PDF metadata have any significant effect on ranking?



This probably needs further testing before anyone can offer a sound theory backed by consistent, repeatable results. Whether that testing is worth the time depends on its significance to your business.

There are factors we are aware of, and there are factors we cannot really know until they are tested further, so for now we can only make intelligent guesses. A guess is still a guess. Nevertheless, my best guesses based on this observation are:

  • Google reads Metadata in PDFs.

  • The Author will appear in the SERP description snippets if placed in the Metadata properties.

  • Your PDF will get the extra screen real estate in the SERPs that will hopefully be more enticing to click.
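To check what metadata a PDF exposes without opening Acrobat, the fields can be read programmatically. The sketch below is only an illustration: it assumes the Title and Author are stored uncompressed as plain `/Key (value)` pairs, which is common in simple PDFs but not guaranteed, and the function name `read_info_fields` and the sample bytes are made up for this example. It says nothing about how Google actually reads the file.

```python
import re

def read_info_fields(pdf_bytes: bytes) -> dict:
    """Return literal-string metadata fields such as Title and Author.

    Assumption: the fields are stored uncompressed as "/Key (value)" pairs.
    PDFs that compress or encrypt their metadata will return nothing here.
    """
    fields = {}
    for key in (b"Title", b"Author", b"Subject", b"Keywords"):
        # Match "/Author (some text)", allowing escaped characters in the value.
        match = re.search(rb"/" + key + rb"\s*\(((?:\\.|[^\\()])*)\)", pdf_bytes)
        if match:
            # Undo the backslash escapes used for parentheses and backslashes.
            value = re.sub(rb"\\([()\\])", rb"\1", match.group(1))
            fields[key.decode("ascii")] = value.decode("latin-1")
    return fields

# Hypothetical fragment of a PDF file, not taken from the documents discussed here.
sample = b"1 0 obj\n<< /Title (Sample Paper) /Author (J. Doe) >>\nendobj\n"
print(read_info_fields(sample))
# prints {'Title': 'Sample Paper', 'Author': 'J. Doe'}
```

For a real file, the bytes would come from something like `open("paper.pdf", "rb").read()`. A dedicated PDF library is the safer choice for anything beyond a quick check.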



But read further into this blog post and it seems Google knows more than the metadata.

"Cited by:" Google SERPs Description Snippet



Cited By: Appearing in Google SERPs for PDF Documents

After tracing my sources of information, it seems this was first asked by TomHTML on GoogleBlogoscoped, where Philipp Lenssen followed up with a blog post about it. As mentioned in the forum:


When you are looking for PDF files, Google now displays authors of the PDF file and publication date. Only when it is available.
google.com/search?q=site%3Amem ...

It may be related to Google Scholar ("cited by...")
google.com/search?hl=en&q= ...

Have you ever seen that before?


I decided to check this further. I looked at the first result of this query, the same example used on GoogleBlogoscoped.



As explained above, the Author in the SERPs is most probably pulled from the Metadata of this PDF file. Clicking the Cited by link goes to Google Scholar, which shows this on the page:



I downloaded the PDF file and looked for the citation. The whole document does not mention the former document except on the references page.

PDF References page, showing citation in SERPs

I'm quite surprised, knowing that there is not even a link. The document was simply mentioned on the references page. It looks like Google is doing some advanced bibliography deciphering. Bibliographies are where citations in books are officially placed, and bibliography entries follow standard formatting rules that differ only slightly from book to book. If Google can read bibliographies well, then this is a whole new kind of PDF interconnectivity.

This may or may not have any bearing on how a PDF document ranks in the universal search results, but I see no reason why it should not. If you have a PDF of a popular book, and that book has been a reference for many people because of its excellent content, that already shows the authority of the book's author. It seems only right to give that book some higher authority for being referenced in bibliographies.
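To make the idea of a link-free citation concrete, here is a minimal sketch of how a "cited by" list could be built from reference pages alone. It is a guess at the general approach, not Google's method: the function names `normalize` and `cited_by`, the file names, and the reference entries are all hypothetical.

```python
import re

def normalize(text: str) -> str:
    """Lowercase and collapse punctuation so citation styles compare equally."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()

def cited_by(target_title: str, documents: dict) -> list:
    """Return the names of documents whose reference entries mention the title."""
    needle = normalize(target_title)
    citing = []
    for name, references in documents.items():
        # No hyperlink is needed: a plain-text mention in the references counts.
        if any(needle in normalize(entry) for entry in references):
            citing.append(name)
    return citing

# Hypothetical reference pages, written in two different citation styles.
documents = {
    "paper_a.pdf": ["[1] Doe, J. (2001). Memory Models in Practice. Journal of Examples, 4(2), 10-20."],
    "paper_b.pdf": ["Roe, R. 1999. An Unrelated Study. Example Press."],
    "paper_c.pdf": ['J. Doe, "Memory models in practice," J. Examples, vol. 4, 2001.'],
}
print(cited_by("Memory Models in Practice", documents))
# prints ['paper_a.pdf', 'paper_c.pdf']
```

Normalizing both sides is what lets two differently formatted entries count as the same citation. A real system would also need to handle abbreviated titles, typos and author matching.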

Is the Author information really coming from the Metadata?



In Google Scholar, the Authors are placed above the description snippet.



After downloading that document and checking the metadata, something is quite different this time. The Author and Title Metadata do not match the Author and Title on the document, yet Google seems to know the right Authors to place on the SERPs.

Incorrect PDF Metadata ignored in Google Scholar

What does this suggest? Like a bibliography, a research paper as a whole follows a writing format. Google Scholar is all about scholarly research papers, and these all follow general research paper writing formats. Looking at these patterns, Google seems able to pull out the essential data and parse it into the database fields it needs.
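As a rough illustration of format-based extraction, the sketch below guesses the title and authors from the first lines of a paper's text and checks them against the metadata. The layout assumption (title on the first non-empty line, authors on the second) is a deliberate simplification, and the function names and sample data are hypothetical. It is not a description of what Google Scholar actually does.

```python
import re

def guess_title_and_authors(first_page_text: str) -> dict:
    """Guess the title and authors from a paper's first page.

    Assumption: the first non-empty line is the title and the second
    lists the authors, separated by commas or the word "and".
    """
    lines = [line.strip() for line in first_page_text.splitlines() if line.strip()]
    title = lines[0] if lines else ""
    authors = []
    if len(lines) > 1:
        authors = [a.strip() for a in re.split(r",|\band\b", lines[1]) if a.strip()]
    return {"title": title, "authors": authors}

def metadata_matches(metadata: dict, guessed: dict) -> bool:
    """True only when the metadata agrees with what the page text shows."""
    same_title = metadata.get("Title", "").strip().lower() == guessed["title"].lower()
    meta_author = metadata.get("Author", "").strip().lower()
    same_author = meta_author in [a.lower() for a in guessed["authors"]]
    return same_title and same_author

# Hypothetical first page and leftover metadata from an authoring tool.
first_page = "A Study of Example Systems\nJane Doe, Richard Roe and Ann Poe\n\nAbstract\n"
metadata = {"Title": "Microsoft Word - draft_v3.doc", "Author": "admin"}

guessed = guess_title_and_authors(first_page)
print(guessed["authors"])
print(metadata_matches(metadata, guessed))
```

When the two sources disagree, as they do in this sample, a parser that trusts the visible text over the metadata would still show the right authors, which fits the behavior observed above.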

If the bibliographies are a new form of link building, is this exploitable?



Link builders who build links like... well, crazy, would probably not get far here. All Google Scholar results come from research journals that have archive copies on the web. To get into Google Scholar, you really have to produce good, scholarly quality work and do your best to get it into research journals first.

Do I see this as essential in real life SEO?



It depends on who you are doing SEO for. A client that can be an authority in science, engineering, technology and similar industries may benefit from this by leveraging the white papers on their technology. Highly informative research that becomes popular purely because of its quality may serve in the same way link bait does.

I do not see myself suggesting that a client come up with a high quality research paper, especially if there is nothing to produce. I would, however, ask whether they already have research published in journals and try to leverage everything else from there.

About the Author:
Benj Arriola is the 2007 SEO World Champion and won 2nd Place in the 2008 UK Webmaster World SEO Contest. He is an insightful web expert who brings to BusinessOnLine his solid methodologies for using technology and the power of the community to solve tough SEO challenges.

