More Searchable Content and Content Types

The emphasis throughout this book has been on providing the crawlers with textual content semantically marked up using HTML. However, less accessible document types, such as multimedia, content behind forms, and scanned historical documents, are increasingly being integrated into the search engine results pages (SERPs) as search algorithms evolve in the ways they collect, parse, and interpret data. Greater demand, availability, and usage also fuel the trend.

Engines Will Make Crawling Improvements

The search engines are breaking down some of the traditional limitations on crawling, and are starting to handle content types that they could not previously crawl or interpret. For example, in November 2011, Google announced that it had increased its capability to execute JavaScript, discover content embedded in AJAX, and process forms.

In June 2009, Google announced that it had improved the crawling and indexing of Flash content (http://googlewebmastercentral.blogspot.com/2009/06/flash-indexing-with-external-resource.html). In particular, this announcement indicated that Google was now able to load content within Flash that was accessed by external JavaScript calls, which is an implementation method that many Flash-based systems use.

Perhaps the bigger problem with Flash is the fact that it is inherently nontextual. It is essentially like any other video format: there is little incentive within the medium to use lots of text, and ...
