INDEXING

Once a search engine bot has crawled a page, it attempts to store the contents of that page in the search engine index. Common reasons a search engine may not store the contents of a page include:

  • Redirects are incorrectly implemented. Most of the time, when you move content, you are moving it permanently. In those cases, implement redirects using the 301 HTTP status code. This code signifies that the change is permanent. Commonly, redirects default to the 302 HTTP status code, which signifies that the change is temporary. From a search engine perspective, this difference is important because if you’ve moved content (for instance, when changing domains or URL structure), you want the search engines to replace the old URLs with the new ones in the index. If possible, avoid JavaScript and meta refresh redirects.
  • The content is locked behind registration. If you require registration to view your content, search engine bots can’t access it. A number of options exist for balancing search acquisition needs and registration requirements. For instance, you can provide abstracts of content outside of registration or you can participate in Google’s First Click Free program.13 This is discussed further at marketingintheageofgoogle.com.
  • The content is hidden in Flash or Silverlight. Search engines have gotten better at crawling Flash pages, but a number of problems remain. You can find a list of resources on making Flash accessible to search engines at marketingintheageofgoogle.com.
