Hidden Content

In "Content Delivery and Search Spider Control" in Chapter 6, we discussed ways that you can hide content from the search engines when you want to. However, at times this is done unintentionally; that is, sometimes publishers produce great content and then, for one reason or another, fail to expose that content to search engines.

Valuable content can be inadvertently hidden from the search engines, and occasionally, the engines can find hidden content and construe it as spam, whether that was your intent or not.

Identifying Content That Engines Don’t See

How do you determine when this is happening? Sometimes the situation is readily apparent; for example, if you have a site that receives high traffic volume and then your developer accidentally NoIndexes every page on the site, you will begin to see a catastrophic drop in traffic. Most likely this will set off a panicked investigation that identifies the NoIndex issue as the culprit.

Does this really happen? Unfortunately, it does. Here is an example scenario. Suppose you work on site updates on a staging server. Because you don’t want the search engines to discover this duplicate version of your site, you keep the pages on the staging server NoIndexed. Normally, when you move the site from the staging server to the live server, you remove the NoIndex tags. However, one day you forget to remove the tags. This is just normal human error in action.
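
One way to guard against this kind of slip is to check pages for a NoIndex directive automatically before (or right after) they go live. The following is a minimal sketch in Python, not a complete auditing tool. The names `RobotsMetaParser` and `has_noindex` are hypothetical helpers invented for this illustration, and the sketch assumes you already have the page's HTML and, optionally, its HTTP response headers in hand. It looks at the two common places a NoIndex directive appears: a robots meta tag and an `X-Robots-Tag` response header.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> (or "googlebot") tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so default to "".
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").strip().lower() in ("robots", "googlebot"):
            for directive in attr_map.get("content", "").split(","):
                self.directives.add(directive.strip().lower())


def has_noindex(html, headers=None):
    """Return True if the HTML or the response headers carry a noindex directive.

    `headers` is an optional dict of HTTP response headers (hypothetical input;
    how you fetch the page is up to you).
    """
    parser = RobotsMetaParser()
    parser.feed(html)
    if "noindex" in parser.directives:
        return True

    for name, value in (headers or {}).items():
        if name.lower() != "x-robots-tag":
            continue
        for token in value.split(","):
            # Handle both "noindex" and the "googlebot: noindex" form.
            if token.split(":")[-1].strip().lower() == "noindex":
                return True
    return False
```

For example, `has_noindex('<meta name="robots" content="noindex, nofollow">')` returns `True`, while a page whose robots meta tag reads `index, follow` returns `False`. Running a check like this against a handful of key URLs as part of the move from staging to live would catch the forgotten-tag scenario described above before traffic starts to fall.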

This type of problem can also emerge in another scenario. Some webmasters ...

