Benchmarking Current Indexing Status

The search engines have an enormous task: indexing the world’s online content, or at least most of it. They try hard to discover all of it, but they choose not to include all of it in their indexes. A page can be left out for a variety of reasons: it may be inaccessible to the spider, it may have been penalized, or it may not have enough link juice to merit inclusion.

When you launch a new site or add new sections to an existing site, or if you are dealing with a very large site, not every page will necessarily make it into the index. To get a handle on this you will want to actively track the indexing level of your site. If your site is not fully indexed, it could be a sign of a problem (not enough links, poor site structure, etc.).

Getting basic indexation data from search engines is pretty easy. All three major search engines support the same basic query syntax: site:yourdomain.com. Entering this in the search box returns the pages the engine has indexed from that domain. Figure 4-12 shows a sample of the output from Bing.

Figure 4-12. Indexing data from Bing

Keeping a log of the level of indexation over time can help you understand how things are progressing. This can take the form of a simple spreadsheet.
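If you would rather keep that log in a file than in a hand-edited spreadsheet, the idea can be sketched in a few lines of Python. This is only an illustration: the function names, the CSV column names, and the file name in the usage note are assumptions made for this example, and the page counts still have to be read manually from each engine's site: results.

```python
import csv
from datetime import date
from pathlib import Path


def log_index_count(csv_path, engine, indexed_pages, day=None):
    """Append one dated observation to a CSV log that opens in any spreadsheet.

    The column layout (date, engine, indexed_pages) is an assumption
    made for this sketch, not a standard format.
    """
    path = Path(csv_path)
    # Write the header row only when the file is new or empty.
    is_new = not path.exists() or path.stat().st_size == 0
    day = day or date.today()
    with path.open("a", newline="") as f:
        writer = csv.writer(f)
        if is_new:
            writer.writerow(["date", "engine", "indexed_pages"])
        writer.writerow([day.isoformat(), engine, indexed_pages])


def read_log(csv_path):
    """Return the logged observations as a list of dicts (values are strings)."""
    with open(csv_path, newline="") as f:
        return list(csv.DictReader(f))
```

Calling log_index_count("index_log.csv", "Bing", 1200) once a week, with whatever count the site: query reports that day, builds up a timeline that can be charted later. The figure 1200 is a placeholder, not real data.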

Related to indexation is the crawl rate of the site. Google provides this data in Google Webmaster Tools. Figure 4-13 shows a screenshot representative of the crawl rate charts that are available (another chart, not shown here, displays the average time spent downloading a page on your site).

Figure 4-13. Crawl data from Google Webmaster Tools

Short-term spikes are not a cause for concern, nor are periodic drops in levels of crawling. What is important is the general trend. In Figure 4-13, the crawl rate seems to be drifting upward. This bodes well for both rankings and indexation.
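One way to make "general trend" concrete is to fit a straight line through the crawl counts and look at its slope. The sketch below uses an ordinary least-squares fit against the observation index. That choice of method is an assumption for illustration, not something the crawl rate charts themselves compute, and the function name is hypothetical.

```python
def trend_slope(values):
    """Least-squares slope of the values against their position (0, 1, 2, ...).

    A positive result means the series drifts upward overall; the unit is
    "pages crawled per observation period".
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    # Covariance of position and value, divided by the variance of position.
    num = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den
```

With made-up daily counts such as [100, 100, 900, 100, 100], the single spike in the middle leaves the slope at zero, while a steadily growing series such as [100, 110, 120, 130] gives a slope of about 10 pages per day.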

For the other search engines, crawl-related data can be extracted from your server logs using logfile analyzers (see Auditing an Existing Site to Identify SEO Problems). You can then build and monitor a similar timeline.
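If no logfile analyzer is at hand, a rough version of this count can be sketched directly in Python. The example assumes the server writes the common "combined" access log format and that crawlers identify themselves in the user-agent string. The token table, the regular expression, and the function name are illustrative assumptions, and user-agent strings can be spoofed, so treat the counts as approximate.

```python
import re
from collections import Counter
from datetime import datetime

# Matches the tail of a combined-format line:
# [timestamp] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'\[(?P<ts>[^\]]+)\] "(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Lowercase substrings looked for in the user-agent (assumed mapping,
# extend as needed for the crawlers you care about).
CRAWLER_TOKENS = {"bingbot": "Bing", "slurp": "Yahoo!", "googlebot": "Google"}


def crawler_hits_per_day(lines):
    """Count requests per (ISO date, engine) for recognized crawlers."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if not match:
            continue  # skip lines that are not in the expected format
        agent = match.group("agent").lower()
        # The timestamp looks like 10/Oct/2011:13:55:36 -0700; drop the offset.
        stamp = match.group("ts").split()[0]
        day = datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S").date()
        for token, engine in CRAWLER_TOKENS.items():
            if token in agent:
                hits[(day.isoformat(), engine)] += 1
                break
    return hits
```

Passing an open log file to crawler_hits_per_day yields per-day counts for each engine, which can be logged and charted in the same way as the indexation figures.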
