Defining Your Site’s Information Architecture

Whether you’re working with an established website or not, you should plan to research the desired site architecture (from an SEO perspective) at the start of your SEO project. This task can be divided into two major components: technology decisions and structural decisions.

Technology Decisions

As we outlined previously in this chapter, your technology choices can have a major impact on your SEO results. The following is an outline of the most important issues to address at the outset:

Dynamic URLs

Although Google now states that it has no trouble with dynamic URLs, this is not entirely true in practice, nor is it the case for the other search engines. Make sure your CMS does not render your pages on URLs with many convoluted parameters in them.

Session IDs or user IDs in the URL

It used to be very common for CMSs to track individual users surfing a site by adding a tracking code to the end of the URL. Although this worked well for tracking, it was not good for search engines, because they saw each URL as a different page rather than as a variant of the same page. Make sure your CMS never places session IDs in URLs. If you cannot prevent this, make sure you use rel="canonical" on your pages (Chapter 6 explains what this is and how to use it).

Superfluous flags in the URL

Related to the preceding two items is the notion of extra junk being present in the URL. This probably does not bother Google, but it may bother the other search engines, and it interferes with the user experience for your site.
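
The three URL issues above can be illustrated with a minimal sketch that strips session and tracking parameters from a URL and emits a rel="canonical" tag for the cleaned address. The parameter names in TRACKING_PARAMS, the function names, and the example.com URLs are hypothetical placeholders, not something any particular CMS is known to use; substitute whatever your own platform appends.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameter names; replace with the ones your CMS actually emits.
TRACKING_PARAMS = {"sessionid", "sid", "phpsessid", "userid"}


def canonical_url(url: str) -> str:
    """Return the URL with session/tracking parameters and fragment removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in TRACKING_PARAMS
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def canonical_link_tag(url: str) -> str:
    """Build the <link> element that points at the cleaned URL."""
    return f'<link rel="canonical" href="{escape(canonical_url(url), quote=True)}">'


if __name__ == "__main__":
    print(canonical_link_tag("http://www.example.com/shoes?color=red&sessionid=abc123"))
```

The preferred fix is still to keep such parameters out of the URL in the first place; the canonical tag is the fallback for cases where that is not possible.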

Links or content based in JavaScript, Java, or Flash

Search engines often cannot see links and content implemented using these technologies. Make sure the plan is to expose your links and content in simple HTML text.

Content behind forms (including pull-down lists)

Making content accessible only after the user has completed a form (such as a login) or made a selection from an improperly implemented pull-down list is a great way to hide content from the search engines. Do not use these techniques unless you want to hide your content!

Temporary (302) redirects

Temporary redirects are a common problem in web server platforms and CMSs. A 302 redirect prevents a search engine from recognizing that you have permanently moved the content, and it can be very problematic for SEO because 302 redirects block the passing of PageRank. Make sure the default redirect your systems use is a 301, or learn how to configure them so that it becomes the default.
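
One way to check which redirect type a platform issues is to request a moved URL without following the redirect and inspect the status code. The sketch below assumes nothing beyond the Python standard library; the Hop record and the function names are hypothetical, and treating 307 as temporary alongside 302 is an addition to the text above, based on the HTTP status code definitions.

```python
import http.client
from typing import Iterable, List, NamedTuple


class Hop(NamedTuple):
    url: str
    status: int
    location: str


# 302 is the code discussed above; 307 is HTTP's other temporary redirect.
TEMPORARY_REDIRECTS = {302, 307}


def fetch_hop(host: str, path: str = "/") -> Hop:
    """Issue a HEAD request without following redirects and record the result."""
    conn = http.client.HTTPSConnection(host, timeout=10)
    try:
        conn.request("HEAD", path)
        response = conn.getresponse()
        return Hop(f"https://{host}{path}", response.status,
                   response.getheader("Location", ""))
    finally:
        conn.close()


def flag_temporary_redirects(hops: Iterable[Hop]) -> List[Hop]:
    """Return the hops that answered with a temporary redirect."""
    return [hop for hop in hops if hop.status in TEMPORARY_REDIRECTS]
```

In practice you would call fetch_hop for each URL that is supposed to have moved permanently and review anything flag_temporary_redirects returns.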

All of these are examples of basic technology choices that can adversely affect your chances for a successful SEO project. Do not be fooled into thinking that SEO issues are understood, let alone addressed, by all CMS vendors out there—unbelievably, many are still very far behind the SEO curve. It is also important to consider whether a “custom” CMS is truly needed when many CMS vendors are creating ever more SEO-friendly systems—often with much more flexibility for customization and a broader development base. There are also advantages to selecting a widely used CMS, including portability in the event that you choose to hire different developers at some point.

Also, do not assume that all web developers understand the SEO implications of what they develop. Learning about SEO is not a requirement to get a software engineering degree or become a web developer (in fact, almost no known college courses address SEO). It is up to you, the SEO expert, to educate the other team members on this issue as early as possible in the development process.

Structural Decisions

One of the most basic decisions to make about a website concerns internal linking and navigational structures, which are generally mapped out in a site architecture document. What pages are linked to from the home page? What pages are used as top-level categories that then lead site visitors to other related pages? Do pages that are relevant to each other link to each other? Many factors go into determining a linking structure for a site, and it is a major usability issue because visitors rely on the links to move around your website. For search engines, the navigation structure helps their crawlers determine what pages you consider the most important on your site, and it helps them establish the relevance of the pages on your site to specific topics.

Chapter 6 covers site architecture and structure in detail. This section will simply reference a number of key factors that you need to consider before launching into developing or modifying a website. The first step will be to obtain a current site architecture document for reference, or to build one out for a new site.

Target keywords

As we will discuss in Chapter 5, keyword research is a critical component of SEO. What search terms do people use when searching for products or services similar to yours? How do those terms match up with your site hierarchy? Ultimately, the logical structure of your pages should match up with the way users think about products and services like yours. Figure 4-2 shows how this is done on the Amazon site.

Figure 4-2. Example of a well-thought-out site hierarchy

Cross-link relevant content

Linking between articles that cover related material can be very powerful. It helps the search engine ascertain with greater confidence how relevant a web page is to a particular topic. This can be extremely difficult to do well if you have a massive ecommerce site, but Amazon solves the problem very well, as shown in Figure 4-3.

Figure 4-3. Product cross-linking on Amazon

The “Frequently Bought Together” and “What Do Customers Ultimately Buy After Viewing This Item?” sections are brilliant ways to group products into categories that establish the relevance of the page to certain topic areas, as well as to create links between relevant pages.

In the Amazon system, all of this is rendered on the page dynamically, so it requires little day-to-day effort on Amazon’s part. The “Customers Who Bought...” data is part of Amazon’s internal databases, and the “Tags Customers Associate...” data is provided directly by the users themselves.

Of course, your site may be quite different, but the lesson is the same. You want to plan on having a site architecture that will allow you to cross-link related items.

Use anchor text

Anchor text is one of the golden opportunities of internal linking. As an SEO practitioner, you need to have in your plan from the very beginning a way to use keyword-rich anchor text in your internal links. Avoid using text such as “More” or “Click here,” and make sure the technical and creative teams understand this. You also need to invest time in preparing an anchor text strategy for the site.
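
As a rough illustration of how a team might audit existing pages for this, the sketch below scans an HTML fragment for links whose anchor text is generic. The GENERIC_PHRASES list, the class name, and the sample markup are assumptions chosen for the example, and a real audit would need a list tuned to your own site.

```python
from html.parser import HTMLParser

# Assumed list of uninformative phrases; extend it for your own site.
GENERIC_PHRASES = {"more", "click here", "read more", "here", "learn more"}


class AnchorAudit(HTMLParser):
    """Collect (href, text) pairs for links whose anchor text is generic."""

    def __init__(self):
        super().__init__()
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text fragments seen inside that <a>
        self.generic_links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            text = " ".join("".join(self._text).split())  # collapse whitespace
            if text.lower() in GENERIC_PHRASES:
                self.generic_links.append((self._href, text))
            self._href = None


def find_generic_anchors(html: str):
    parser = AnchorAudit()
    parser.feed(html)
    parser.close()
    return parser.generic_links
```

Each flagged link is a candidate for rewriting with descriptive, keyword-rich text as part of the anchor text strategy.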

Use breadcrumb navigation

Breadcrumb navigation is a way to show users where they are in the navigation hierarchy. Figure 4-4 shows an example from PetSmart.

Figure 4-4. Breadcrumb bar on PetSmart.com

This page is currently two levels down from the home page. Also, note how the anchor text in the breadcrumb is keyword-rich, as is the menu navigation on the left. This is helpful to both users and search engines.
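
A breadcrumb trail of this kind can be generated from the page's position in the hierarchy. The following is a minimal sketch, assuming the trail is available as a list of (label, URL) pairs; the function name, the labels, and the URLs are invented for illustration and are not taken from PetSmart's site.

```python
from html import escape


def breadcrumb_html(trail):
    """Render a breadcrumb from (label, url) pairs ordered home page first.

    Every level except the current page becomes a link, so the labels
    double as keyword-rich anchor text.
    """
    parts = []
    last = len(trail) - 1
    for index, (label, url) in enumerate(trail):
        if index == last:
            parts.append(f"<span>{escape(label)}</span>")
        else:
            parts.append(f'<a href="{escape(url, quote=True)}">{escape(label)}</a>')
    return " &gt; ".join(parts)


if __name__ == "__main__":
    print(breadcrumb_html([("Home", "/"), ("Dog", "/dog"), ("Dog Beds", "/dog/beds")]))
```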

Minimize link depth

Search engines (and users) look to the site architecture for clues as to what pages are most important. A key factor is how many clicks from the home page it takes to reach a page. A page that is only one click from the home page is clearly important. A page that is five clicks away is not nearly as influential. In fact, the search engine spider may never even find such a page (depending in part on the site’s link authority).

Standard SEO advice is to keep the site architecture as flat as possible, to minimize clicks from the home page to important content. Do not go off the deep end, though; too many links on a page are not good for search engines either (a standard recommendation is not to exceed 100 links from a web page; we will cover this in more detail in Chapter 6). The bottom line is that you need to plan out a site structure that is as flat as you can reasonably make it without compromising the user experience.
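
Click depth can be measured with a breadth-first search over the site's internal link graph. The sketch below assumes the graph has already been extracted (for example, from a crawl) as a dictionary mapping each page to the pages it links to. The function names and the max_depth default of 3 are arbitrary assumptions for illustration; the 100-link threshold is the recommendation quoted above.

```python
from collections import deque


def click_depths(links, home="/"):
    """Return the minimum number of clicks from the home page to each reachable page."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:  # first visit is the shortest path in BFS
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def audit_structure(links, home="/", max_depth=3, max_links=100):
    """Flag pages that are too deep, unreachable from home, or overloaded with links."""
    depths = click_depths(links, home)
    all_pages = set(links) | {t for targets in links.values() for t in targets}
    return {
        "too_deep": sorted(p for p, d in depths.items() if d > max_depth),
        "unreachable": sorted(all_pages - set(depths)),
        "too_many_links": sorted(p for p, t in links.items() if len(t) > max_links),
    }
```

Pages listed under "unreachable" have no link path from the home page at all, which is the extreme case of the depth problem described above.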

In this and the preceding sections, we outlined common structural decisions that you need to make prior to beginning your SEO project. There are other considerations, such as how you might be able to make your efforts scale across a very large site (thousands of pages or more). In such a situation, you cannot feasibly review every page one by one.
