Introduction

By now there’s no way to have missed the Internet. Over the past twelve years, it’s become a major factor in communications and business—arguably revolutionizing both. You can harness the Internet’s awesome power to find recipes for ginger maple fudge ice cream and send the best one to your grandmother on another continent. You can use it to pay bills, look up your high school sweetheart’s current phone number, and find out the common side effects for your allergy shots. You can do your grocery shopping, check your baseball team’s box scores, or auction off the Rollerblades you used only once.

In fact, you can do practically everything online these days, thanks to a Web that’s truly worldwide and growing by millions of pages per day. But all those goodies are useful only if you can find them—which would be no problem if the Internet had a complete index or a tidy table of contents…but it doesn’t. Not even close.

A Little History

The Internet is really just an ever-growing series of interconnected computer networks that has been around since the late 1960s. Initially, the people who roamed the Internet were military officials, scientists, computer programmers, and other geek types who could deal with what was then a user-hostile system. All you saw were endless lines of words, and everything you did required an arcane text command. There were no photos, and there was no clicking. Instead, there were a lot of dull amber monitors and a ton of typing.

By the early 1990s, people had begun developing not only useful systems that ran on the Internet, like email and newsgroups, but also programs that made them easier to use (think Outlook and Eudora). Among the nifty new Internet systems was the World Wide Web, a network of documents and databases that, when viewed through a software program called a Web browser, let people share information visually and use a mouse to navigate around it. Clicking turned out to be a pretty intuitive way for people to sift through information, and Web surfing was born.

But before the Internet could become commonplace in civilian life, people needed an easy way to find other people and things on the Internet. In 1994, researchers at Carnegie Mellon University introduced Lycos, one of the first Web-based search engines (programs that help you find stuff on the Web). And the people rejoiced.

Actually, the geeks rejoiced. Seeing tremendous opportunity in providing search services on the Web, a mess of technology companies sprang up and launched search sites. Some sites, like Yahoo, were directories that slotted Web pages into predetermined categories that people could then browse through. Others, like HotBot, AltaVista, Excite, and Infoseek, ran full-text search engines: programs that tracked the words on Web pages and then, when you searched for a word or phrase, sent you to the pages they knew contained it.

Unlike directories, which let people create categories and assign Web pages to them, full-text search engines use computers to record the words on Web pages. In fact, such search engines rely on two automated programs to track Web pages. First, spiders, also known as robots or crawlers, go out and methodically trawl the Web, downloading copies of pages as they go. Second, programs called indexers record the text on the downloaded pages, along with important information that’s encoded in them—things like the page title, links to other pages, and so forth. Indexers store all this stuff in a database, helpfully called an index or sometimes a catalog. To keep up to date, spiders return once a month or so to the pages they’ve visited, making note of dead links and handing off new or changed pages to the indexers for recording.

When you use a full-text search engine to look for something, it actually searches its index—not the entire live Web, which would be impossible.
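To make the spider-and-indexer idea concrete, here is a minimal sketch in Python. It is only an illustration, not how any real search engine is built: the page addresses and text in PAGES are made up and stand in for what a spider would download, and the function names (tokenize, build_index, search) are hypothetical. The point it shows is that a search consults the prebuilt index rather than the pages themselves.

```python
import re
from collections import defaultdict

# Hypothetical "downloaded" pages, standing in for what a spider would fetch.
PAGES = {
    "example.com/dogs": "Dachshund breeders and dachshund puppies",
    "example.com/shirts": "Cartoon dachshund sweatshirts from Boise",
    "example.com/baseball": "Little League roster and box scores",
}


def tokenize(text):
    """Split text into lowercase words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Indexer step: map each word to the set of pages that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index


def search(index, query):
    """Return the pages containing every query word, using only the index."""
    words = tokenize(query)
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


index = build_index(PAGES)
print(sorted(search(index, "dachshund")))
```

Searching this toy index for dachshund returns the dogs page and the sweatshirt page; searching for dachshund breeders narrows the list to the dogs page alone.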

Note

Google’s index currently has more than eight billion Web pages, according to the company.

By the late 1990s, both kinds of searches (directory and full-text) were widespread and user-friendly, and non-geeks began taking to the Internet like cats take to tuna fish.

How the Web Was Won: Google’s Technology

In the Web’s early days, full-text search engines ranked their results according to information contained on Web sites themselves, like the prominence of a certain word. If, for example, you wanted to learn about buying a small dog and you searched for dachshund, your list of sites was likely to be organized by which ones had the most instances of the word “dachshund.” The top results might well have been a site set up by a woman in Boise who painted cartoon dogs onto sweatshirts, the schedule for a group of people in Sacramento who had dachshunds with ingrown toenails, and the Daytona Dachshunds Little League roster. You could search through thousands of pages before you hit any useful information.

Even if you narrowed your search to something like “dachshund breeders,” you might still have gotten sites run by pet food conglomerates or veterinarians or any company that set up its Web pages to draw people with an interest in dogs. In short, it was maddeningly hard to get relevant search results.
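A tiny Python sketch shows why counting words made for poor rankings. Everything in it is an assumption made for illustration: the three page names and their text are invented, and rank_by_word_count is a hypothetical function, not the method of any particular early search engine.

```python
import re
from collections import Counter

# Hypothetical pages; only the second one is actually useful to a dog buyer.
PAGES = {
    "sweatshirts.example": "dachshund dachshund dachshund sweatshirts painted in Boise",
    "breed-guide.example": "How to choose a dachshund puppy from a responsible breeder",
    "little-league.example": "Daytona dachshund roster and dachshund box scores",
}


def rank_by_word_count(pages, term):
    """Order pages by how many times the search term appears on each."""
    term = term.lower()
    scores = {}
    for url, text in pages.items():
        counts = Counter(re.findall(r"[a-z]+", text.lower()))
        if counts[term]:
            scores[url] = counts[term]
    # Highest count first; ties are broken alphabetically by address.
    return sorted(scores, key=lambda url: (-scores[url], url))


print(rank_by_word_count(PAGES, "dachshund"))
```

The sweatshirt page wins simply because it repeats the word most often, and the one page a prospective dog owner would want comes in last.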

Enter Google. In 1995, Sergey Brin and Larry Page met in the graduate computer science program at Stanford University. Their idea was to create a search engine that would rank search results not on data that Webmasters could manipulate, but on the strength of the Internet itself: community input. Their technology evaluated a site primarily on how many other sites linked to it, and ranked search results accordingly. Thus, their searches tended to return results that lots of other people found useful, resulting in a surprisingly valuable system.

By 1998, Brin and Page had dropped out of Stanford to start Google. In its first year, the company, run by four employees out of a garage in Menlo Park, California, answered about 10,000 search requests per day. Today, the Web is home to about a dozen very popular search sites and likely thousands of less well-known ones, but Google’s computers handle more search requests than anyone else’s: over 250 million per day.

Google is the reigning search champ not because the company has clever marketing (it doesn’t) or a killer online dating service (again, no dice), but because the site is easy to use and effective.

Tip

Wonder what all those people are searching for? Google provides snapshots of its search activity, by month and by year, at Google Zeitgeist, www.google.com/press/zeitgeist.html. This is the perfect place to find out if anyone still cares about Martha Stewart or whether The Apprentice is declining in popularity.

How the Ranking Works

Google uses a number of elements to decide whether a Web page is a good match for a particular search. First, it looks at links. Links from one Web page to another don’t appear spontaneously; people have to make them—in effect saying, “Look here and here and here.” Because each link thus represents a decision, Google infers that a link from one page to another is tantamount to a vote for the second page. Pages with lots of votes are considered more important than other pages. For example, if a million baseball-fan Web sites all have links to MLB.com (home of Major League Baseball), Google’s logic is, “Hey, that’s an important site for people searching for the word baseball.”

In addition, Google ranks the pages that cast the votes, based on their own popularity, and gives more weight to the votes from heavily linked-to pages. Finally, Google uses this information to assign Web pages an appropriate PageRank (Google’s term for status), which it reports on a scale from zero to ten. (Section 8.4 explains PageRank further and tells you how to find it for any given site.)
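The voting idea described above can be sketched in a few lines of Python. This is a simplified version of the iteration in the publicly described PageRank approach, not Google’s production system. The damping value of 0.85, the number of iterations, the handling of pages with no outgoing links, and the four-site link graph are all assumptions chosen for illustration, and the scores come out as fractions that sum to 1 rather than as a ten-point rating.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate page importance from a dict of page -> list of linked pages."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start with equal status

    for _ in range(iterations):
        # Every page gets a small baseline, then collects votes.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its vote evenly among the pages it links to,
                # so votes from high-ranking pages carry more weight.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a page with no links spreads its vote to everyone.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical link graph: three fan sites link to mlb.com.
LINKS = {
    "fan-site-a": ["mlb.com"],
    "fan-site-b": ["mlb.com"],
    "fan-site-c": ["mlb.com", "fan-site-a"],
    "mlb.com": [],
}

rank = pagerank(LINKS)
for page in sorted(rank, key=rank.get, reverse=True):
    print(page, round(rank[page], 3))
```

In this toy graph, mlb.com comes out on top because every fan site votes for it, and fan-site-a edges out fan-site-b because it receives one extra link.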

Note

The term PageRank is actually based on the name of one of Google’s founders, Larry Page, not on the idea of Web pages.

But all that jazz would lead to nothing more than an interesting hierarchy of Web popularity if it didn’t take into account the words you’re searching for. So when you query Google, it combines PageRank with an additional system for matching text—which looks not only at the content on a first layer of pages, but at the content on pages linking to them—to produce a list of pages that is, more often than not, relevant.
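The text does not say how Google blends the two signals, so the following Python sketch makes a deliberately simple assumption: first keep only the pages whose text matches every search word, then order those matches by a precomputed popularity score. The scores, page addresses, and page text are all invented, and the sketch ignores the text of linking pages, which the real system is said to consider.

```python
# Hypothetical, precomputed popularity scores standing in for PageRank.
POPULARITY = {"mlb.com": 0.60, "fan-blog.example": 0.25, "recipes.example": 0.15}

# Hypothetical page text.
TEXT = {
    "mlb.com": "official baseball scores and schedules",
    "fan-blog.example": "my baseball blog about box scores",
    "recipes.example": "ginger maple fudge ice cream recipes",
}


def search(query, text=TEXT, popularity=POPULARITY):
    """Keep pages that contain every query word, most popular first."""
    words = query.lower().split()
    matches = [
        url for url, body in text.items()
        if all(word in body.lower().split() for word in words)
    ]
    return sorted(matches, key=lambda url: popularity.get(url, 0.0), reverse=True)


print(search("baseball"))
print(search("ice cream"))
```

A search for baseball lists mlb.com ahead of the fan blog, while the popular but irrelevant pages never show up in a search for ice cream.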

In all, the Google equation, or algorithm, incorporates 500 million variables looking at everything from links to the position of your search terms on a page. And most searches run in much less than a second.

Because the site’s methods are so complex, it’s tough—though not impossible—to jigger a page in order to improve its rank in a Google search. (See Chapter 8 for more on getting Google to find your site.)

Comparing Google with Other Searches

Most of the time, you’ll probably decide which search site to use based on the relevance of its results. But these days, many search sites return similar results, which means you might want to make your choice based on factors like speed and site design. It’s akin to buying a car today: most automobiles will get you where you want to go, but they differ in reliability, smoothness, and style. Figure I-1 compares two site designs.

For many researchers, Google is the no-brainer choice for searching the Web because it’s fast, neat, smart, and fun (Figure I-2).

About This Book

If you’re the sort of person who hasn’t quite figured out how to get to the Google home page—and you’re too embarrassed to ask any 8-year-old of your acquaintance—this book will help you not only get there but start using Google like a pro, too. If, on the other hand, Google already feels like an extension of your brain, and you’ve considered tattooing the site’s colorful logo across your forehead, this book can elevate you to search guru, helping you exploit the little-known but powerful features of Google. For example:

  • Most people who use Google every day have no idea why the results include links named “Cached” or “Similar pages” (see Section 1.4.1.4 and Section 1.4.1.8).

  • Practically nobody knows what happens when you throw the term inurl into your search string—but it’s a very handy trick (see Section 2.6.4).

    Figure I-1. Top: The Google home page. Bottom: The Yahoo home page. For many queries, different search sites will give you similar results because most search engines have adopted some variation of Google’s method of analyzing links. But Google’s unusually clean design makes it faster to load and easier to read than many other search sites.

    Figure I-2. The Google home page changes from time to time. To commemorate holidays and oddball celebrations (like Michelangelo’s birthday), Google puts up special, themed logos, drawn by Dennis Hwang, a Web designer at the company. This is the logo Google used to celebrate the leap year on February 29, 2004. (The site still works the same, no matter the logo design.) The special logos are more popular than the Beatles, and you can see the whole back catalog of them at www.google.com/holidaylogos.html.

  • Froogle (Section 5.1) is a link right on Google’s home page, but a lot of people never click through to find out the nifty things it can do.

  • Google Answers (Section 4.2) is a wicked cool service and may be one of the site’s most underused features.

  • The Google Toolbar (Section 6.1) is a total lifesaver, and if you haven’t already installed it, you need it right away.

  • And a woefully small number of people know that you can easily make Google work as a calculator, a dictionary, a package tracker, and more (Section 1.6.1).

Really knowing your way around Google lets you search smarter and faster. And knowing when not to use Google and try something else is critical, too (Section 1.5).

Of course, the Google site has help pages. But frankly, explaining how to use the site, and use it well, is not where the company shines. This book is designed to be the manual you wish came along with such a great search site.

About the Outline

This book has four parts. The first three contain several chapters, and the last part has just one:

  • Part One, Searching with Google, is all about finding the diamonds of information on the Web, even in its dusty corners. These chapters help you craft powerful search queries, search in the right places for the things you want to find, and interpret your search results to make informed decisions about which links to follow. This section of the book takes you from giving up on a search after one or two tries to holding a bagful of tricks for successful quests.

  • Part Two, Google Tools, introduces you to the Google Toolbar and then shows you how it can change your life, saving you tons of time and putting an array of search options at your fingertips. This section also covers buttons, little-known search systems, and other features that can help you search more efficiently. And it teaches you how to use a wireless gizmo to take Google everywhere you go.

  • Part Three, Google for Webmasters, shows you the best ways to help Google (and other people) find your site. It demystifies AdSense and AdWords, Google’s programs to help you make money and influence people. And it teaches you how to make great use of Google Analytics, the company’s new program for researching your own site.

  • Part Four, Gmail, covers Google’s excellent, free email system.

At the end of the book, the Appendix provides information on Web sites about Google.

The Very Basics

To use this book, and indeed to use Google, you need to know a few basics. This book assumes that you’re familiar with a few terms and concepts:

  • Clicking. This book gives you three kinds of instructions that require you to use the mouse that’s attached to your computer. To click means to point the arrow cursor at something on the screen and then, without moving the cursor at all, press and release the clicker button on the mouse (or your laptop trackpad). To double-click, of course, means to click twice in rapid succession, again without moving the cursor at all. And to drag means to move the cursor while pressing the button.

  • Menus. The menus are the words at the top of your browser: File, Edit, and so on. Click one to make a list of commands appear, as though they’re written on a window shade you’ve just pulled down.

    Some people click and release the mouse button to open a menu and then, after reading the menu command choices, click again on the one they want. Other people like to hold the mouse button down after the initial click on the menu title, drag down the list to the desired command, and only then release the mouse button. Either method works fine.

  • Keyboard shortcuts. Every time you take your hand off the keyboard to move to the mouse, you lose time and potentially disrupt your workflow. That’s why many experienced computer fans prefer to trigger menu commands and other features by pressing certain combinations on the keyboard. For example, in most word processors, you can press Ctrl+P on a PC (⌘-P on a Mac) to print. When you read an instruction like “press Ctrl+P,” start by pressing the Ctrl key, then, while it’s down, type the letter P, and finally release both keys.

About → These → Arrows

Throughout this book, and throughout the Missing Manual series, you’ll find sentences like this one: “Choose View → Explorer Bar → Search.” That’s shorthand for a much longer instruction that directs you to choose three nested commands in sequence, like this: “In your browser, you’ll find a menu item called View. Choose that. Inside the View menu is a choice called Explorer Bar; click it to open it. Inside that menu is yet another one called Search. Click to open it, too.” Figure I-3 shows you the menus this sequence opens.

Figure I-3. In this book, arrow notations help to simplify folder and menu instructions. For example, “Choose View → Explorer Bar → Search” is a more compact way of saying, “From the View menu, choose Explorer Bar; from the submenu that then appears, choose Search,” as shown here.

Similarly, this kind of arrow shorthand helps to simplify the business of opening nested folders, such as Favorites → Links → Search Engines.

Late-Breaking News

As this book was going to press, Google was announcing new features nearly every day. The company is growing fast, and it’s quickly expanding from a search service into a much broader Web-based information and communications company.

Now, in addition to helping you find news on Britney Spears and holiday gifts for your postal carrier, it also lets you talk with your friends (Google Talk), post content of various kinds (Google Base), analyze your Web site traffic (Google Analytics), and much, much more. Cool new services appear in Google daily, and the interfaces for Gmail and other core Google services change constantly. Google is becoming a major part of daily life and big business. (The fact that many Google features now require you to have a free account with the company is one sign of its shift from search tool to info-comm provider.) It’s exciting stuff.

But frankly, it’s maddening for book publishers trying to cover Google. Just as a book on the service comes off the press, Google updates the look of its homepage, making a bunch of figures slightly out of date. Or the company offers a new feature after the book is already on bookstore shelves.

The book in your hands is no exception. It covers nearly all of Google’s services, and it absolutely helps you get the most out of Google. But as you read through it, you may discover that Gmail has some new labeling feature that this book doesn’t cover—because the feature didn’t exist at the time of the writing. Or you may find that Google has a brand-new service not discussed in these pages (Google Base is one such service; it came out too late to be included in real detail in this version of the book). Those small omissions are a drag. But they won’t prevent you from gaining great Google skills with this Missing Manual. And if you can’t find something Google-related in this book that you want to learn about, let us know. We’ll add it to the next edition and consider putting it online in the meantime.

About MissingManuals.com

At www.missingmanuals.com, you’ll find news, articles, and updates to the books in this series.

The Web site also offers corrections and updates to this book (to see them, click the book’s title, then click Errata). In fact, you’re invited and encouraged to submit such corrections and updates yourself. In an effort to keep the book as up to date and accurate as possible, each time we print more copies of this book, we’ll make any confirmed corrections you’ve suggested. We’ll also note such changes on the Web site, so that you can mark important corrections into your own copy of the book, if you like.

In the meantime, we’d love to hear your own suggestions for new books in the Missing Manual line. There’s a place for that on the Web site, too, as well as a place to sign up for free email notification of new titles in the series.

Safari® Enabled

When you see a Safari® Enabled icon on the cover of your favorite technology book, that means the book is available online through the O’Reilly Network Safari Bookshelf.

Safari offers a solution that’s better than e-books. It’s a virtual library that lets you easily search thousands of top tech books, cut and paste code samples, download chapters, and find quick answers when you need the most accurate, current information. Try it for free at http://safari.oreilly.com.
