Greasemonkey Hacks by Mark Pilgrim

Hack #48. Remove Spammy Domains from Search Results

Fight back against search engine spammers who register domains with multiple "hot" keywords separated by hyphens.

Google and other search engines are engaged in an ongoing arms race against spammers, who use every conceivable trick to attain top placement for lucrative search keywords. One such trick is to register a domain name with the keywords themselves, such as buy-cheap-prescription-drugs-online.com. (I just made that up, although I wouldn't be the slightest bit surprised if it already existed. In fact, I would be surprised if it didn't.) Recently, Google has cracked down on such techniques, but some spammy domains still show up in search results.

Think of the web sites you visit on a regular basis. I'll bet that none of them contains more than one hyphen. In fact, the only time I ever see multi-hyphen domain names is when a spammer is one step ahead of Google and manages to get his site listed in the results. (I don't buy cheap prescription drugs online, but I did need to refinance my home last year. Search engine results were so overwhelmed with spam, I almost broke down and used a phone book.)

The Code

This user script removes Google search results where the domain contains more than one hyphen. Once again, the bulk of the logic is contained in the XPath query. This is tricky for two reasons. First, we need to count the number of instances of a particular character in a string, and XPath doesn't have a native function to ...
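
The excerpt breaks off before the script itself, so what follows is only a minimal sketch of the same idea, not the book's code. It counts hyphens in plain JavaScript rather than inside an XPath query, and the names countHyphens, hasMultiHyphenDomain, and hideSpammyResults are hypothetical. It also assumes that hiding the link element is enough; a real script would more likely need to find and hide the enclosing search result, which depends on Google's markup.

```javascript
// Hypothetical helpers for illustration; none of these names come from the book.

// Count the hyphens in a hostname by splitting on "-".
function countHyphens(hostname) {
  return hostname.split('-').length - 1;
}

// Return true when the URL's hostname contains more than one hyphen.
// Hyphens in the path or query string are ignored.
function hasMultiHyphenDomain(href) {
  let hostname;
  try {
    hostname = new URL(href).hostname;
  } catch (e) {
    return false; // not an absolute URL, so leave it alone
  }
  return countHyphens(hostname) > 1;
}

// Hide every link whose domain looks spammy.
// Assumption: hiding the link itself is sufficient for this sketch.
function hideSpammyResults(doc) {
  for (const link of doc.querySelectorAll('a[href]')) {
    if (hasMultiHyphenDomain(link.href)) {
      link.style.display = 'none';
    }
  }
}

// Only touch the page when running in a browser.
if (typeof document !== 'undefined') {
  hideSpammyResults(document);
}
```

Doing the count in JavaScript trades the single-query approach the text describes for readability: the hostname is parsed once with the standard URL constructor, so a hyphen-heavy path on an otherwise ordinary domain does not trigger the filter.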
