Turn the Web into the ultimate cross-referenced library.
Stefan Magdalinski of Whitelabel.org (http://www.whitelabel.org) created a bit of a stir with his WikiProxy, which added links from the BBC's news articles to the matching pages in the online encyclopedia Wikipedia (http://www.wikipedia.org). The proxy worked by reading in a BBC page, extracting candidates for linking using specially tailored regular expressions, and then comparing these candidates to a list of phrases from the Wikipedia database.
This raises the possibility of extending this functionality beyond the BBC site. It's not feasible to proxy the entire Web (unless you're Google), but it sounds like a perfect task for a Greasemonkey script. One big problem: you need to check the term candidates against the Wikipedia database, which weighs in at a hefty 18 megabytes for the article titles alone.
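The extract-and-compare approach described above can be sketched in a few lines of JavaScript. This is only an illustration, not the WikiProxy's actual code: the regular expression (runs of two or more capitalized words), the function names, and the tiny `knownTitles` set standing in for the multi-megabyte Wikipedia title list are all assumptions made for the example.

```javascript
// Hypothetical candidate extractor: treats any run of two or more
// capitalized words as a possible article title. The real WikiProxy
// used its own specially tailored expressions, which are not shown here.
function extractCandidates(text) {
  const pattern = /\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)+\b/g;
  const matches = text.match(pattern) || [];
  // Remove duplicates while keeping first-seen order.
  return [...new Set(matches)];
}

// Stand-in for the list of Wikipedia article titles (an assumption;
// the real list is far too large to ship inside a user script).
const knownTitles = new Set(["Tony Blair", "House of Commons"]);

// Keep only the candidates that appear in the title list.
function filterKnown(candidates, titles) {
  return candidates.filter((candidate) => titles.has(candidate));
}

const sample =
  "Tony Blair spoke to reporters, and later Tony Blair met Gordon Brown in London.";
const candidates = extractCandidates(sample);
const linkable = filterKnown(candidates, knownTitles);

console.log(candidates); // [ 'Tony Blair', 'Gordon Brown' ]
console.log(linkable);   // [ 'Tony Blair' ]
```

The extraction step is cheap and can run anywhere; it is the membership test against the full title list that is impractical to do inside the browser.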
Stefan, author of the original WikiProxy, has kindly agreed to make the Wikipedia term lookup accessible as a web service. This hack uses his web service to look up possible Wikipedia entries and adds links to the current page based on the keyword lookup.
This script contacts a central server on every page load to look up candidate terms taken from the page you are viewing, which presents a privacy risk.
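Once the lookup has returned the terms that have Wikipedia entries, adding the links is mostly string work. The sketch below is a simplified illustration, not the script's actual code: it operates on a plain string rather than on DOM text nodes, the function names are hypothetical, and the `https://en.wikipedia.org/wiki/` URL pattern with underscores for spaces is assumed for the English Wikipedia.

```javascript
// Build an article URL from a title (assumed English Wikipedia URL pattern).
function wikipediaUrl(title) {
  return "https://en.wikipedia.org/wiki/" +
    encodeURIComponent(title.replace(/ /g, "_"));
}

// Escape characters that have special meaning in a regular expression.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Wrap every occurrence of a matched term in a link.
// Assumes `text` is plain text with no HTML markup of its own.
function linkTerms(text, terms) {
  if (terms.length === 0) {
    return text;
  }
  // Try longer terms first so a long phrase wins over a shorter one
  // that starts at the same position.
  const sorted = [...terms].sort((a, b) => b.length - a.length);
  const pattern = new RegExp(
    "\\b(?:" + sorted.map(escapeRegExp).join("|") + ")\\b", "g");
  return text.replace(pattern,
    (match) => `<a href="${wikipediaUrl(match)}">${match}</a>`);
}

const linked = linkTerms("Tony Blair met Gordon Brown.", ["Tony Blair"]);
console.log(linked);
// <a href="https://en.wikipedia.org/wiki/Tony_Blair">Tony Blair</a> met Gordon Brown.
```

A real user script would walk the page's text nodes and insert anchor elements rather than rewriting HTML strings, so that existing markup and event handlers are left intact.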
This user script runs on all pages. It is quite complex, but it breaks down into five steps:
The first section defines several variables, including various versions of the Wikipedia icons to label the new links, ...