A Search Engine’s Web API

We’ll talk about three different search engines in this chapter: Excite, SWISH-E, and Microsoft Index Server. Each packages its functionality in a different way. The Excite engine is command-line driven but wrapped in a layer of Perl. That Perl layer is generated by a web application, which you use to customize your web interface to Excite. SWISH-E is a plain command-line program. To integrate it into your site, you have to script your own wrapper around it or use one of the canned wrappers available for it. Microsoft Index Server, unlike the other two, has no command-line interface. It’s a Dynamic-Link Library (DLL) that works closely with Internet Information Server. To customize its web interface, you write proprietary scripts and templates.

Once integrated into a site, though, these engines are more alike than different. And, crucially for our purposes here, they’re all plug-compatible with one another. How can that be? Indexing strategies and query languages aside, every search engine returns a set of URLs in response to a query. One way or another, you are guaranteed to be able to intercept and process that set of URLs. In some cases, you can wrap your own scripts around the engine itself or around the default scripts that come with it. If that’s not possible, because (as with Microsoft Index Server) the engine runs as a deeply intertwined extension of a web server, you can still wrap it using a web-client script.
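To make the idea of intercepting the result set concrete, here is a minimal sketch in Python rather than the Perl this chapter otherwise discusses. It assumes the engine's results page is ordinary HTML in which each hit appears as an anchor tag. The names `LinkCollector` and `extract_result_urls`, and the sample markup, are hypothetical and not taken from any of the three engines. A real wrapper would also need to filter out navigation links, which depends on the specific engine's output.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Turn relative links into absolute URLs.
                self.urls.append(urljoin(self.base_url, value))


def extract_result_urls(html, base_url):
    """Return the URLs linked from a results page, in document order."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    collector.close()
    return collector.urls


# Hypothetical results page; a web-client script would fetch this text
# from the search engine's web interface instead.
sample = """
<html><body>
<a href="/docs/intro.html">Intro</a>
<a href="http://example.com/faq.html">FAQ</a>
</body></html>
"""
print(extract_result_urls(sample, "http://example.com/search"))
```

Because the function takes the page text as a string, the same code works whether the HTML came from a script wrapped around a command-line engine or from an HTTP request made to a server-embedded one.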

Web-Client Scripting

You can write web-client ...
