Docbase Search in Perspective

Effective search isn’t only, or even mostly, a matter of choosing the “right” search engine. Use whatever comes to hand. Use multiple engines, even. It’s not the engines alone that deliver high-quality search results. What matters more is the instrumentation you build into the data sets that you index and into the filters that use the instrumentation to intelligently organize raw search results.

Don’t neglect field indexing. It’s a powerful technique that’s rarely applied. There’s a world of difference between a document that mentions IBM and a document that is of type PressRelease and whose subject company is IBM.
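To make the distinction concrete, here is a minimal sketch that contrasts a full-text match with a fielded match. The `Doc` record, its `doctype` and `company` fields, and the sample documents are all hypothetical, invented for illustration; a real search engine would apply the same idea through its own field-indexing features.

```python
from dataclasses import dataclass

@dataclass
class Doc:
    url: str
    doctype: str   # hypothetical field, e.g. "PressRelease"
    company: str   # hypothetical field: the document's subject company
    body: str

# Invented sample docbase, for illustration only.
DOCS = [
    Doc("/pr/1.html", "PressRelease", "IBM", "IBM announces a new server line."),
    Doc("/analysis/2.html", "Analysis", "Microsoft", "Microsoft competes with IBM in middleware."),
    Doc("/pr/3.html", "PressRelease", "Netscape", "Netscape ships a browser update."),
]

def fulltext_search(docs, term):
    """Return every document whose body merely mentions the term."""
    term = term.lower()
    return [d for d in docs if term in d.body.lower()]

def field_search(docs, **fields):
    """Return documents whose named fields all equal the given values."""
    return [d for d in docs
            if all(getattr(d, name) == value for name, value in fields.items())]

mentions_ibm = fulltext_search(DOCS, "IBM")
ibm_press_releases = field_search(DOCS, doctype="PressRelease", company="IBM")
```

In this sample the full-text query also returns the Microsoft analysis, because it mentions IBM in passing, while the fielded query returns only the press release that is actually about IBM.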

Remember that the metadata strategy outlined in Chapter 6 works for you in several ways. It enables the kinds of navigational tools we saw in Chapter 7, and it also enables the kinds of smart search results we’ve seen in this chapter. Of course, <meta> tags are just one form of useful metadata. You could equally well keep the same information in an SQL database.
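As an illustration of how <meta> tags can feed an indexer or a results filter, the following sketch pulls name/content pairs out of an HTML page with Python's standard `html.parser` module. The field names `doctype` and `company` and the sample page are assumptions made for this example, not the tag set defined in Chapter 6.

```python
from html.parser import HTMLParser

class MetaExtractor(HTMLParser):
    """Collect <meta name="..." content="..."> pairs from an HTML page."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name, content = attr.get("name"), attr.get("content")
        if name and content is not None:
            self.fields[name.lower()] = content

def extract_meta(html):
    """Return a dict mapping lowercased meta names to their content values."""
    parser = MetaExtractor()
    parser.feed(html)
    parser.close()
    return parser.fields

# Hypothetical page; the meta names are illustrative only.
SAMPLE = """<html><head>
<title>IBM Announces New Server</title>
<meta name="doctype" content="PressRelease">
<meta name="company" content="IBM">
</head><body>...</body></html>"""

fields = extract_meta(SAMPLE)
```

The resulting dictionary could just as easily be written to a row in an SQL table, which is the alternative storage the paragraph above mentions.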

Even if you don’t store and use extra fields, don’t neglect the two most basic forms of docbase metadata: URLs and HTML document titles. Search engines always return these two elements. Apply a disciplined standard to both namespaces, and you’ll be able to do a much better-than-average job of organizing search results.
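Here is one way such a discipline might pay off, sketched under stated assumptions. It supposes a hypothetical URL convention of `/docbase/<doctype>/<company>/<file>` and a hypothetical title convention of `Type | Company | Headline`; neither is prescribed by this chapter. Given only the URL and title that a search engine returns, a filter can then group raw hits without any extra fields.

```python
from collections import defaultdict
from urllib.parse import urlparse

# Invented raw results: (URL, HTML title) pairs as a search engine might return them.
RESULTS = [
    ("http://example.com/docbase/pressrelease/ibm/1999-03-01.html",
     "PressRelease | IBM | New server line"),
    ("http://example.com/docbase/analysis/ibm/1999-03-05.html",
     "Analysis | IBM | Middleware outlook"),
    ("http://example.com/docbase/pressrelease/ibm/1999-04-12.html",
     "PressRelease | IBM | Quarterly results"),
]

def group_results(results):
    """Group headlines by the (doctype, company) pair encoded in each URL."""
    groups = defaultdict(list)
    for url, title in results:
        parts = urlparse(url).path.strip("/").split("/")
        if len(parts) >= 4 and parts[0] == "docbase":
            key = (parts[1], parts[2])
        else:
            # URL does not follow the assumed convention.
            key = ("unclassified", "")
        # The headline is the last segment of the assumed title convention.
        headline = title.split(" | ")[-1]
        groups[key].append(headline)
    return dict(groups)

grouped = group_results(RESULTS)
```

Because the grouping keys come straight from the URL namespace, the filter works with any engine that returns URLs and titles, which is the point of standardizing both.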
