The problem of finding one string within another comes up quite often: Searching through files on disk, DNA searches, and even Google rely on strategies for efficiently searching through text. If you've ever used a word processor or text editor or even the editor used for writing code, you have at some stage or another performed a string search. You may know it as the
There are many string searching algorithms — and no doubt many more will be discovered over time — each with its own optimizations for handling specific types of data. Some algorithms work better for plain text, while others work better for text and/or patterns containing a lot of repetition, such as DNA fragments.
This chapter covers two algorithms for plain-text searching. We start with an obvious brute-force algorithm and move on to the more sophisticated Boyer-Moore. Each is described in detail, and then you will see how a relatively simple twist on the brute-force approach enables the Boyer-Moore algorithm to perform significantly faster.
After reading this chapter you should be able to do the following:
Describe and implement a brute-force string searching algorithm
Describe and implement the Boyer-Moore string searching algorithm
Understand the performance characteristics of each algorithm
Describe and implement a generic string match iterator
Describe and implement a simple file searching application
Because we want to be able to implement ...