15.1 INTRODUCTION

String matching is employed in several applications, such as packet classification, computational biology, spam blocking, and information retrieval. String search operates on a given alphabet set Σ of size |Σ|, a pattern P = p0p1pm−1 of length m, and a text string T = t0t1tn−1 of length n, with mn. The problem is to find all occurrences of the pattern P in the text string T. The average time complexity for implementing the string search problem on a single processor was proven to be O(n) [99]. We refer the reader to Reference 100 for a comprehensive review of the different hardware implementations of the string matching problem.

A hardware implementation for the search engine can be assumed to have the following characteristics:

  • The text length n is typically big and variable.
  • The pattern length m varies from a word of few characters to hundreds of characters (e.g., a URL address).
  • The word length w is determined by the data storage organization and datapath bus width.
  • Typically, the search engine is looking for the existence of the pattern P in the text T; that is, the search engine only locates the first occurrence of the P in T.
  • The text string T is supplied to the hardware in word serial format.

Get Algorithms and Parallel Computing now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.