Speech and Audio Signal Processing: Processing and Perception of Speech and Music, Second Edition by Dan Ellis, Nelson Morgan, Ben Gold

CHAPTER 24

DETERMINISTIC SEQUENCE RECOGNITION FOR ASR

24.1 INTRODUCTION

In the past few chapters, we have established the basics for understanding the static pattern-classification aspect of speech recognition. Two choices are central:

  1. Signal representation: in most ASR systems, some function of the local short-term spectrum is used. Typically, this consists of cepstral parameters corresponding to a smoothed spectrum. These parameters are computed every 10 ms or so from a Hamming-windowed speech segment that is 20–30 ms in length. Each of these temporal steps is referred to as a frame.
  2. Classes: in most current systems, the categories that are associated with the short-term signal spectra are phones or subphones, as noted in Chapter 23. In some systems, though, the classes simply consist of implicit categories associated with the training data.
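
The framing step in item 1 can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the front end of any particular recognizer: the function names (`frame_signal`, `real_cepstrum`), the 8 kHz sample rate, the 25 ms window (one value in the 20–30 ms range above), and the choice of 13 coefficients are all assumptions made for the example. It also uses a plain real cepstrum, whereas practical systems usually apply a perceptual frequency warping first. Keeping only the low-order coefficients is what corresponds to a smoothed spectrum.

```python
import cmath
import math


def frame_signal(samples, sample_rate, win_ms=25.0, hop_ms=10.0):
    """Split samples into overlapping Hamming-windowed frames."""
    win_len = int(round(sample_rate * win_ms / 1000.0))
    hop_len = int(round(sample_rate * hop_ms / 1000.0))
    # Hamming window: 0.54 - 0.46 * cos(2*pi*n / (N - 1))
    window = [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (win_len - 1))
              for n in range(win_len)]
    frames = []
    # One frame every hop_len samples; partial frames at the end are dropped.
    for start in range(0, len(samples) - win_len + 1, hop_len):
        frames.append([samples[start + n] * window[n] for n in range(win_len)])
    return frames


def real_cepstrum(frame, num_coeffs=13):
    """Low-order real cepstrum: inverse DFT of the log magnitude spectrum."""
    n = len(frame)
    # Naive O(n^2) DFT, adequate for a short illustration.
    spectrum = [sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))
                for k in range(n)]
    # Small floor avoids log(0) for empty spectral bins.
    log_mag = [math.log(abs(x) + 1e-10) for x in spectrum]
    # log_mag is real and even, so the inverse DFT reduces to a cosine sum.
    return [sum(log_mag[k] * math.cos(2.0 * math.pi * k * q / n)
                for k in range(n)) / n
            for q in range(num_coeffs)]


# Synthetic input (an assumption for the demo): 100 ms of a 440 Hz tone.
sample_rate = 8000
samples = [math.sin(2.0 * math.pi * 440.0 * t / sample_rate)
           for t in range(sample_rate // 10)]

frames = frame_signal(samples, sample_rate)
features = [real_cepstrum(f) for f in frames]
```

With these settings each frame holds 200 samples and a new frame starts every 80 samples, so consecutive frames overlap by more than half. Each entry of `features` is the per-frame vector that a classifier would see.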

Given these choices, one can use any of the techniques described in Chapter 8 to train deterministic classifiers (e.g., minimum distance, linear discriminant functions, or neural networks) that assign each signal segment to one of the classes. However, as noted earlier, speech recognition includes both pattern classification and sequence recognition: recognizing a string of linguistic units from the sequence of segment spectra requires finding the best match overall, not just locally. This would not be so much of a problem if the local match were always ...
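
As a rough illustration of the purely local decision described above, the following sketch applies a minimum-distance classifier to each frame independently. The class labels, the two-dimensional centroids, and the frame vectors are all hypothetical values chosen for the example. Real feature vectors would have more dimensions, and the centroids would come from training data.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_class(frame_vec, centroids):
    """Return the label of the centroid closest to frame_vec."""
    return min(centroids, key=lambda label: euclidean(frame_vec, centroids[label]))


# Hypothetical class centroids (one per phone-like class).
centroids = {"aa": [1.0, 0.0], "iy": [0.0, 1.0], "s": [-1.0, -1.0]}

# Hypothetical per-frame feature vectors, in time order.
frame_vectors = [[0.9, 0.1], [0.8, 0.2], [0.4, 0.6], [0.1, 0.9], [-0.9, -0.8]]

# Each frame is labeled on its own; no information is shared across frames.
labels = [nearest_class(v, centroids) for v in frame_vectors]
```

Each label here depends only on its own frame, so a single ambiguous frame (such as the third one, which lies between two centroids) is decided without regard to its neighbors. That is the local match; choosing the best label sequence overall is a separate problem.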
