Chapter 7. Speech Recognition

In this chapter, we will cover the following recipes:

  • Reading and plotting audio data
  • Transforming audio signals into the frequency domain
  • Generating audio signals with custom parameters
  • Synthesizing music
  • Extracting frequency domain features
  • Building Hidden Markov Models
  • Building a speech recognizer

Introduction

Speech recognition is the process of recognizing and understanding spoken language. The input is audio data, which a speech recognizer processes to extract meaningful information. It has many practical uses, such as voice-controlled devices, transcription of spoken language into text, and security systems.

Speech signals are very versatile in nature. There are ...
