Speech and Audio Signal Processing: Processing and Perception of Speech and Music, Second Edition

Book Description

When Speech and Audio Signal Processing was published in 1999, it stood out from its competition in its breadth of coverage and its accessible, intuition-based style. This book was aimed at individual students and engineers excited about the broad span of audio processing and curious to understand the available techniques. Since then, with the advent of the iPod in 2001, the field of digital audio and music has exploded, leading to a much greater interest in the technical aspects of audio processing.

This Second Edition will update and revise the original book to augment it with new material describing both the enabling technologies of digital music distribution (most significantly the MP3) and a range of exciting new research areas in automatic music content processing (such as automatic transcription, music similarity, etc.) that have emerged in the past five years, driven by the digital music revolution.

New chapter topics include:

  • Psychoacoustic Audio Coding, describing MP3 and related audio coding schemes based on psychoacoustic masking of quantization noise.

  • Music Transcription, including automatically deriving notes, beats, and chords from music signals.

  • Music Information Retrieval, primarily focusing on audio-based genre classification, artist/style identification, and similarity estimation.

  • Audio Source Separation, including multi-microphone beamforming, blind source separation, and the perception-inspired techniques usually referred to as Computational Auditory Scene Analysis (CASA).

Table of Contents

  1. Cover Page
  2. Title Page
  3. Copyright
  4. Dedication
  5. Contents
  6. PREFACE TO THE 2011 EDITION
    1. 0.1 WHY WE CREATED A NEW EDITION
    2. 0.2 WHAT IS NEW
    3. 0.3 A FINAL THOUGHT
  7. CHAPTER 1: INTRODUCTION
    1. 1.1 WHY WE WROTE THIS BOOK
    2. 1.2 HOW TO USE THIS BOOK
    3. 1.3 A CONFESSION
    4. 1.4 ACKNOWLEDGMENTS
    5. BIBLIOGRAPHY
  8. PART I: HISTORICAL BACKGROUND
    1. CHAPTER 2: SYNTHETIC AUDIO: A BRIEF HISTORY
      1. 2.1 VON KEMPELEN
      2. 2.2 THE VODER
      3. 2.3 TEACHING THE OPERATOR TO MAKE THE VODER “TALK”
      4. 2.4 SPEECH SYNTHESIS AFTER THE VODER
      5. 2.5 MUSIC MACHINES
      6. 2.6 EXERCISES
      7. BIBLIOGRAPHY
    2. CHAPTER 3: SPEECH ANALYSIS AND SYNTHESIS OVERVIEW
      1. 3.1 BACKGROUND
      2. 3.2 VOICE-CODING CONCEPTS
      3. 3.3 HOMER DUDLEY (1898–1981)
      4. 3.4 EXERCISES
      5. 3.5 APPENDIX: HEARING OF THE FALL OF TROY
      6. BIBLIOGRAPHY
    3. CHAPTER 4: BRIEF HISTORY OF AUTOMATIC SPEECH RECOGNITION
      1. 4.1 RADIO REX
      2. 4.2 DIGIT RECOGNITION
      3. 4.3 SPEECH RECOGNITION IN THE 1950s
      4. 4.4 THE 1960s
      5. 4.5 1971–1976 ARPA PROJECT
      6. 4.6 ACHIEVED BY 1976
      7. 4.7 THE 1980s IN AUTOMATIC SPEECH RECOGNITION
      8. 4.8 MORE RECENT WORK
      9. 4.9 SOME LESSONS
      10. 4.10 EXERCISES
      11. BIBLIOGRAPHY
    4. CHAPTER 5: SPEECH-RECOGNITION OVERVIEW
      1. 5.1 WHY STUDY AUTOMATIC SPEECH RECOGNITION?
      2. 5.2 WHY IS AUTOMATIC SPEECH RECOGNITION HARD?
      3. 5.3 AUTOMATIC SPEECH RECOGNITION DIMENSIONS
      4. 5.4 COMPONENTS OF AUTOMATIC SPEECH RECOGNITION
      5. 5.5 FINAL COMMENTS
      6. 5.6 EXERCISES
      7. BIBLIOGRAPHY
  9. PART II: MATHEMATICAL BACKGROUND
    1. CHAPTER 6: DIGITAL SIGNAL PROCESSING
      1. 6.1 INTRODUCTION
      2. 6.2 THE Z TRANSFORM
      3. 6.3 INVERSE Z TRANSFORM
      4. 6.4 CONVOLUTION
      5. 6.5 SAMPLING
      6. 6.6 LINEAR DIFFERENCE EQUATIONS
      7. 6.7 FIRST-ORDER LINEAR DIFFERENCE EQUATIONS
      8. 6.8 RESONANCE
      9. 6.9 CONCLUDING COMMENTS
      10. 6.10 EXERCISES
      11. BIBLIOGRAPHY
    2. CHAPTER 7: DIGITAL FILTERS AND DISCRETE FOURIER TRANSFORM
      1. 7.1 INTRODUCTION
      2. 7.2 FILTERING CONCEPTS
      3. 7.3 TRANSFORMATIONS FOR DIGITAL FILTER DESIGN
      4. 7.4 DIGITAL FILTER DESIGN WITH BILINEAR TRANSFORMATION
      5. 7.5 THE DISCRETE FOURIER TRANSFORM
      6. 7.6 FAST FOURIER TRANSFORM METHODS
      7. 7.7 RELATION BETWEEN THE DFT AND DIGITAL FILTERS
      8. 7.8 EXERCISES
      9. BIBLIOGRAPHY
    3. CHAPTER 8: PATTERN CLASSIFICATION
      1. 8.1 INTRODUCTION
      2. 8.2 FEATURE EXTRACTION
      3. 8.3 PATTERN-CLASSIFICATION METHODS
      4. 8.4 SUPPORT VECTOR MACHINES
      5. 8.5 UNSUPERVISED CLUSTERING
      6. 8.6 CONCLUSIONS
      7. 8.7 EXERCISES
      8. 8.8 APPENDIX: MULTILAYER PERCEPTRON TRAINING
      9. BIBLIOGRAPHY
    4. CHAPTER 9: STATISTICAL PATTERN CLASSIFICATION
      1. 9.1 INTRODUCTION
      2. 9.2 A FEW DEFINITIONS
      3. 9.3 CLASS-RELATED PROBABILITY FUNCTIONS
      4. 9.4 MINIMUM ERROR CLASSIFICATION
      5. 9.5 LIKELIHOOD-BASED MAP CLASSIFICATION
      6. 9.6 APPROXIMATING A BAYES CLASSIFIER
      7. 9.7 STATISTICALLY BASED LINEAR DISCRIMINANTS
      8. 9.8 ITERATIVE TRAINING: THE EM ALGORITHM
      9. 9.9 EXERCISES
      10. BIBLIOGRAPHY
  10. PART III: ACOUSTICS
    1. CHAPTER 10: WAVE BASICS
      1. 10.1 INTRODUCTION
      2. 10.2 THE WAVE EQUATION FOR THE VIBRATING STRING
      3. 10.3 DISCRETE-TIME TRAVELING WAVES
      4. 10.4 BOUNDARY CONDITIONS AND DISCRETE TRAVELING WAVES
      5. 10.5 STANDING WAVES
      6. 10.6 DISCRETE-TIME MODELS OF ACOUSTIC TUBES
      7. 10.7 ACOUSTIC TUBE RESONANCES
      8. 10.8 RELATION OF ACOUSTIC TUBE RESONANCES TO OBSERVED FORMANT FREQUENCIES
      9. 10.9 EXERCISES
      10. BIBLIOGRAPHY
    2. CHAPTER 11: ACOUSTIC TUBE MODELING OF SPEECH PRODUCTION
      1. 11.1 INTRODUCTION
      2. 11.2 ACOUSTIC TUBE MODELS OF ENGLISH PHONEMES
      3. 11.3 EXCITATION MECHANISMS IN SPEECH PRODUCTION
      4. 11.4 EXERCISES
      5. BIBLIOGRAPHY
    3. CHAPTER 12: MUSICAL INSTRUMENT ACOUSTICS
      1. 12.1 INTRODUCTION
      2. 12.2 SEQUENCE OF STEPS IN A PLUCKED OR BOWED STRING INSTRUMENT
      3. 12.3 VIBRATIONS OF THE BOWED STRING
      4. 12.4 FREQUENCY-RESPONSE MEASUREMENTS OF THE BRIDGE OF A VIOLIN
      5. 12.5 VIBRATIONS OF THE BODY OF STRING INSTRUMENTS: MEASUREMENT METHODS
      6. 12.6 RADIATION PATTERN OF BOWED STRING INSTRUMENTS
      7. 12.7 SOME CONSIDERATIONS IN PIANO DESIGN
      8. 12.8 BRIEF DISCUSSION OF THE TRUMPET, TROMBONE, FRENCH HORN, AND TUBA
      9. 12.9 EXERCISES
      10. BIBLIOGRAPHY
    4. CHAPTER 13: ROOM ACOUSTICS
      1. 13.1 INTRODUCTION
      2. 13.2 SOUND WAVES
      3. 13.3 SOUND WAVES IN ROOMS
      4. 13.4 ROOM ACOUSTICS AS A COMPONENT IN SPEECH SYSTEMS
      5. 13.5 EXERCISES
      6. BIBLIOGRAPHY
  11. PART IV: AUDITORY PERCEPTION
    1. CHAPTER 14: EAR PHYSIOLOGY
      1. 14.1 INTRODUCTION
      2. 14.2 ANATOMICAL PATHWAYS FROM THE EAR TO THE PERCEPTION OF SOUND
      3. 14.3 THE PERIPHERAL AUDITORY SYSTEM
      4. 14.4 HAIR CELL AND AUDITORY NERVE FUNCTIONS
      5. 14.5 PROPERTIES OF THE AUDITORY NERVE
      6. 14.6 SUMMARY AND BLOCK DIAGRAM OF THE PERIPHERAL AUDITORY SYSTEM
      7. 14.7 EXERCISES
      8. BIBLIOGRAPHY
    2. CHAPTER 15: PSYCHOACOUSTICS
      1. 15.1 INTRODUCTION
      2. 15.2 SOUND-PRESSURE LEVEL AND LOUDNESS
      3. 15.3 FREQUENCY ANALYSIS AND CRITICAL BANDS
      4. 15.4 MASKING
      5. 15.5 SUMMARY
      6. 15.6 EXERCISES
      7. BIBLIOGRAPHY
    3. CHAPTER 16: MODELS OF PITCH PERCEPTION
      1. 16.1 INTRODUCTION
      2. 16.2 HISTORICAL REVIEW OF PITCH-PERCEPTION MODELS
      3. 16.3 PHYSIOLOGICAL EXPLORATION OF PLACE VERSUS PERIODICITY
      4. 16.4 RESULTS FROM PSYCHOACOUSTIC TESTING AND MODELS
      5. 16.5 SUMMARY
      6. 16.6 EXERCISES
      7. BIBLIOGRAPHY
    4. CHAPTER 17: SPEECH PERCEPTION
      1. 17.1 INTRODUCTION
      2. 17.2 VOWEL PERCEPTION: PSYCHOACOUSTICS AND PHYSIOLOGY
      3. 17.3 THE CONFUSION MATRIX
      4. 17.4 PERCEPTUAL CUES FOR PLOSIVES
      5. 17.5 PHYSIOLOGICAL STUDIES OF TWO VOICED PLOSIVES
      6. 17.6 MOTOR THEORIES OF SPEECH PERCEPTION
      7. 17.7 NEURAL FIRING PATTERNS FOR CONNECTED SPEECH STIMULI
      8. 17.8 CONCLUDING THOUGHTS
      9. 17.9 EXERCISES
      10. BIBLIOGRAPHY
    5. CHAPTER 18: HUMAN SPEECH RECOGNITION
      1. 18.1 INTRODUCTION
      2. 18.2 THE ARTICULATION INDEX AND HUMAN RECOGNITION
      3. 18.3 COMPARISONS BETWEEN HUMAN AND MACHINE SPEECH RECOGNIZERS
      4. 18.4 CONCLUDING THOUGHTS
      5. 18.5 EXERCISES
      6. BIBLIOGRAPHY
  12. PART V: SPEECH FEATURES
    1. CHAPTER 19: THE AUDITORY SYSTEM AS A FILTER BANK
      1. 19.1 INTRODUCTION
      2. 19.2 REVIEW OF FLETCHER'S CRITICAL BAND EXPERIMENTS
      3. 19.3 RELATION BETWEEN THRESHOLD MEASUREMENTS AND HYPOTHESIZED FILTER SHAPES
      4. 19.4 GAMMA-TONE FILTERS, ROEX FILTERS, AND AUDITORY MODELS
      5. 19.5 OTHER CONSIDERATIONS IN FILTER-BANK DESIGN
      6. 19.6 SPEECH SPECTRUM ANALYSIS USING THE FFT
      7. 19.7 CONCLUSIONS
      8. 19.8 EXERCISES
      9. BIBLIOGRAPHY
    2. CHAPTER 20: THE CEPSTRUM AS A SPECTRAL ANALYZER
      1. 20.1 INTRODUCTION
      2. 20.2 A HISTORICAL NOTE
      3. 20.3 THE REAL CEPSTRUM
      4. 20.4 THE COMPLEX CEPSTRUM
      5. 20.5 APPLICATION OF CEPSTRAL ANALYSIS TO SPEECH SIGNALS
      6. 20.6 CONCLUDING THOUGHTS
      7. 20.7 EXERCISES
      8. BIBLIOGRAPHY
    3. CHAPTER 21: LINEAR PREDICTION
      1. 21.1 INTRODUCTION
      2. 21.2 THE PREDICTIVE MODEL
      3. 21.3 PROPERTIES OF THE REPRESENTATION
      4. 21.4 GETTING THE COEFFICIENTS
      5. 21.5 RELATED REPRESENTATIONS
      6. 21.6 CONCLUDING DISCUSSION
      7. 21.7 EXERCISES
      8. BIBLIOGRAPHY
  13. PART VI: AUTOMATIC SPEECH RECOGNITION
    1. CHAPTER 22: FEATURE EXTRACTION FOR ASR
      1. 22.1 INTRODUCTION
      2. 22.2 COMMON FEATURE VECTORS
      3. 22.3 DYNAMIC FEATURES
      4. 22.4 STRATEGIES FOR ROBUSTNESS
      5. 22.5 AUDITORY MODELS
      6. 22.6 MULTICHANNEL INPUT
      7. 22.7 DISCRIMINANT FEATURES
      8. 22.8 DISCUSSION
      9. 22.9 EXERCISES
      10. BIBLIOGRAPHY
    2. CHAPTER 23: LINGUISTIC CATEGORIES FOR SPEECH RECOGNITION
      1. 23.1 INTRODUCTION
      2. 23.2 PHONES AND PHONEMES
      3. 23.3 PHONETIC AND PHONEMIC ALPHABETS
      4. 23.4 ARTICULATORY FEATURES
      5. 23.5 SUBWORD UNITS AS CATEGORIES FOR ASR
      6. 23.6 PHONOLOGICAL MODELS FOR ASR
      7. 23.7 CONTEXT-DEPENDENT PHONES
      8. 23.8 OTHER SUBWORD UNITS
      9. 23.9 PHRASES
      10. 23.10 SOME ISSUES IN PHONOLOGICAL MODELING
      11. 23.11 EXERCISES
      12. BIBLIOGRAPHY
    3. CHAPTER 24: DETERMINISTIC SEQUENCE RECOGNITION FOR ASR
      1. 24.1 INTRODUCTION
      2. 24.2 ISOLATED WORD RECOGNITION
      3. 24.3 CONNECTED WORD RECOGNITION
      4. 24.4 SEGMENTAL APPROACHES
      5. 24.5 DISCUSSION
      6. 24.6 EXERCISES
      7. BIBLIOGRAPHY
    4. CHAPTER 25: STATISTICAL SEQUENCE RECOGNITION
      1. 25.1 INTRODUCTION
      2. 25.2 STATING THE PROBLEM
      3. 25.3 PARAMETERIZATION AND PROBABILITY ESTIMATION
      4. 25.4 CONCLUSION
      5. 25.5 EXERCISES
      6. BIBLIOGRAPHY
    5. CHAPTER 26: STATISTICAL MODEL TRAINING
      1. 26.1 INTRODUCTION
      2. 26.2 HMM TRAINING
      3. 26.3 FORWARD-BACKWARD TRAINING
      4. 26.4 OPTIMAL PARAMETERS FOR EMISSION PROBABILITY ESTIMATORS
      5. 26.5 VITERBI TRAINING
      6. 26.6 LOCAL ACOUSTIC PROBABILITY ESTIMATORS FOR ASR
      7. 26.7 INITIALIZATION
      8. 26.8 SMOOTHING
      9. 26.9 CONCLUSIONS
      10. 26.10 EXERCISES
      11. BIBLIOGRAPHY
    6. CHAPTER 27: DISCRIMINANT ACOUSTIC PROBABILITY ESTIMATION
      1. 27.1 INTRODUCTION
      2. 27.2 DISCRIMINANT TRAINING
      3. 27.3 HMM–ANN BASED ASR
      4. 27.4 OTHER APPLICATIONS OF ANNs TO ASR
      5. 27.5 EXERCISES
      6. 27.6 APPENDIX: POSTERIOR PROBABILITY PROOF
      7. BIBLIOGRAPHY
    7. CHAPTER 28: ACOUSTIC MODEL TRAINING: FURTHER TOPICS
      1. 28.1 INTRODUCTION
      2. 28.2 ADAPTATION
      3. 28.3 LATTICE-BASED MMI AND MPE
      4. 28.4 CONCLUSION
      5. 28.5 EXERCISES
      6. BIBLIOGRAPHY
    8. CHAPTER 29: SPEECH RECOGNITION AND UNDERSTANDING
      1. 29.1 INTRODUCTION
      2. 29.2 PHONOLOGICAL MODELS
      3. 29.3 LANGUAGE MODELS
      4. 29.4 DECODING WITH ACOUSTIC AND LANGUAGE MODELS
      5. 29.5 A COMPLETE SYSTEM
      6. 29.6 ACCEPTING REALISTIC INPUT
      7. 29.7 CONCLUDING COMMENTS
      8. BIBLIOGRAPHY
  14. PART VII: SYNTHESIS AND CODING
    1. CHAPTER 30: SPEECH SYNTHESIS
      1. 30.1 INTRODUCTION
      2. 30.2 CONCATENATIVE METHODS
      3. 30.3 STATISTICAL PARAMETRIC METHODS
      4. 30.4 A HISTORICAL PERSPECTIVE
      5. 30.5 SPECULATION
      6. 30.6 TOOLS AND EVALUATION
      7. 30.7 EXERCISES
      8. 30.8 APPENDIX: SYNTHESIZER EXAMPLES
      9. BIBLIOGRAPHY
    2. CHAPTER 31: PITCH DETECTION
      1. 31.1 INTRODUCTION
      2. 31.2 A NOTE ON NOMENCLATURE
      3. 31.3 PITCH DETECTION, PERCEPTION AND ARTICULATION
      4. 31.4 THE VOICING DECISION
      5. 31.5 SOME DIFFICULTIES IN PITCH DETECTION
      6. 31.6 SIGNAL PROCESSING TO IMPROVE PITCH DETECTION
      7. 31.7 PATTERN-RECOGNITION METHODS FOR PITCH DETECTION
      8. 31.8 SMOOTHING TO FIX ERRORS IN PITCH ESTIMATION
      9. 31.9 NORMALIZING THE AUTOCORRELATION FUNCTION
      10. 31.10 EXERCISES
      11. BIBLIOGRAPHY
    3. CHAPTER 32: VOCODERS
      1. 32.1 INTRODUCTION
      2. 32.2 STANDARDS FOR DIGITAL SPEECH CODING
      3. 32.3 DESIGN CONSIDERATIONS IN CHANNEL VOCODER FILTER BANKS
      4. 32.4 ENERGY MEASUREMENTS IN A CHANNEL VOCODER
      5. 32.5 A VOCODER DESIGN FOR SPECTRAL ENVELOPE ESTIMATION
      6. 32.6 BIT SAVING IN CHANNEL VOCODERS
      7. 32.7 DESIGN OF THE EXCITATION PARAMETERS FOR A CHANNEL VOCODER
      8. 32.8 LPC VOCODERS
      9. 32.9 CEPSTRAL VOCODERS
      10. 32.10 DESIGN COMPARISONS
      11. 32.11 VOCODER STANDARDIZATION
      12. 32.12 EXERCISES
      13. BIBLIOGRAPHY
    4. CHAPTER 33: LOW-RATE VOCODERS
      1. 33.1 INTRODUCTION
      2. 33.2 THE FRAME-FILL CONCEPT
      3. 33.3 PATTERN MATCHING OR VECTOR QUANTIZATION
      4. 33.4 THE KANG–COULTER 600-BPS VOCODER
      5. 33.5 SEGMENTATION METHODS FOR BANDWIDTH REDUCTION
      6. 33.6 EXERCISES
      7. BIBLIOGRAPHY
    5. CHAPTER 34: MEDIUM-RATE AND HIGH-RATE VOCODERS
      1. 34.1 INTRODUCTION
      2. 34.2 VOICE EXCITATION AND SPECTRAL FLATTENING
      3. 34.3 VOICE-EXCITED CHANNEL VOCODER
      4. 34.4 VOICE-EXCITED AND ERROR-SIGNAL-EXCITED LPC VOCODERS
      5. 34.5 WAVEFORM CODING WITH PREDICTIVE METHODS
      6. 34.6 ADAPTIVE PREDICTIVE CODING OF SPEECH
      7. 34.7 SUBBAND CODING
      8. 34.8 MULTIPULSE LPC VOCODERS
      9. 34.9 CODE-EXCITED LINEAR PREDICTIVE CODING
      10. 34.10 REDUCING CODEBOOK SEARCH TIME IN CELP
      11. 34.11 CONCLUSIONS
      12. 34.12 EXERCISES
      13. BIBLIOGRAPHY
    6. CHAPTER 35: PERCEPTUAL AUDIO CODING
      1. 35.1 TRANSPARENT AUDIO CODING
      2. 35.2 PERCEPTUAL MASKING
      3. 35.3 NOISE SHAPING
      4. 35.4 SOME EXAMPLE CODING SCHEMES
      5. 35.5 SUMMARY
      6. 35.6 EXERCISES
      7. BIBLIOGRAPHY
  15. PART VIII: OTHER APPLICATIONS
    1. CHAPTER 36: SOME ASPECTS OF COMPUTER MUSIC SYNTHESIS
      1. 36.1 INTRODUCTION
      2. 36.2 SOME EXAMPLES OF ACOUSTICALLY GENERATED MUSICAL SOUNDS
      3. 36.3 MUSIC SYNTHESIS CONCEPTS
      4. 36.4 ANALYSIS-BASED SYNTHESIS
      5. 36.5 OTHER TECHNIQUES FOR MUSIC SYNTHESIS
      6. 36.6 REVERBERATION
      7. 36.7 SEVERAL EXAMPLES OF SYNTHESIS
      8. 36.8 EXERCISES
      9. ACKNOWLEDGMENT
      10. BIBLIOGRAPHY
    2. CHAPTER 37: MUSIC SIGNAL ANALYSIS
      1. 37.1 THE INFORMATION IN MUSIC AUDIO
      2. 37.2 MUSIC TRANSCRIPTION
      3. 37.3 NOTE TRANSCRIPTION
      4. 37.4 SCORE ALIGNMENT
      5. 37.5 CHORD TRANSCRIPTION
      6. 37.6 STRUCTURE DETECTION
      7. 37.7 CONCLUSION
      8. 37.8 EXERCISES
      9. BIBLIOGRAPHY
    3. CHAPTER 38: MUSIC RETRIEVAL
      1. 38.1 THE MUSIC RETRIEVAL PROBLEM
      2. 38.2 MUSIC FINGERPRINTING
      3. 38.3 QUERY BY HUMMING
      4. 38.4 COVER SONG MATCHING
      5. 38.5 MUSIC CLASSIFICATION AND AUTOTAGGING
      6. 38.6 MUSIC SIMILARITY
      7. 38.7 CONCLUSIONS
      8. 38.8 EXERCISES
      9. BIBLIOGRAPHY
    4. CHAPTER 39: SOURCE SEPARATION
      1. 39.1 SOURCES AND MIXTURES
      2. 39.2 EVALUATING SOURCE SEPARATION
      3. 39.3 MULTI-CHANNEL APPROACHES
      4. 39.4 BEAMFORMING WITH MICROPHONE ARRAYS
      5. 39.5 INDEPENDENT COMPONENT ANALYSIS
      6. 39.6 COMPUTATIONAL AUDITORY SCENE ANALYSIS
      7. 39.7 MODEL-BASED SEPARATION
      8. 39.8 CONCLUSIONS
      9. 39.9 EXERCISES
      10. BIBLIOGRAPHY
    5. CHAPTER 40: SPEECH TRANSFORMATIONS
      1. 40.1 INTRODUCTION
      2. 40.2 TIME-SCALE MODIFICATION
      3. 40.3 TRANSFORMATION WITHOUT EXPLICIT PITCH DETECTION
      4. 40.4 TRANSFORMATIONS IN ANALYSIS-SYNTHESIS SYSTEMS
      5. 40.5 SPEECH MODIFICATIONS IN THE PHASE VOCODER
      6. 40.6 SPEECH TRANSFORMATIONS WITHOUT PITCH EXTRACTION
      7. 40.7 THE SINE TRANSFORM CODER AS A TRANSFORMATION ALGORITHM
      8. 40.8 VOICE MODIFICATION TO EMULATE A TARGET VOICE
      9. 40.9 EXERCISES
      10. BIBLIOGRAPHY
    6. CHAPTER 41: SPEAKER VERIFICATION
      1. 41.1 INTRODUCTION
      2. 41.2 GENERAL DESIGN OF A SPEAKER RECOGNITION SYSTEM
      3. 41.3 EXAMPLE SYSTEM COMPONENTS
      4. 41.4 EVALUATION
      5. 41.5 MODERN RESEARCH CHALLENGES
      6. 41.6 EXERCISES
      7. BIBLIOGRAPHY
    7. CHAPTER 42: SPEAKER DIARIZATION
      1. 42.1 INTRODUCTION
      2. 42.2 GENERAL DESIGN OF A SPEAKER DIARIZATION SYSTEM
      3. 42.3 EXAMPLE SYSTEM COMPONENTS
      4. 42.4 RESEARCH CHALLENGES
      5. 42.5 EXERCISES
      6. BIBLIOGRAPHY
  16. INDEX