This landmark book offers a balanced discussion of both the mathematical theory of digital speech signal processing and critical contemporary applications. Authors John R. Deller, John H. L. Hansen and John G. Proakis provide a comprehensive view of all major modern speech processing areas: speech production physiology and modeling, signal analysis techniques, coding, enhancement, quality assessment, and recognition. You will learn the principles needed to understand advanced technologies in speech processing, from speech coding for communications systems to biomedical applications of speech analysis and recognition. Ideal for self-study or as a course text, this far-reaching reference book offers an extensive historical context for concepts under discussion, end-of-chapter problems, and practical algorithms.

Table of Contents
Preface to the IEEE Edition
Preface
Acronyms and Abbreviations
Signal Processing Background
Propaedeutic
Preamble - The Purpose of Chapter 1
- Please Read This Note on Notation
- For People Who Never Read Chapter 1 (and Those Who Do)
Review of DSP Concepts and Notation - "Normalized Time and Frequency"
- Singularity Signals
- Energy and Power Signals
- Transforms and a Few Related Concepts
- Windows and Frames
- Discrete-Time Systems
- Minimum, Maximum, and Mixed-Phase Signals and Systems
Review of Probability and Stochastic Processes - Probability Spaces
- Random Variables
- Random Processes
- Vector-Valued Random Processes
Topics in Statistical Pattern Recognition - Distance Measures
- The Euclidean Metric and "Prewhitening" of Features
- Maximum Likelihood Classification
- Feature Selection and Probabilistic Separability Measures
- Clustering Algorithms
Information and Entropy - Definitions
- Random Sources
- Entropy Concepts in Pattern Recognition
Phasors and Steady-State Solutions
Onward to Speech Processing
Problems
Appendices: Supplemental Bibliography
Example Textbooks on Digital Signal Processing
Example Textbooks on Stochastic Processes
Example Textbooks on Statistical Pattern Recognition
Example Textbooks on Information Theory
Other Resources on Speech Processing - Textbooks
- Edited Paper Collections
- Journals
- Conference Proceedings
Example Textbooks on Speech and Hearing Sciences
Other Resources on Artificial Neural Networks - Textbooks and Monographs
- Journals
- Conference Proceedings
Speech Production and Modeling
Fundamentals of Speech Science
Preamble
Speech Communication
Anatomy and Physiology of the Speech Production System - Anatomy
- The Role of the Vocal Tract and Some Elementary Acoustical Analysis
- Excitation of the Speech System and the Physiology of Voicing
Phonemics and Phonetics - Phonemes Versus Phones
- Phonemic and Phonetic Transcription
- Phonemic and Phonetic Classification
- Prosodic Features and Coarticulation
Conclusions
Problems
Modeling Speech Production
Preamble
Acoustic Theory of Speech Production - History
- Sound Propagation
- Source Excitation Model
- Vocal-Tract Modeling
- Models for Nasals and Fricatives
Discrete-Time Modeling - General Discrete-Time Speech Model
- A Discrete-Time Filter Model for Speech Production
- Other Speech Models
Conclusions
Problems
Single Lossless Tube Analysis - Open and Closed Terminations
- Impedance Analysis, T-Network, and Two-Port Network
Two-Tube Lossless Model of the Vocal Tract
Fast Discrete-Time Transfer Function Calculation
Analysis Techniques
Short-Term Processing of Speech
Introduction
Short-Term Measures from Long-Term Concepts - Motivation
- "Frames" of Speech
- Approach 1 to the Derivation of a Short-Term Feature and Its Two Computational Forms
- Approach 2 to the Derivation of a Short-Term Feature and Its Two Computational Forms
- On the Role of "1/N" and Related Issues
Example Short-Term Features and Applications - Short-Term Estimates of Autocorrelation
- Average Magnitude Difference Function
- Zero Crossing Measure
- Short-Term Power and Energy Measures
- Short-Term Fourier Analysis
Conclusions
Problems
Linear Prediction Analysis
Preamble
Long-Term LP Analysis by System Identification - The All-Pole Model
- Identification of the Model
How Good Is the LP Model? - The "Ideal" and "Almost Ideal" Cases
- "Nonideal" Cases
- Summary and Further Discussion
Short-Term LP Analysis - Autocorrelation Method
- Covariance Method
- Solution Methods
- Gain Computation
- A Distance Measure for LP Coefficients
- Preemphasis of the Speech Waveform
Alternative Representations of the LP Coefficients - The Line Spectrum Pair
- Cepstral Parameters
Applications of LP in Speech Analysis - Pitch Estimation
- Formant Estimation and Glottal Waveform Deconvolution
Conclusions
Problems
Proof of Theorem 5.1
The Orthogonality Principle
Cepstral Analysis
Introduction
"Real" Cepstrum - Long-Term Real Cepstrum
- Short-Term Real Cepstrum
- Example Applications of the stRC to Speech Analysis and Recognition
- Other Forms and Variations on the stRC Parameters
Complex Cepstrum - Long-Term Complex Cepstrum
- Short-Term Complex Cepstrum
- Example Application of the stCC to Speech Analysis
- Variations on the Complex Cepstrum
A Critical Analysis of the Cepstrum and Conclusions
Problems
Coding, Enhancement and Quality Assessment
Speech Coding and Synthesis
Introduction
Optimum Scalar and Vector Quantization - Scalar Quantization
- Vector Quantization
Waveform Coding - Introduction
- Time Domain Waveform Coding
- Frequency Domain Waveform Coding
- Vector Waveform Quantization
Vocoders - The Channel Vocoder
- The Phase Vocoder
- The Cepstral (Homomorphic) Vocoder
- Formant Vocoders
- Linear Predictive Coding
- Vector Quantization of Model Parameters
Measuring the Quality of Speech Compression Techniques
Conclusions
Problems
Quadrature Mirror Filters
Speech Enhancement
Introduction
Classification of Speech Enhancement Methods
Short-Term Spectral Amplitude Techniques - Introduction
- Spectral Subtraction
- Summary of Short-Term Spectral Magnitude Methods
Speech Modeling and Wiener Filtering - Introduction
- Iterative Wiener Filtering
- Speech Enhancement and All-Pole Modeling
- Sequential Estimation via EM Theory
- Constrained Iterative Enhancement
- Further Refinements to Iterative Enhancement
- Summary of Speech Modeling and Wiener Filtering
Adaptive Noise Canceling - Introduction
- ANC Formalities and the LMS Algorithm
- Applications of ANC
- Summary of ANC Methods
Systems Based on Fundamental Frequency Tracking - Introduction
- Single-Channel ANC
- Adaptive Comb Filtering
- Harmonic Selection
- Summary of Systems Based on Fundamental Frequency Tracking
Performance Evaluation - Introduction
- Enhancement and Perceptual Aspects of Speech
- Speech Enhancement Algorithm Performance
Conclusions
Problems
The INTEL System
Addressing Cross-Talk in Dual-Channel ANC
Speech Quality Assessment
Introduction - The Need for Quality Assessment
- Quality Versus Intelligibility
Subjective Quality Measures - Intelligibility Tests
- Quality Tests
Objective Quality Measures - Articulation Index
- Signal-to-Noise Ratio
- Itakura Measure
- Other Measures Based on LP Analysis
- Weighted-Spectral Slope Measures
- Global Objective Measures
- Example Applications
Objective Versus Subjective Measures
Problems
Recognition
The Speech Recognition Problem
Introduction - The Dream and the Reality
- Discovering Our Ignorance
- Circumventing Our Ignorance
The "Dimensions of Difficulty" - Speaker-Dependent Versus Speaker-Independent Recognition
- Vocabulary Size
- Isolated-Word Versus Continuous-Speech Recognition
- Linguistic Constraints
- Acoustic Ambiguity and Confusability
- Environmental Noise
Related Problems and Approaches - Knowledge Engineering
- Speaker Recognition and Verification
Conclusions
Problems
Dynamic Time Warping
Introduction
Dynamic Programming
Dynamic Time Warping Applied to IWR - DTW Problem and Its Solution Using DP
- DTW Search Constraints
- Typical DTW Algorithm: Memory and Computational Requirements
DTW Applied to CSR - Introduction
- Level Building
- The One-Stage Algorithm
- A Grammar-Driven Connected-Word Recognition System
- Pruning and Beam Search
- Summary of Resource Requirements for DTW Algorithms
Training Issues in DTW Algorithms
Conclusions
Problems
The Hidden Markov Model
Introduction
Theoretical Developments - Generalities
- The Discrete Observation HMM
- The Continuous Observation HMM
- Inclusion of State Duration Probabilities in the Discrete Observation HMM
- Scaling the Forward-Backward Algorithm
- Training with Multiple Observation Sequences
- Alternative Optimization Criteria in the Training of HMMs
- A Distance Measure for HMMs
Practical Issues - Acoustic Observations
- Model Structure and Size
- Training with Insufficient Data
- Acoustic Units Modeled by HMMs
First View of Recognition Systems Based on HMMs - Introduction
- IWR Without Syntax
- CSR by the Connected-Word Strategy Without Syntax
- Preliminary Comments on Language Modeling Using HMMs
Problems
Language Modeling
Introduction
Formal Tools for Linguistic Processing - Formal Languages
- Perplexity of a Language
- Bottom-Up Versus Top-Down Parsing
HMMs, Finite-State Automata, and Regular Grammars
A "Bottom-Up" Parsing Example
Principles of "Top-Down" Recognizers - Focus on the Linguistic Decoder
- Focus on the Acoustic Decoder
- Adding Levels to the Linguistic Decoder
- Training the Continuous-Speech Recognizer
Other Language Models - N-Gram Statistical Models
- Other Formal Grammars
IWR As "CSR"
Standard Databases for Speech-Recognition Research
A Survey of Language-Model-Based Systems
Conclusions
Problems
The Artificial Neural Network
Introduction
The Artificial Neuron
Network Principles and Paradigms - Introduction
- Layered Networks: Formalities and Definitions
- The Multilayer Perceptron
- Learning Vector Quantizer
Applications of ANNs in Speech Recognition - Presegmented Speech Material
- Recognizing Dynamic Speech
- ANNs and Conventional Approaches
- Language Modeling Using ANNs
- Integration of ANNs into the Survey Systems of Section 13.9
Conclusions
Problems
Index

Hardcover; 908 pages