DESIGN & IMPLEMENTATION OF A REAL-TIME, SPEAKER-INDEPENDENT, CONTINUOUS SPEECH RECOGNITION SYSTEM WITH VLIW DIGITAL SIGNAL PROCESSOR ARCHITECTURE
This thesis explores the feasibility of mapping a real-time, continuous speech recognition system onto a multi-core Digital Signal Processor (DSP) architecture. A pure hardware solution can implement the entire recognition process in real time, but its design process can be lengthy and inflexible to changes. A low-end embedded processor such as the ARM7, on the other hand, is not powerful enough to run the recognition process in real time. As a result, a more flexible and powerful DSP solution based on Texas Instruments' C6713 multi-core DSP is used to exploit the instruction-level parallelism within the speech recognition process. By exploiting this parallelism with seven optimization techniques, the recognition process can run in real time on a 300 MHz DSP for a 1000-word vocabulary. At its core, continuous speech recognition is a matching problem. The recognition process can be divided into four major phases: Feature Extraction, Acoustic Modeling, Phone Modeling and Word Modeling. Each phase is analyzed in detail to identify performance issues. In short, the major issues are the massive amount of computation and the large memory bandwidth required. After applying various optimizations, the overall computational performance improved from about 15 times slower than real time to 1.6 times faster than real time on the hardware. The memory bandwidth problem can be solved through the use of Direct Memory Access and a larger cache memory. The conclusion is that a multi-core DSP running at 300 MHz would be sufficient to implement a 1000-word Command & Control type application using the optimization techniques described in this thesis.
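As a rough illustration of the performance figures quoted above, the sketch below expresses both measurements as a real-time factor, meaning processing time divided by audio duration, and derives the overall speedup they imply. The metric and the function names are assumptions made for illustration; they are not taken from the thesis.

```python
def real_time_factor(processing_seconds, audio_seconds):
    """Processing time per second of audio; values below 1.0 mean faster than real time."""
    return processing_seconds / audio_seconds


def speedup(rtf_before, rtf_after):
    """How many times faster the optimized system is than the baseline."""
    return rtf_before / rtf_after


# Figures from the abstract, restated as real-time factors (hypothetical framing).
rtf_before = real_time_factor(15.0, 1.0)  # about 15 times slower than real time
rtf_after = real_time_factor(1.0, 1.6)    # 1.6 times faster than real time

overall = speedup(rtf_before, rtf_after)
print(f"RTF before: {rtf_before:.3f}, RTF after: {rtf_after:.3f}, speedup: {overall:.1f}x")
```

Under this reading, the optimizations correspond to roughly a 24-fold reduction in processing time per second of audio.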
Advisor: Raymond R. Hoare; Steven P. Levitan; Alex K. Jones
School: University of Pittsburgh
School Location: USA - Pennsylvania
Source Type: Master's Thesis
Date of Publication: 06/13/2007