Speech

We began the course with computer music as the application. In that context, the students experimented with sampling, aliasing, and filtering. We now switch the application to speech processing to explore filtering further, along with signal quantization, coding, and interpretation.

In week #6, we introduce difference equations as a general framework to implement filters. We demonstrate that the complex exponential is a solution to difference equations. We cover the damped, oscillating, and underdamped cases by showing how the complex exponential behaves. At this point and throughout the course, we avoid using the z-transform. In the laboratory, the students get an introduction to the C50 boards and run several canned filtering demonstrations on them.
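As an illustration of the complex-exponential argument, the following Python sketch iterates a second-order difference equation and compares the result with the real part of z^n, where z = r e^{j omega}. The specific equation, the initial conditions, the function names, and the mapping of parameter choices to the damped, oscillating, and underdamped cases are assumptions made for this example, not material taken from the course.

```python
import cmath
import math


def recursive_response(r, omega, n_samples):
    """Iterate y[n] = 2 r cos(omega) y[n-1] - r^2 y[n-2].

    The initial conditions y[0] = 1 and y[1] = r cos(omega) are an
    assumption of this sketch, chosen so the output equals Re{z^n}.
    """
    a1 = 2.0 * r * math.cos(omega)
    a2 = -r * r
    y = [1.0, r * math.cos(omega)]
    while len(y) < n_samples:
        y.append(a1 * y[-1] + a2 * y[-2])
    return y[:n_samples]


def exponential_response(r, omega, n_samples):
    """Real part of the complex exponential z^n with z = r e^{j omega}."""
    z = cmath.rect(r, omega)
    return [(z ** n).real for n in range(n_samples)]


if __name__ == "__main__":
    # Assumed reading of the three cases: r sets the decay, omega the oscillation.
    cases = {
        "damped (r < 1, omega = 0)": (0.9, 0.0),
        "oscillating (r = 1)": (1.0, math.pi / 8),
        "underdamped (r < 1, omega > 0)": (0.9, math.pi / 8),
    }
    for name, (r, omega) in cases.items():
        y = recursive_response(r, omega, 40)
        z_n = exponential_response(r, omega, 40)
        worst = max(abs(a - b) for a, b in zip(y, z_n))
        print(f"{name}: largest mismatch = {worst:.2e}")
```

The magnitude r controls how quickly the envelope decays and the angle omega controls the oscillation frequency, so the three behaviors can be read directly from where z sits in the complex plane.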

Next, we detail the digital representation of signals on a computer by means of sampling and quantization. In the lecture, we play a speech signal quantized at various levels. To their surprise, the students find that speech quantized at one bit is actually intelligible. By listening, the students perceive the tradeoff between using more bits and gaining perceptual quality. In the laboratory, they discover the limit of perceptual improvement as they increase the number of bits. They also perform simple filtering operations on speech.

Now that we have introduced quantization, we spend the next two weeks talking about embedded digital system architectures, focusing on the C50 fixed-point DSP. In the first laboratory, the students generate tones by using table lookup and by using a difference equation on the C50 boards. We provide the routines to handle the input and output for them so they can concentrate on the algorithm. In the second laboratory, the students generate sequential tones and multiple tones in real time on the C50. These two laboratories prepare the students for a dual-tone multi-frequency (DTMF) generator that they build two laboratories later.
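The two tone-generation methods named above can be compared in a short Python sketch. It uses floating point rather than the fixed-point arithmetic of the C50, and the sampling rate, table size, and function names are all assumptions made for illustration; the laboratory code itself is not reproduced here.

```python
import math

SAMPLE_RATE = 8000   # Hz; assumed for this sketch
TABLE_SIZE = 256     # entries in one stored sine period; assumed
SINE_TABLE = [math.sin(2.0 * math.pi * i / TABLE_SIZE) for i in range(TABLE_SIZE)]


def tone_table_lookup(freq_hz, n_samples):
    """Step a phase accumulator through the stored sine period (no interpolation)."""
    step = freq_hz * TABLE_SIZE / SAMPLE_RATE
    phase = 0.0
    out = []
    for _ in range(n_samples):
        out.append(SINE_TABLE[int(phase) % TABLE_SIZE])
        phase = (phase + step) % TABLE_SIZE
    return out


def tone_difference_equation(freq_hz, n_samples):
    """Generate sin(w n) with y[n] = 2 cos(w) y[n-1] - y[n-2]."""
    w = 2.0 * math.pi * freq_hz / SAMPLE_RATE
    coeff = 2.0 * math.cos(w)
    y = [0.0, math.sin(w)]          # y[0] = sin(0), y[1] = sin(w)
    while len(y) < n_samples:
        y.append(coeff * y[-1] - y[-2])
    return y[:n_samples]


def dual_tone(freq_a_hz, freq_b_hz, n_samples):
    """Average two tones, as a DTMF-style generator would for one key."""
    a = tone_difference_equation(freq_a_hz, n_samples)
    b = tone_difference_equation(freq_b_hz, n_samples)
    return [0.5 * (p + q) for p, q in zip(a, b)]


if __name__ == "__main__":
    # 697 Hz and 1209 Hz are the standard DTMF row and column tones for key "1".
    samples = dual_tone(697.0, 1209.0, 8)
    print([round(s, 4) for s in samples])
```

Table lookup costs memory and, without interpolation, introduces a small phase error for frequencies that do not land on table entries. The difference equation needs only two stored samples and one coefficient per tone, but in fixed-point arithmetic its amplitude can drift over time.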

The next two topics are advanced speech processing topics: pitch shifting and speech recognition. In the first lecture, we discuss simple models of how speech is produced, and relate the models to difference equations. We discuss pitch and various ways to measure it. Then, we give an example of pitch shifting by playing a Laurie Anderson CD. In the laboratory, the students run speech through a linear predictive coder (LPC). They are given the infrastructure to compute LPC coefficients. The students figure out how to window the speech and synthesize the same speech from the LPC model. Creative students use a variety of excitation models. In the second lecture, we introduce speech recognition, and in the laboratory, students implement pieces of a simple speech recognition system.
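To make the LPC analysis and synthesis steps concrete, here is a minimal Python sketch that uses the autocorrelation method with the Levinson-Durbin recursion and then drives the resulting all-pole difference equation with an excitation signal. The source does not say which LPC method the laboratory infrastructure used, so the method, the Hamming window, and all function names are assumptions of this example.

```python
import math


def hamming(n):
    """Hamming window of length n."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]


def autocorrelation(x, max_lag):
    """r[k] = sum over n of x[n] x[n-k], for k = 0..max_lag."""
    return [sum(x[n] * x[n - k] for n in range(k, len(x))) for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations.

    Returns (coeffs, err) where the predictor is
    x_hat[n] = sum_j coeffs[j-1] * x[n-j], j = 1..order,
    and err is the remaining prediction-error energy.
    """
    a = [0.0] * (order + 1)          # a[1..order]; a[0] is unused
    err = r[0]
    for i in range(1, order + 1):
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:], err


def synthesize(coeffs, excitation):
    """All-pole synthesis: y[n] = e[n] + sum_j coeffs[j-1] * y[n-j]."""
    out = []
    for n, e in enumerate(excitation):
        acc = e
        for j, a in enumerate(coeffs, start=1):
            if n - j >= 0:
                acc += a * out[n - j]
        out.append(acc)
    return out


if __name__ == "__main__":
    # Stand-in "speech" frame: the impulse response of a known two-pole model.
    true_coeffs = [2.0 * 0.9 * math.cos(0.3), -0.81]
    frame = synthesize(true_coeffs, [1.0] + [0.0] * 239)
    windowed = [s * w for s, w in zip(frame, hamming(len(frame)))]
    coeffs, err = levinson_durbin(autocorrelation(windowed, 2), 2)
    print("estimated coefficients from the windowed frame:", coeffs)
```

Swapping the excitation passed to `synthesize` (an impulse train for voiced sounds, noise for unvoiced sounds) is one way to experiment with different excitation models.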

Brian L. Evans, 211-105 Cory Hall, Berkeley, CA 94720-1772