EEL6586: HW#1

Due Monday, February 5, 2001 in class. Late homework will lose $e^{\#\text{ of days late}} - 1$ percentage points. To see the current late penalty, click on
http://www.cnel.ufl.edu/analog/harris/latepoints.html
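
As a rough illustration of how quickly the late penalty grows, the formula above can be evaluated directly. This is only a sketch: the function name `late_penalty` is made up here, and whether lateness is counted in whole or fractional days is not stated in this handout, so the linked page remains the authority.

```python
import math

def late_penalty(days_late):
    """Percentage points lost under the rule e^(days late) - 1."""
    return math.exp(days_late) - 1.0

# Print the penalty for the first few days (whole days are an assumption).
for days in range(5):
    print(days, "day(s) late:", round(late_penalty(days), 2), "points")
```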

In this assignment you will record a phoneme and try to match it using the synthesis techniques discussed in class. You must both hand in your homework AND email your three audio files to the TA (Mark Skowronski, markskow@cnel.ufl.edu) by the due date/time. Make sure that you properly answer all of the questions and describe your solution technique for each problem. You may talk to other students; in fact, you are strongly encouraged to do so. However, the final work and MATLAB code you turn in MUST be your own. Some parts of this assignment are open-ended, with many possible solution methods.

Your writeup should contain an appendix that includes all of the MATLAB code that you wrote for this assignment. You do not need to include any of the code in Parts A, B, or C, but you should describe your solution technique in these parts.

PART A: Recording Speech

A1: Record yourself on a computer saying the phoneme /i/ for about 0.5 seconds. Remember that /i/ is the vowel sound in "heed". Hand in the sound file. The usual format is an 8 kHz .wav file.
A2: Hand in a portion of the time domain plots for the phoneme showing a few pitch periods. The axes of this plot (and all plots) should be clearly labelled. Clearly indicate the pitch period and note its numerical value.
A3: Plot the magnitude spectrum of the phoneme. Clearly indicate the values of F₁, F₂ and F₃ on the graphs.
A4: Estimate the bandwidth of each formant using whichever definition you like. However, be sure to explain your calculation.
A5: Plot the spectrogram of each vowel. Show results from using both short and long windowing functions.
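
One way to cross-check the pitch period you read off the time-domain plot in A2 is an autocorrelation peak search. The sketch below is only one possible approach, not the method from class. It runs on a synthetic test signal, and the function name, the 60 to 400 Hz search range, and the 125 Hz test pitch are all assumptions chosen for illustration.

```python
import math

def pitch_period_samples(x, fs, f_min=60.0, f_max=400.0):
    """Return the lag (in samples) with the largest autocorrelation
    inside an assumed plausible pitch range [f_min, f_max]."""
    lag_lo = int(fs / f_max)
    lag_hi = int(fs / f_min)
    best_lag, best_val = lag_lo, float("-inf")
    for lag in range(lag_lo, lag_hi + 1):
        # Unnormalized autocorrelation at this lag.
        r = sum(x[n] * x[n + lag] for n in range(len(x) - lag))
        if r > best_val:
            best_lag, best_val = lag, r
    return best_lag

fs = 8000          # sampling rate in Hz, as in A1
f0 = 125.0         # hypothetical pitch of the synthetic test signal
x = [math.sin(2 * math.pi * f0 * n / fs)
     + 0.5 * math.sin(2 * math.pi * 2 * f0 * n / fs)
     for n in range(1024)]

lag = pitch_period_samples(x, fs)
period_ms = 1000.0 * lag / fs
print("pitch period:", lag, "samples =", period_ms, "ms, f0 =", fs / lag, "Hz")
```

On recorded speech the peak is less clean than on this test signal, so the value should still be confirmed against the plot.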

PART B: Formant Synthesis

B1: Write a MATLAB program that can filter a signal using the sum of three bandpass filters. Each bandpass filter will be specified by a center frequency and a bandwidth. This is an open-ended question; use your best judgment in the filter design, but explain your reasoning. Hint: if your formants are too narrow then your phoneme will sound like a musical tone.
B2: Use the code in [B1] to filter a train of impulses of appropriate pitch to mimic the vowel sounds from part A.
B3: Filter a train of Rosenberg pulses; assume a duty cycle of about 50%. Feel free to tweak other parameters in order to improve the quality of the sound. Hand in a 1/2 second sound file (8 kHz .wav file) of your best synthetic sound.
B4: Plot time and frequency domain representations of the vowel (don't use spectrograms). Compare your synthetic sound results to the recorded sound. Do they seem reasonably close?
B5: Listen to the real and synthetic sounds. Do they sound reasonably close?
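
Since B1 is open-ended, the following is only one possible reading of "sum of three bandpass filters": three two-pole digital resonators run in parallel. The pole radius r = exp(-pi*B/fs) and pole angle 2*pi*F/fs are a common textbook mapping from bandwidth and center frequency, not necessarily the design intended in class. The function names and the formant values in the usage lines are hypothetical placeholders; use the values you measured in Part A. Peak gains are left unnormalized.

```python
import cmath
import math

def resonator_coeffs(fc, bw, fs):
    """Feedback coefficients (a1, a2) of y[n] = x[n] + a1*y[n-1] + a2*y[n-2]."""
    r = math.exp(-math.pi * bw / fs)      # pole radius set by the bandwidth
    theta = 2.0 * math.pi * fc / fs       # pole angle set by the center frequency
    return 2.0 * r * math.cos(theta), -r * r

def resonate(x, fc, bw, fs):
    """Filter the sequence x through one two-pole resonator."""
    a1, a2 = resonator_coeffs(fc, bw, fs)
    y, y1, y2 = [], 0.0, 0.0
    for xn in x:
        yn = xn + a1 * y1 + a2 * y2
        y.append(yn)
        y1, y2 = yn, y1
    return y

def formant_bank(x, formants, fs):
    """Sum the outputs of one resonator per (center frequency, bandwidth) pair."""
    out = [0.0] * len(x)
    for fc, bw in formants:
        out = [o + v for o, v in zip(out, resonate(x, fc, bw, fs))]
    return out

def gain(fc, bw, fs, f):
    """Magnitude response of one resonator at frequency f (Hz)."""
    a1, a2 = resonator_coeffs(fc, bw, fs)
    z = cmath.exp(-2j * math.pi * f / fs)
    return 1.0 / abs(1.0 - a1 * z - a2 * z * z)

fs = 8000
# Hypothetical placeholder formants (Hz) and bandwidths (Hz); replace with your own.
formants = [(270.0, 60.0), (2290.0, 90.0), (3010.0, 150.0)]
impulse = [1.0] + [0.0] * 255
h = formant_bank(impulse, formants, fs)   # impulse response of the filter bank
```

Widening the bandwidth argument pulls the poles away from the unit circle, which is the knob the hint in B1 refers to.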

PART C: Articulatory Synthesis

C1: Implement the discrete-time vocal tract model discussed in class. Discretize the length $\ell$ vocal tract into N stages.
C2: Produce the /i/ phoneme by filtering a train of impulses through the digital vocal tract model. Using appropriate lip and glottis models, produce a segment of speech for this vowel. Remember that this method produces a signal whose sampling rate may be much higher than is reasonable; we only trust frequencies up to about 4 kHz. Also, the model does not provide for losses in the vocal tract, so some of the higher frequencies are not attenuated as much as they should be. For the area function, use the values given on the last page.
C3: Instead of using a train of impulses as input, use a train of Rosenberg pulses. Adjust the vocal tract length $\ell$ and do anything else to improve the quality of the sound. Hand in a 1/2 second sound file (8 kHz .wav file) of your best synthetic sound.
C4: Plot time and frequency domain representations of the vowel (don't use spectrograms). Compare your synthetic sound results to the recorded sound. Do they seem reasonably close?
C5: Listen to the real and synthetic sounds. Do they sound reasonably close?
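
The Rosenberg pulse train used as excitation in B3 and C3 can be sketched as follows. This uses the common trigonometric form of the pulse (a raised-cosine opening phase followed by a quarter-cosine closing phase), which may differ in detail from the version given in class. The function names, the 2/3 split between opening and closing, and the 125 Hz pitch are assumptions; only the roughly 50% duty cycle, the 8 kHz rate, and the 1/2 second duration come from the assignment.

```python
import math

def rosenberg_pulse(n_open, n_close):
    """One glottal pulse: raised-cosine rise over n_open samples,
    then a quarter-cosine fall over n_close samples."""
    opening = [0.5 * (1.0 - math.cos(math.pi * n / n_open)) for n in range(n_open)]
    closing = [math.cos(math.pi * n / (2.0 * n_close)) for n in range(n_close)]
    return opening + closing

def pulse_train(fs, f0, duration, duty=0.5, open_fraction=2.0 / 3.0):
    """Repeat one Rosenberg pulse per pitch period, zero for the rest of the period."""
    period = int(round(fs / f0))
    n_active = int(round(duty * period))             # samples with the glottis open
    n_open = max(1, int(round(open_fraction * n_active)))
    n_close = max(1, n_active - n_open)
    one_period = rosenberg_pulse(n_open, n_close)
    one_period += [0.0] * (period - len(one_period))  # closed phase
    total = int(round(fs * duration))
    reps = total // period + 1
    return (one_period * reps)[:total]

# 0.5 s of excitation at 8 kHz with a hypothetical 125 Hz pitch.
excitation = pulse_train(fs=8000, f0=125.0, duration=0.5)
```

Changing `duty` or `open_fraction` is one of the "other parameters" you may tweak when tuning the sound.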

Vocal Tract Area Function for /i/ for a Male Speaker

Distance from Glottis (cm)	Cross section area (cm²)
0.0	3.2
0.5	2.6
1.0	2.0
1.5	2.0
2.0	8.0
2.5	8.0
3.0	10.5
3.5	10.5
4.0	10.5
4.5	10.5
5.0	10.5
5.5	10.5
6.0	10.5
6.5	10.5
7.0	10.5
7.5	8.0
8.0	8.0
8.5	6.5
9.0	4.0
9.5	2.6
10.0	1.3
10.5	0.65
11.0	0.65
11.5	0.65
12.0	0.65
12.5	0.65
13.0	0.65
13.5	0.65
14.0	1.0
14.5	1.3
15.0	1.6
15.5	3.2
16.0	4.0
16.5	4.0
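
As a starting point for Part C, the area function above can be turned into junction reflection coefficients for a lossless concatenated-tube model. This is a sketch under stated assumptions: it uses the sign convention r_k = (A_{k+1} - A_k) / (A_{k+1} + A_k), which may differ from the one used in class; it treats each table row as one 0.5 cm section; and it takes the speed of sound as 35000 cm/s. The implied model rate c*N/(2*length) illustrates why C2 warns that the natural sampling rate is much higher than 8 kHz.

```python
# Cross section areas in cm^2, one per 0.5 cm section, glottis to lips.
AREAS_CM2 = ([3.2, 2.6, 2.0, 2.0, 8.0, 8.0]
             + [10.5] * 9
             + [8.0, 8.0, 6.5, 4.0, 2.6, 1.3]
             + [0.65] * 7
             + [1.0, 1.3, 1.6, 3.2, 4.0, 4.0])

def reflection_coefficients(areas):
    """Reflection coefficient at each junction between adjacent sections
    (assumed convention: positive when the tube widens toward the lips)."""
    return [(a_next - a) / (a_next + a) for a, a_next in zip(areas, areas[1:])]

SECTION_CM = 0.5            # assumed section length, from the table spacing
SPEED_OF_SOUND = 35000.0    # cm/s, assumed value

n_sections = len(AREAS_CM2)
tract_length = n_sections * SECTION_CM                         # cm
refl = reflection_coefficients(AREAS_CM2)
fs_model = SPEED_OF_SOUND * n_sections / (2.0 * tract_length)  # Hz

print("sections:", n_sections, "length:", tract_length, "cm")
print("implied model sampling rate:", fs_model, "Hz")
```

Junctions between equal areas give a coefficient of zero, so only the changes in the area function contribute reflections.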

Dr John Harris
2001-04-05
