Next: EEL6586: HW#2
Up: EEL6586: Homework Assignments
Previous: EEL6586: Homework Assignments
Due Monday, February 5, 2001, in class. Late homework will lose
percentage points; the current late penalty is posted at
http://www.cnel.ufl.edu/analog/harris/latepoints.html
In this assignment you will record a phoneme and try to match it using
the synthesis techniques discussed in class. You must both hand in
your homework AND email your three audio files to the TA (Mark
Skowronski, markskow@cnel.ufl.edu) by the due date/time. Make sure
that you properly answer all of the questions and describe your
solution technique for each problem. You may talk to other students;
in fact, you are strongly encouraged to do so. However, the final work
and MATLAB code you turn in MUST be your own. Some parts of this
assignment are open-ended, with many possible solution methods.
Your writeup should contain an appendix that includes all of the MATLAB code
that you wrote for this assignment. You do not need to include any code in
the body of Parts A, B, or C, but you should describe your solution technique
in these parts.
PART A: Recording Speech
- A1
- Record yourself on a computer saying the phoneme /i/
for about 0.5 seconds. Remember that /i/ is the vowel sound in "heed".
Hand in the sound file. The usual format is an 8 kHz .wav file.
- A2
- Hand in a portion of the time-domain plot of the phoneme
showing a few pitch periods. The axes of this plot (and all plots)
should be clearly labelled. Clearly indicate the pitch period and
note its numerical value.
- A3
- Plot the magnitude spectrum of the phoneme. Clearly
indicate the values of F1, F2, and F3 on the graph.
- A4
- Estimate the bandwidth of each formant using whichever
definition you like. However, be sure to explain your calculation.
- A5
- Plot the spectrogram of the vowel. Show results from using
both short and long window functions.
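
As a cross-check on the pitch period read off the plot in A2, one option is to look for the largest autocorrelation peak within a plausible range of pitch lags. The sketch below is written in Python purely for illustration (the assignment itself asks for MATLAB). The function name `estimate_pitch_period`, the 50 to 400 Hz search range, and the synthetic test signal are all assumptions, not part of the assignment.

```python
import math

def estimate_pitch_period(x, fs, f_min=50.0, f_max=400.0):
    """Return the lag, in samples, of the largest autocorrelation value
    among lags corresponding to pitches between f_min and f_max (Hz)."""
    lag_min = int(fs / f_max)
    lag_max = int(fs / f_min)
    mean = sum(x) / len(x)
    centered = [v - mean for v in x]  # remove any DC offset first
    best_lag, best_value = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        value = sum(centered[n] * centered[n + lag]
                    for n in range(len(centered) - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    return best_lag

# Synthetic stand-in for a recorded vowel (assumption): a decaying 1 kHz
# oscillation restarted every 64 samples, i.e. a 125 Hz pitch at 8 kHz.
fs = 8000
true_period = 64
one_cycle = [math.exp(-k / 10.0) * math.cos(2.0 * math.pi * k / 8.0)
             for k in range(true_period)]
signal = [one_cycle[n % true_period] for n in range(2000)]

period = estimate_pitch_period(signal, fs)
print("pitch period:", period, "samples =", 1000.0 * period / fs, "ms")
```

On real speech the autocorrelation can peak at twice the true period, so the estimate is best treated as a sanity check against the value marked on the time-domain plot.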
PART B: Formant Synthesis
- B1
- Write a MATLAB program that can filter a signal using the
sum of three bandpass filters. Each bandpass filter will be
specified by a center frequency and a bandwidth. This is an
open-ended question; use your best judgment in the filter design,
but explain your reasoning. Hint: if your formants are too narrow, your phoneme will sound like a musical tone.
- B2
- Use the code in [B1] to filter a train of impulses of
appropriate pitch to mimic the vowel sound from Part A.
- B3
- Filter a train of Rosenberg pulses, assuming a duty
cycle of about 50%. Feel free to tweak other parameters in order
to improve the quality of the sound.
Hand in a 0.5-second
sound file (8 kHz .wav file) of your best synthetic sound.
- B4
- Plot time- and frequency-domain representations of the
vowel (don't use spectrograms). Compare your synthetic sound
results to the recorded sound. Do they seem reasonably close?
- B5
- Listen to the real and synthetic sounds. Do they sound
reasonably close?
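
Part B leaves the filter design open, so the following is only one possible reading of "a bandpass filter specified by a center frequency and a bandwidth": a two-pole resonator with zeros at z = 1 and z = -1, pole angle set by the center frequency and pole radius r = exp(-pi*BW/fs). It is a Python sketch for illustration (the assignment asks for MATLAB). The helper names and the formant and bandwidth values below are placeholders to be replaced with your own measurements from Part A.

```python
import cmath
import math

def resonator_coeffs(fc, bw, fs):
    """Gain and denominator coefficients of a two-pole bandpass resonator
    H(z) = g (1 - z^-2) / (1 + a1 z^-1 + a2 z^-2), with |H| = 1 at fc."""
    r = math.exp(-math.pi * bw / fs)      # pole radius from bandwidth
    theta = 2.0 * math.pi * fc / fs       # pole angle from center frequency
    a1 = -2.0 * r * math.cos(theta)
    a2 = r * r
    z = cmath.exp(1j * theta)
    h = (1 - z**-2) / (1 + a1 * z**-1 + a2 * z**-2)
    return 1.0 / abs(h), a1, a2

def bandpass(x, fc, bw, fs):
    """Filter the list x with one resonator (direct-form difference equation)."""
    g, a1, a2 = resonator_coeffs(fc, bw, fs)
    y = []
    x1 = x2 = y1 = y2 = 0.0
    for xn in x:
        yn = g * (xn - x2) - a1 * y1 - a2 * y2
        y.append(yn)
        x2, x1 = x1, xn
        y2, y1 = y1, yn
    return y

def formant_filter(x, formants, fs):
    """Sum the outputs of one resonator per (center, bandwidth) pair."""
    out = [0.0] * len(x)
    for fc, bw in formants:
        for n, v in enumerate(bandpass(x, fc, bw, fs)):
            out[n] += v
    return out

def impulse_train(n_samples, period):
    """Unit impulses every `period` samples."""
    return [1.0 if n % period == 0 else 0.0 for n in range(n_samples)]

fs = 8000
excitation = impulse_train(fs // 2, period=64)   # 0.5 s at a 125 Hz pitch
# Placeholder (center Hz, bandwidth Hz) pairs; use your measured values.
formants = [(270.0, 60.0), (2290.0, 90.0), (3010.0, 150.0)]
vowel = formant_filter(excitation, formants, fs)
```

Widening the bandwidths pulls the poles away from the unit circle and shortens the ringing after each impulse, which relates to the hint in B1 about narrow formants sounding like a musical tone.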
PART C: Articulatory Synthesis
- C1
- Implement the discrete-time vocal tract model discussed in
class. Discretize the length of the
vocal tract into N stages.
- C2
- Produce the /i/ phoneme by filtering a train of impulses
through the digital vocal tract model. Using appropriate lip and
glottis models, produce a segment of speech for this vowel.
Remember that this method produces a signal that could be sampled
at a much higher rate than is reasonable. We only trust frequencies
up to about 4 kHz. Also, the model does not provide for losses in the vocal
tract, so some of the higher frequencies are not attenuated as much
as they should be. For the area function, use the values given in the table at the end of this assignment.
- C3
- Instead of using a train of impulses as input, use a train
of Rosenberg pulses. Adjust the vocal tract length
and do
anything else to improve the quality of the sound.
Hand in a 0.5-second
sound file (8 kHz .wav file) of your best synthetic sound.
- C4
- Plot time- and frequency-domain representations of the
vowel (don't use spectrograms). Compare your synthetic sound
results to the recorded sound. Do they seem reasonably close?
- C5
- Listen to the real and synthetic sounds. Do they sound
reasonably close?
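
Both B3 and C3 call for a train of Rosenberg pulses. The Python sketch below (for illustration only; the assignment asks for MATLAB) uses the common trigonometric form of the pulse: a raised-cosine opening phase followed by a quarter-cosine closing phase, then zero for the rest of the period. The function names and the `rise_fraction` split between opening and closing are assumptions; only the roughly 50% duty cycle comes from the assignment.

```python
import math

def rosenberg_pulse(period, duty=0.5, rise_fraction=0.67):
    """One pitch period of a trigonometric Rosenberg glottal pulse.

    period        : pitch period in samples
    duty          : fraction of the period during which the glottis is open
    rise_fraction : share of the open phase spent opening (assumed value)
    """
    n_open = int(round(duty * period))
    n1 = max(1, int(round(rise_fraction * n_open)))   # opening samples
    n2 = max(1, n_open - n1)                          # closing samples
    pulse = []
    for n in range(period):
        if n <= n1:
            pulse.append(0.5 * (1.0 - math.cos(math.pi * n / n1)))
        elif n <= n1 + n2:
            pulse.append(math.cos(math.pi * (n - n1) / (2.0 * n2)))
        else:
            pulse.append(0.0)                         # closed phase
    return pulse

def rosenberg_train(n_samples, period, **kwargs):
    """Repeat the pulse back to back for n_samples samples."""
    pulse = rosenberg_pulse(period, **kwargs)
    return [pulse[n % period] for n in range(n_samples)]

# 0.5 s of excitation at 8 kHz with a 64-sample (125 Hz) pitch period.
excitation = rosenberg_train(4000, period=64)
```

Because the pulse is smooth, its spectrum falls off faster than that of an impulse train, so the same filter typically produces a less buzzy sound with this excitation.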
Vocal Tract Area Function for /i/ for a Male Speaker
Distance from glottis (cm) | Cross-section area (cm^2)
0.0 | 3.2
0.5 | 2.6
1.0 | 2.0
1.5 | 2.0
2.0 | 8.0
2.5 | 8.0
3.0 | 10.5
3.5 | 10.5
4.0 | 10.5
4.5 | 10.5
5.0 | 10.5
5.5 | 10.5
6.0 | 10.5
6.5 | 10.5
7.0 | 10.5
7.5 | 8.0
8.0 | 8.0
8.5 | 6.5
9.0 | 4.0
9.5 | 2.6
10.0 | 1.3
10.5 | 0.65
11.0 | 0.65
11.5 | 0.65
12.0 | 0.65
12.5 | 0.65
13.0 | 0.65
13.5 | 0.65
14.0 | 1.0
14.5 | 1.3
15.0 | 1.6
15.5 | 3.2
16.0 | 4.0
16.5 | 4.0
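
To connect the area function above with Part C, the Python sketch below (for illustration only; the assignment asks for MATLAB) computes the junction reflection coefficients of a lossless concatenated-tube model and the sampling rate implied by the 0.5 cm section spacing. It assumes the convention r_k = (A_{k+1} - A_k) / (A_{k+1} + A_k), a speed of sound of 35000 cm/s, and one sample per round trip through a section; the model presented in class may use different conventions, and the function names are hypothetical.

```python
SPEED_OF_SOUND_CM_PER_S = 35000.0   # assumed value
SECTION_LENGTH_CM = 0.5             # spacing of the table entries

# Cross-section areas (cm^2) from the table, glottis first.
AREAS_CM2 = [
    3.2, 2.6, 2.0, 2.0, 8.0, 8.0,
    10.5, 10.5, 10.5, 10.5, 10.5, 10.5, 10.5, 10.5, 10.5,
    8.0, 8.0, 6.5, 4.0, 2.6, 1.3,
    0.65, 0.65, 0.65, 0.65, 0.65, 0.65, 0.65,
    1.0, 1.3, 1.6, 3.2, 4.0, 4.0,
]

def reflection_coefficients(areas):
    """Reflection coefficient at each junction between adjacent sections."""
    return [(a_next - a) / (a_next + a)
            for a, a_next in zip(areas, areas[1:])]

def implied_sample_rate(section_length_cm, c=SPEED_OF_SOUND_CM_PER_S):
    """Sampling rate when one sample equals a section's round-trip delay."""
    return c / (2.0 * section_length_cm)

coeffs = reflection_coefficients(AREAS_CM2)
fs_model = implied_sample_rate(SECTION_LENGTH_CM)
print(len(AREAS_CM2), "sections,", len(coeffs), "junctions")
print("implied sampling rate:", fs_model, "Hz")
```

Under these assumptions the model runs far above the 8 kHz rate of the hand-in files, which is consistent with the remark in C2 about only trusting frequencies up to about 4 kHz; the output would need lowpass filtering and resampling before being saved.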
Dr John Harris
2001-04-05