Next: EEL6825: HW#4
Up: EEL6825: Homework Assignments
Previous: EEL6825: HW#2
Due Wednesday, October 21, 1998 in class. Do not be late to class. Late
homework will lose
percentage points.
For details, see
(http://www.cnel.ufl.edu/analog/harris/latepoints.html).
Also, I will not look at any part of your assignment on the computer.
Please hand in a hardcopy of all plots and all of your Matlab code.
PART A: Textbook Problems
- A1
- 4.3 in DH&S
- A2
- Assuming that
and that you are given
n data points from each of two classes.
The Parzen classifier is expressed by
where the superscripts denote the class of each data point.
Prove that the leave-one-out error is larger than the resubstitution
error. Assume that
.
- A3
- Suppose you are given data as
X1 = (1, 0)^T
X2 = (0, 1)^T
X3 = (0, -1)^T
X4 = (0, 0)^T
X5 = (0, 2)^T
X6 = (0, -2)^T
X7 = (-2, 0)^T
Suppose the first 3 are labeled
and the remaining 4 are labeled
.
Sketch the decision boundary resulting from using
the nearest-neighbor rule for classification.
- A4
- Two one-dimensional distributions are given as uniform in
[0,1] for
and uniform in [0,2] for
.
Assuming that
and that an infinite number of samples is
available
- 1.
- Compute the Bayes error.
- 2.
- Compute the expected probability of error for the 1-NN
leave-one-out procedure.
- 3.
- Compute the expected probability of error for the 2-NN leave-one-out
procedure. Do not
include the sample being classified and assume that ties are rejected.
- 4.
- Explain why the 2-NN error computed in part 3 is less than the Bayes
error. Note that you can and should still answer this part even
if you didn't get the above parts correct.
PART B: Computer Experiment: The Mines and Rocks Problem
The programming part of this assignment uses the data set developed by
Gorman and Sejnowski in their study of the classification of sonar signals
using a neural network. The task is to train a network to discriminate
between sonar signals bounced off a metal cylinder and those bounced off a
roughly cylindrical rock.
The file "mines.asc"
(http://www.cnel.ufl.edu/analog/courses/EEL6825/mines.asc)
contains 111 patterns obtained by bouncing sonar signals off a metal
cylinder at various angles and under various conditions. The file
"rocks.asc"
(http://www.cnel.ufl.edu/analog/courses/EEL6825/rocks.asc)
contains 97 patterns obtained from rocks under similar
conditions. The transmitted sonar signal is a frequency-modulated chirp,
rising in amplitude. The data set contains signals obtained from a variety
of different aspect angles, spanning 90 degrees for the cylinder and 180
degrees for the rock. Each pattern is a set of 60 numbers in the range 0.0
to 1.0. Each number represents the energy within a particular frequency
band, integrated over a certain period of time. The integration aperture
for higher frequencies occurs later in time, since these frequencies are
transmitted later during the chirp. A README file in the directory contains
a longer description of the data and past experiments.
- B1
- Compute the sample mean for each class. Design a linear
classifier that chooses the class with the nearest sample mean. Use the
Euclidean distance; don't worry about covariance matrices.
Compute the resubstitution and the
leave-one-out errors. Clearly indicate these results in your answers.
As usual, turn in all code that you write.
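The procedure in B1 can be sketched as follows. This is only an illustration in plain Python (not the Matlab you are asked to hand in), and the names `mean`, `nearest_mean_label`, `resubstitution_error`, and `leave_one_out_error` are hypothetical. It assumes the data are held as a dictionary mapping each class label to a list of equal-length vectors, with at least two points per class.

```python
import math

def mean(points):
    """Component-wise sample mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(column) / n for column in zip(*points)]

def nearest_mean_label(x, means):
    """Label of the class whose sample mean is closest to x (Euclidean)."""
    return min(means, key=lambda c: math.dist(x, means[c]))

def resubstitution_error(data):
    """Estimate the means from all points, then test on the same points."""
    means = {c: mean(pts) for c, pts in data.items()}
    total = sum(len(pts) for pts in data.values())
    wrong = sum(nearest_mean_label(x, means) != c
                for c, pts in data.items() for x in pts)
    return wrong / total

def leave_one_out_error(data):
    """Re-estimate the held-out point's class mean without that point."""
    total = sum(len(pts) for pts in data.values())
    wrong = 0
    for c, pts in data.items():
        for i, x in enumerate(pts):
            reduced = dict(data)               # shallow copy; other classes unchanged
            reduced[c] = pts[:i] + pts[i + 1:]  # drop the point being tested
            means = {k: mean(v) for k, v in reduced.items()}
            wrong += nearest_mean_label(x, means) != c
    return wrong / total
```

Recomputing the mean from scratch for every held-out point is the simplest approach; on larger data sets the held-out mean can instead be obtained from the full mean as (n*m - x)/(n - 1).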
- B2
- Design a nearest-neighbor classifier that chooses the class of the
nearest neighbor for each X. Compute the resubstitution and the
leave-one-out errors. Clearly indicate these results in your answers.
Programming hint: Do not use all of the data points when you are developing
your code. When you are confident that your program is correct, run with
the full number of points. Also, write your code with efficiency in mind.
If the full number of points still takes too long to run, use as many points
as you think reasonable but explain what you have done.
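A minimal sketch of the error counting in B2, again in plain Python for illustration only. The function name `nn_error` and its `leave_one_out` flag are hypothetical, and the data are assumed to be a list of vectors with a parallel list of labels. Squared distances are compared, since the square root does not change which neighbor is nearest.

```python
import math

def nn_error(samples, labels, leave_one_out=True):
    """Fraction of samples whose nearest neighbor carries a different label.

    With leave_one_out=True the sample itself is excluded from the search;
    with leave_one_out=False it is included (resubstitution).
    """
    wrong = 0
    for i, x in enumerate(samples):
        best_j, best_d2 = None, math.inf
        for j, y in enumerate(samples):
            if leave_one_out and j == i:
                continue
            d2 = sum((a - b) ** 2 for a, b in zip(x, y))  # squared Euclidean distance
            if d2 < best_d2:
                best_j, best_d2 = j, d2
        wrong += labels[best_j] != labels[i]
    return wrong / len(samples)
```

The double loop costs on the order of N^2 distance evaluations, which is why developing on a subset of the points first, as the hint suggests, is worthwhile.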
- B3
- Plot a graph that shows the leave-one-out
performance of your classifier that is similar to the d2 display we
discussed in class. The Y-axis represents the distance between each point
in the data set and its nearest neighbor in the mines class. If the data
point happens to come from the mines class, leave it out of the minimum
distance computation. Similarly, the X-axis is the distance between each
data point and its nearest neighbor in the rocks class. (None of the
distances should be exactly zero, since you are using the leave-one-out
method.) Plot a line on the plot that shows your solution to problem B2.
Adjust the offset and slope of this line as best you can to minimize the
error. What is the lowest error that you can get?
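The coordinates for the B3 plot, and the error for a candidate line, might be computed along these lines. This is a hedged sketch in plain Python: `nearest_distance`, `distance_pairs`, and `line_error` are hypothetical names, the plotting itself is left out, and the convention that points below the line are called mines is an assumption of this sketch.

```python
import math

def nearest_distance(x, pool, skip=None):
    """Distance from x to its nearest neighbor in pool, ignoring index skip."""
    return min(math.dist(x, y) for j, y in enumerate(pool) if j != skip)

def distance_pairs(mines, rocks):
    """One (d_rocks, d_mines, label) triple per point.

    A point is left out of the search within its own class (leave-one-out).
    """
    pairs = []
    for i, x in enumerate(mines):
        pairs.append((nearest_distance(x, rocks),
                      nearest_distance(x, mines, skip=i), "mine"))
    for i, x in enumerate(rocks):
        pairs.append((nearest_distance(x, rocks, skip=i),
                      nearest_distance(x, mines), "rock"))
    return pairs

def line_error(pairs, slope=1.0, offset=0.0):
    """Error when points below d_mines = slope * d_rocks + offset are called mines."""
    wrong = 0
    for d_rocks, d_mines, label in pairs:
        predicted = "mine" if d_mines < slope * d_rocks + offset else "rock"
        wrong += predicted != label
    return wrong / len(pairs)
```

Under these conventions, slope 1 and offset 0 assign each point to the class of its nearer neighbor, so `line_error(pairs)` with the defaults should agree with a 1-NN leave-one-out error; other slopes and offsets can then be tried by calling `line_error` repeatedly.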
- B4
- Extra credit (optional). Compute the same errors as problem B2 for
3-NN. Does 3-NN perform better or worse than 1-NN?
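For B4, a k-nearest-neighbor vote under leave-one-out could look like the sketch below (plain Python, illustrative only; `knn_loo_error` is a hypothetical name). It assumes two classes and an odd k, so the majority vote cannot tie.

```python
import math
from collections import Counter

def knn_loo_error(samples, labels, k=3):
    """Leave-one-out error of a k-NN majority vote (Euclidean distance)."""
    wrong = 0
    for i, x in enumerate(samples):
        # Distances to every other sample; the sample itself is excluded.
        others = [(math.dist(x, y), labels[j])
                  for j, y in enumerate(samples) if j != i]
        others.sort(key=lambda pair: pair[0])
        votes = Counter(label for _, label in others[:k])
        predicted = votes.most_common(1)[0][0]
        wrong += predicted != labels[i]
    return wrong / len(samples)
```

Calling it with `k=1` gives the 1-NN leave-one-out error, which makes a direct comparison with `k=3` straightforward.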
Dr John Harris
1998-12-19