
Project Handout




The final report is due Wednesday, December 6, at midnight. Late reports incur the usual late penalty. Your final project constitutes a significant portion of your grade in this class. Important dates are as follows:


Your grade for the project will be based on the on-time completion and quality of each of the above.

Project Presentation

You will not be graded on how good a speaker you are, but on the work you have done and how well you have prepared for the talk. Presentations are expected to improve with each day, since later presenters will have had more time to prepare. Everyone is expected to attend class on each of the five days of student presentations. Please let the instructor know in advance if you cannot attend.

Project Report

Your final report must include the following topics:

1. Linear classifier results.
2. Bayes classifier results.
3. k-NN classifier results.
4. Neural network results.
5. Dimensionality reduction using the KL transform or another technique.
6. A twist (something new and different). Some examples are given below.
7. An interpretation of the results. For example, what do the results tell you about the data or the classifiers that you are using?
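
As a rough illustration of what items 1 and 3 involve, here is a minimal Python sketch that compares a linear classifier with a k-NN classifier on synthetic data. Everything specific in it is an assumption made for the example: the two unit-variance 2-D Gaussian classes and their means, the use of a nearest-mean rule as the linear classifier, the choice k = 5, the sample sizes, and all function names. Your own results are still expected to come from Matlab simulations; this only shows the general shape of such an experiment.

```python
import random
from collections import Counter


def make_data(n_per_class, rng):
    """Draw two 2-D Gaussian classes (unit variance); the means are assumed."""
    data = []
    for label, (mx, my) in enumerate([(0.0, 0.0), (2.0, 2.0)]):
        for _ in range(n_per_class):
            data.append(((rng.gauss(mx, 1.0), rng.gauss(my, 1.0)), label))
    return data


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def nearest_mean_fit(train):
    """Estimate one sample mean per class."""
    sums = {}
    counts = Counter()
    for x, y in train:
        s = sums.setdefault(y, [0.0] * len(x))
        for i, xi in enumerate(x):
            s[i] += xi
        counts[y] += 1
    return {y: tuple(v / counts[y] for v in s) for y, s in sums.items()}


def nearest_mean_predict(means, x):
    """Assign x to the class with the closest mean (a linear boundary)."""
    return min(means, key=lambda y: sq_dist(means[y], x))


def knn_predict(train, x, k=5):
    """Majority vote among the k training points closest to x."""
    neighbors = sorted(train, key=lambda item: sq_dist(item[0], x))[:k]
    votes = Counter(y for _, y in neighbors)
    return votes.most_common(1)[0][0]


def error_rate(predict, test):
    """Fraction of test points that predict() labels incorrectly."""
    wrong = sum(1 for x, y in test if predict(x) != y)
    return wrong / len(test)


rng = random.Random(0)          # fixed seed so the run is repeatable
train = make_data(100, rng)     # 100 training points per class
test = make_data(200, rng)      # 200 independent test points per class

means = nearest_mean_fit(train)
linear_err = error_rate(lambda x: nearest_mean_predict(means, x), test)
knn_err = error_rate(lambda x: knn_predict(train, x, k=5), test)

print(f"nearest-mean error: {linear_err:.3f}")
print(f"5-NN error:         {knn_err:.3f}")
```

Because the two assumed classes share the same spherical covariance and have equal priors, the nearest-mean rule is close to the Bayes classifier here, which makes it a convenient baseline when interpreting the k-NN error rate.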

Your final project report will be a web page, so you do not need to print it out. Just email the address to the instructor. Most word processors are capable of outputting HTML, so this should not be a big hassle. A big advantage of using a web page for your report is that you can include color figures and audio/video signals. If you have never designed a web page before, this is your opportunity to learn. Signe Redfield, the TA, is an expert on web pages and can help you with any problems you might have. The report should be written as if it were to be submitted to a conference and should therefore contain the following components:

1. A concise description of the problem.
2. A summary of previous solutions to the problem. You should include at least one reference to a paper you have read (not a textbook).
3. A detailed description of your solution to the problem.
4. Matlab simulation results.
5. A discussion of the significance of these results and how your solution differs from previous attempts.
6. An appendix containing messy derivations and any other information too detailed to keep in the main body.
7. DO NOT INCLUDE ANY MATLAB CODE ON YOUR WEBPAGE.

Project Topics

You are strongly encouraged to come up with your own idea for a project based on your own experience. Extra points will be given for novelty and creativity. You are welcome to work on two-person projects. Two-person teams need only turn in one project report and send one email per week, but remember that a two-person project is expected to be twice as much work as a one-person project.

1. Study the change in error rate with respect to.

   These experiments are best done with synthetic data.
2. Study some other classifiers that we haven't talked about in class and compare them to the conventional methods we have discussed. These methods might include piecewise classifiers.
3. Choose a novel domain that requires some special consideration or feature extraction. You may find some interesting data through the Internet. For example, take a look at some of the benchmark data sets given in
4. Speech and character recognition are both very challenging problems but would make excellent projects. For both of these problems, feature extraction is the key step.

The following datasets are available in the UCI database and some have been used in past years for projects:

1. Wisconsin Breast Cancer databases: currently contains 699 instances, 2 classes (malignant and benign), 9 integer-valued attributes.
2. Credit Screening database: a good mix of attributes (continuous, nominal with small numbers of values, and nominal with larger numbers of values); 690 instances; 15 attributes, some with missing values.
3. Echocardiogram database: documentation sufficient; 13 numeric-valued attributes; binary classification (patient either alive or dead after survival period).
4. Glass Identification database: documentation complete; 6 types of glass, defined in terms of their oxide content (i.e., Na, Fe, K, etc.); all attributes are numeric-valued.
5. David Slate's letter recognition database (real): 20,000 instances (712565 bytes) (.Z available); 17 attributes: 1 class (letter category) and 16 numeric (integer); no missing attribute values.
6. Mushrooms described in terms of their physical characteristics and classified as poisonous or edible (Audubon Society Field Guide): documentation complete, but missing statistical information; all attributes are nominal-valued; large database: 8124 instances (2480 missing values for attribute #12).
7. Congressional voting records classified into Republican or Democrat (1984 United States Congressional Voting Records): documentation complete; all attributes are Boolean-valued; plenty of missing values; 2 classes.
8. Wine Recognition database: uses chemical analysis to determine the origin of wines; 13 attributes (all continuous), 3 classes, no missing values, 178 instances.


Dr John Harris
2000-12-03