Next: EEL6825: HW#1 Up: EEL6825: Pattern Recognition Fall Previous: EEL6825: Course Outline

EEL6825: Projects

Due December 10, 1996 at 3pm.

Your final project accounts for a significant portion of the grade in this class. Everyone must have a tentative project idea by Thursday, October 31. Important dates are as follows:

Your final grade for the project will be based on the on-time completion and quality of each of the above items. Your final report must include the following topics:
  1. Linear Classifier results.
  2. Bayes Classifier results.
  3. k-NN Classifier results.
  4. Dimensionality reduction using KL Transform or other technique.
  5. A twist (something new and different). Some examples are given below.
  6. An interpretation of the results. For example, what do the results tell you about the data or the classifiers that you are using?
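As one illustration of the classifier items above, here is a minimal sketch of a k-NN classifier with a leave-one-out error estimate. It is written in Python using only the standard library, although the report itself calls for MATLAB. The function names `knn_predict` and `leave_one_out_error`, the choice of Euclidean distance, and the toy data are assumptions made for this example; none of them comes from the course material.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Sort training indices by Euclidean distance to the query point.
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    # most_common(1) returns the label with the highest vote count.
    return votes.most_common(1)[0][0]

def leave_one_out_error(xs, ys, k=3):
    """Fraction of samples misclassified when each one is held out in turn."""
    errors = 0
    for i in range(len(xs)):
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        if knn_predict(rest_x, rest_y, xs[i], k) != ys[i]:
            errors += 1
    return errors / len(xs)

# Toy two-class data, made up purely for illustration.
xs = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (1.0, 1.0), (0.9, 1.1), (1.1, 0.9)]
ys = ["a", "a", "a", "b", "b", "b"]
print(knn_predict(xs, ys, (0.15, 0.15), k=3))
print(leave_one_out_error(xs, ys, k=1))
```

Leave-one-out is only one way to estimate the error rate; a separate held-out test set would serve equally well when enough data is available.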
Your final project report should be written as if it were to be submitted to a conference and therefore should contain the following components:
  1. A short literature review about the topic. You should include at least one reference to a paper you have read (not a textbook).
  2. A concise description of the problem.
  3. A detailed description of your solution to the problem.
  4. MATLAB simulation results.
  5. A discussion of the significance of these results.
  6. The appendix should contain complete MATLAB code, messy derivations and any other information too detailed to keep in the main body.
You are strongly encouraged to come up with your own idea for a project based on your own experience; however, some suggestions include:
  1. Study the change in error rate with respect to. These experiments might be best done with synthetic data.
  2. Study some other classifiers that we haven't talked about in class and compare them to the conventional methods we have discussed. These methods might include piecewise classifiers or neural networks (if you have already taken this course).
  3. Choose a novel domain that requires some special consideration or feature extraction. You may find some interesting data through the Internet. For example, take a look at some of the benchmark data sets given in
  4. As was said in class, the following homeworks will be assigned during the second half of this semester: Speech and character recognition are both very challenging problems but would both make excellent projects. We just have to make sure that you do something more involved (or different) than what we will be doing in the homework. For both of these problems, feature extraction is the key step.
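For the synthetic-data suggestion in the list above, the following sketch shows one way such an experiment could be set up. The quantity to vary is not specified in the text, so this example arbitrarily varies the separation between two class means; that choice, the one-dimensional unit-variance Gaussian classes, the equal priors, and the function names are all assumptions made for illustration. It is written in Python with the standard library only, not the MATLAB used in the course.

```python
import math
import random

def bayes_error(separation):
    """Theoretical error for two equal-prior, unit-variance 1-D Gaussians
    whose means are `separation` apart: Q(separation / 2)."""
    return 0.5 * math.erfc(separation / (2.0 * math.sqrt(2.0)))

def empirical_error(separation, n_per_class, rng):
    """Monte Carlo error of the midpoint-threshold rule on synthetic samples."""
    half = separation / 2.0
    errors = 0
    for _ in range(n_per_class):
        # Class 0 is centered at -half; deciding "class 1" (x > 0) is an error.
        if rng.gauss(-half, 1.0) > 0.0:
            errors += 1
        # Class 1 is centered at +half; deciding "class 0" (x <= 0) is an error.
        if rng.gauss(half, 1.0) <= 0.0:
            errors += 1
    return errors / (2 * n_per_class)

rng = random.Random(0)  # fixed seed so the run is repeatable
for sep in (0.5, 1.0, 2.0, 4.0):
    print(sep, round(bayes_error(sep), 4), round(empirical_error(sep, 5000, rng), 4))
```

With synthetic data the true distributions are known, so the measured error can be compared against the theoretical value, which is not possible with real data sets.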
Some example projects worked on last year included (from the CMU database)

The following datasets are available in the UCI database, and some have been used in past years for projects:
  1. Wisconsin Breast cancer databases: Currently contains 699 instances, 2 classes (malignant and benign), 9 integer-valued attributes
  2. Credit Screening Database: a good mix of attributes (continuous, nominal with small numbers of values, and nominal with larger numbers of values), 690 instances, 15 attributes, some with missing values.
  3. Echocardiogram database: Documentation: sufficient, 13 numeric-valued attributes, Binary classification: patient either alive or dead after survival period
  4. Glass Identification database: Documentation: completed, 6 types of glass defined in terms of their oxide content (i.e. Na, Fe, K, etc.), All attributes are numeric-valued
  5. David Slate's letter recognition database (real): 20,000 instances (712565 bytes) (.Z available), 17 attributes: 1 class (letter category) and 16 numeric (integer), No missing attribute values.
  6. Mushrooms described in terms of their physical characteristics and classified as poisonous or edible (Audubon Society Field Guide): Documentation: complete, but missing statistical information, All attributes are nominal-valued, Large database: 8124 instances (2480 missing values for attribute #12)
  7. Congressional voting records classified into Republican or Democrat (1984 United States Congressional Voting Records): Documentation: completed, All attributes are Boolean-valued; plenty of missing values; 2 classes
  8. Wine Recognition database: Using chemical analysis determine the origin of wines, 13 attributes (all continuous), 3 classes, no missing values, 178 instances.
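Several of the datasets above contain missing values, which have to be handled before any classifier can be applied. The sketch below shows one simple approach: parse the records and discard the incomplete ones. It assumes comma-separated rows, a `?` marker for missing entries, and the class label in the last column; these are assumptions for illustration, and the actual file layout of each database should be checked in its documentation. The function names and the sample rows are hypothetical. The code is Python (standard library only) rather than MATLAB.

```python
def parse_records(lines, label_index=-1, missing="?"):
    """Split comma-separated rows into (features, label) pairs.

    Entries equal to `missing` become None; other feature values are
    converted to float when possible and kept as strings otherwise.
    """
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        fields = [f.strip() for f in line.split(",")]
        label = fields.pop(label_index)
        features = []
        for f in fields:
            if f == missing:
                features.append(None)
            else:
                try:
                    features.append(float(f))
                except ValueError:
                    features.append(f)  # nominal attribute, kept as text
        records.append((features, label))
    return records

def drop_incomplete(records):
    """Keep only records with no missing feature values."""
    return [(x, y) for x, y in records if None not in x]

# Made-up rows in the assumed format, not taken from any real database.
sample = ["5,1,1,benign", "8,?,4,malignant", "", "3,2,1,benign"]
records = parse_records(sample)
complete = drop_incomplete(records)
print(len(records), len(complete))
```

Dropping incomplete records is the simplest option; when many values are missing, as in the voting records, substituting a per-attribute estimate may waste less data.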



John Harris
Tue Nov 19 07:44:32 EST 1996