Project Assignment:
THOROUGHLY READ AND FOLLOW THE
PROJECT GUIDELINES.
These guidelines contain detailed information about how to structure your
project, and how to prepare your written and oral reports.
Your written report should be at most 6 pages long
(including all the graphs, figures, and appendices).
The font size should be no smaller than 11 pt.
- Data Mining Technique(s):
Use Instance-based Learning and Regression techniques to construct
models for the dataset and target attributes described below.
Use the following methods as implemented in the Weka system or in Matlab,
or implement your own code.
- Instance-based Learning:
- IB1: nearest neighbor classification
- IBk: k-nearest neighbors classification. Experiment with several values of k.
- Regression:
- Linear Regression
- LWR: Locally Weighted Regression
[To run locally weighted linear regression in Weka, use LWL
(locally weighted learning) from Weka's lazy classifiers, and
select "Linear Regression" as LWL's classifier option.]
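For those implementing their own code, the idea behind locally weighted regression can be sketched as follows. This is a minimal illustration for a single numeric input, assuming a Gaussian weighting kernel with a hypothetical bandwidth parameter `tau`; it is not Weka's LWL implementation, and the function name `lwr_predict` is made up for this example.

```python
import math

def lwr_predict(xs, ys, x_query, tau=1.0):
    """Predict y at x_query by fitting a weighted line around x_query (1-D inputs)."""
    # Gaussian kernel weights (an assumption): training points near x_query count more.
    ws = [math.exp(-((x - x_query) ** 2) / (2.0 * tau ** 2)) for x in xs]
    sw = sum(ws)
    if sw == 0.0:
        raise ValueError("all weights underflowed to zero; try a larger tau")
    # Weighted sums needed for the closed-form weighted least-squares line.
    swx = sum(w * x for w, x in zip(ws, xs))
    swy = sum(w * y for w, y in zip(ws, ys))
    swxx = sum(w * x * x for w, x in zip(ws, xs))
    swxy = sum(w * x * y for w, x, y in zip(ws, xs, ys))
    denom = sw * swxx - swx ** 2
    if abs(denom) < 1e-12:
        # Degenerate case (e.g. all x equal): fall back to the weighted mean of y.
        return swy / sw
    slope = (sw * swxy - swx * swy) / denom
    intercept = (swy - slope * swx) / sw
    return intercept + slope * x_query

# Toy data on the line y = 2x + 1; a local linear fit should recover it.
xs = [float(i) for i in range(10)]
ys = [2.0 * x + 1.0 for x in xs]
print(lwr_predict(xs, ys, 4.5))
```

A separate line is fitted for every query point, which is why the method is "lazy": all the work happens at prediction time. A small `tau` makes the fit more local, while a very large `tau` approaches ordinary linear regression.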
- Dataset(s):
In this project, we will use one dataset:
- The census-income (also called "adult") dataset from the US Census Bureau,
which is available at the Univ. of California Irvine (UCI) Data Repository.
- Target Attribute:
Run a set of experiments using the given target attribute (nominal), and
then run a separate set of experiments using a numeric attribute of your choice
as the target attribute.
- Training and Testing Instances:
You may restrict your test set to 100 or fewer data instances
(not used for training) to reduce the time taken by the experiments
using these lazy methods.
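One way to set up such an experiment in your own code is sketched below: hold out 100 instances for testing, train on the rest, and compare several values of k. This is only an illustration under stated assumptions. The data is synthetic (two numeric features, made-up labels "A" and "B") rather than the census-income dataset, Euclidean distance is assumed, and the helper names `knn_classify`, `holdout_split`, and `accuracy` are hypothetical.

```python
import random
from collections import Counter

def knn_classify(train, query, k):
    """Return the majority label among the k training instances nearest to query."""
    # train is a list of (feature_vector, label) pairs; squared Euclidean distance
    # gives the same neighbor ordering as Euclidean distance.
    nearest = sorted(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def holdout_split(data, n_test=100, seed=0):
    """Shuffle a copy of data and hold out n_test instances that are never trained on."""
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    return shuffled[n_test:], shuffled[:n_test]

def accuracy(train, test, k):
    """Fraction of test instances whose predicted label matches the true label."""
    correct = sum(knn_classify(train, x, k) == y for x, y in test)
    return correct / len(test)

# Synthetic two-class data (an assumption; not the census-income dataset).
rng = random.Random(42)
data = []
for _ in range(300):
    data.append(([rng.gauss(0, 1), rng.gauss(0, 1)], "A"))
    data.append(([rng.gauss(4, 1), rng.gauss(4, 1)], "B"))

train, test = holdout_split(data, n_test=100)
for k in (1, 3, 5, 9):  # k = 1 corresponds to plain nearest neighbor (IB1)
    print(k, accuracy(train, test, k))
```

Each prediction scans the whole training set, so the cost grows with the number of test instances times the number of training instances. Keeping the test set small is what keeps the run time manageable.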
- Project 6 Grading Sheet.