Project Assignment:
THOROUGHLY READ AND FOLLOW THE
PROJECT GUIDELINES.
These guidelines contain detailed information about how to structure your
project, and how to prepare your written and oral reports.
*** Your written report should be at most 10 pages long
(including the homework solutions, and all the graphs, figures, and appendices).
The font size should be no smaller than 11 pt.
- Data Mining Technique(s):
Use the Naive Bayes and Bayesian Net classification
methods implemented in Weka and in Matlab.
- Dataset(s):
In this project, we will use two datasets:
- Performance Metric(s):
- Use classification accuracy, time to construct the model,
dependency connections in the Bayesian graph, conditional probability
tables (CPTs), and any other related information or metrics when
you evaluate the "goodness" of your models.
- Compare the classification accuracies/errors you obtained against those of
benchmarking techniques or previously studied techniques such as
ZeroR, OneR, ID3, J4.8, and ANNs, over the same (sub-)set
of data instances you used in each experiment.
Use the Experimenter in Weka to compare the performance
of these different techniques, with a statistical significance
threshold of p = 0.05.
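To make the significance threshold concrete, here is a minimal sketch of a plain textbook paired t-test over per-fold accuracies, written in Python with the standard library only. It is not the test Weka's Experimenter runs internally, so use the Experimenter for the numbers you report. The function names, the assumption of exactly 10 paired folds, and the accuracy values are all hypothetical and are there only to show how a 0.05 threshold turns into a yes/no decision.

```python
import math
from statistics import mean, stdev

# Two-tailed critical value of Student's t for alpha = 0.05 with 9 degrees of
# freedom (i.e. 10 paired folds), taken from a standard t-table.
T_CRIT_DF9_ALPHA05 = 2.262


def paired_t_statistic(acc_a, acc_b):
    """Paired t statistic for two equal-length lists of per-fold accuracies."""
    if len(acc_a) != len(acc_b):
        raise ValueError("both classifiers must be evaluated on the same folds")
    diffs = [a - b for a, b in zip(acc_a, acc_b)]
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    return mean(diffs) / (sd / math.sqrt(len(diffs)))


def significantly_different(acc_a, acc_b, t_crit=T_CRIT_DF9_ALPHA05):
    """True if |t| exceeds the critical value.

    The default critical value is only valid for 10 paired folds;
    pass a different t_crit for any other number of folds.
    """
    return abs(paired_t_statistic(acc_a, acc_b)) > t_crit


# Hypothetical per-fold accuracies (made-up numbers, NOT results from any dataset).
naive_bayes = [0.80, 0.82, 0.78, 0.81, 0.79, 0.83, 0.80, 0.82, 0.79, 0.81]
zero_r = [0.60, 0.61, 0.59, 0.60, 0.62, 0.60, 0.61, 0.59, 0.60, 0.61]

print("t =", paired_t_statistic(naive_bayes, zero_r))
print("significant at 0.05:", significantly_different(naive_bayes, zero_r))
```

Pairing the folds matters: both classifiers must be scored on the same train/test splits, which is also why the comparison should use the same (sub-)set of data instances for every technique.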
- Algorithm Options:
- Advanced Topic(s) (30 points):
Investigate in more depth (experimentally, theoretically, or both) a topic of your
choice that is related to Bayesian learning
and that is not already covered in this project.
The topic might be something that was described or
mentioned in the textbook or in class, that comes from your own research,
or that is related to your interests.