Specific Guidelines for the Experiments and Written Report
Your written report should follow the structure described below:
- Page 1 should list the names of all team members
and describe who did what on this project.
- [10 points] Algorithms and Code:
Page 2 should provide algorithmic descriptions of the data mining
techniques and of any advanced techniques that you used
in the project.
Read the Weka code that implements the techniques.
In your written report, describe the algorithm underlying the code
IN YOUR OWN WORDS.
Briefly explain the algorithm in terms of the inputs it receives
and the outputs it produces, AND the main steps it follows to construct
the model. Make sure that the algorithms you describe are the ones
implemented in the Weka code (which are not necessarily the same as the ones
described in the textbook or in class).
- [60 points] Experiments:
The next pages should contain descriptions of your experiments
and answers to the provided challenges.
Use tables to summarize the data, pre-processing, settings, postprocessing,
and results of the experiments you run.
You should run a sufficient number of coherent experiments
(that is, the results of an experiment should motivate the next experiment(s)).
Each row in the table corresponds to an experiment, and the columns are
as follows:
- Data:
What data did you use to construct and test your model?
- Pre-Processing:
What pre-processing was done to the data
in order to improve the model's performance?
- Post-Processing:
What post-processing was done to the
model in order to improve the model's performance?
- Experimental Protocol:
Describe the protocol only if it differs from 10-fold cross-validation.
- Resulting model:
Describe the size and readability of the resulting model.
Include any relevant observations about the model.
- Performance of the resulting model:
- State what the performance of the model is.
- If applicable,
elaborate on the confusion matrix and/or other relevant
performance indicators.
- How long did it take Weka to construct this model?
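As an illustration of what "elaborating on the confusion matrix" can involve, the sketch below derives accuracy and per-class precision and recall from a confusion matrix. It assumes the layout in which rows are actual classes and columns are predicted classes; check this against the output you actually get. The function name `summarize_confusion_matrix` and the example counts are hypothetical, made up for illustration, and are not part of Weka or of any dataset used in this project.

```python
def summarize_confusion_matrix(matrix, labels):
    """Summarize a confusion matrix.

    Assumption: matrix[i][j] is the number of instances whose actual
    class is labels[i] and whose predicted class is labels[j].
    """
    total = sum(sum(row) for row in matrix)
    # Correct predictions lie on the main diagonal.
    correct = sum(matrix[i][i] for i in range(len(labels)))
    summary = {
        "accuracy": correct / total if total else 0.0,
        "per_class": {},
    }
    for k, label in enumerate(labels):
        true_positives = matrix[k][k]
        predicted_as_k = sum(row[k] for row in matrix)  # column sum
        actually_k = sum(matrix[k])                     # row sum
        precision = true_positives / predicted_as_k if predicted_as_k else 0.0
        recall = true_positives / actually_k if actually_k else 0.0
        summary["per_class"][label] = {
            "precision": precision,
            "recall": recall,
        }
    return summary


# Hypothetical two-class counts, invented for this example only.
example_matrix = [
    [50, 10],  # actual "yes": 50 predicted "yes", 10 predicted "no"
    [5, 35],   # actual "no":   5 predicted "yes", 35 predicted "no"
]
report = summarize_confusion_matrix(example_matrix, ["yes", "no"])

print(f"accuracy: {report['accuracy']:.3f}")
for label, scores in report["per_class"].items():
    print(f"{label}: precision={scores['precision']:.3f} "
          f"recall={scores['recall']:.3f}")
```

Reporting per-class figures alongside overall accuracy helps show whether a model performs well on every class or mainly on the majority class.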
- [20 points] Best Performing Model.
The next pages should provide a detailed description of the experiment
that yielded the best performing model you were able to construct in this project.
Elaborate on the aspects listed above as columns in the experiments table.
That is, include detailed descriptions of:
- pre-processing done,
- advanced techniques used,
- clever ideas used during the construction of the model.
Provide both motivation and justification for running this experiment in the
way that you did (an important aspect of this part of the report is story-telling).
Provide any useful graphs and visualizations.
- [10 points] Conclusions and Lessons Learned.
The last page of your report should include a discussion of:
- How the performance of the experiments in this project
compares with that of your best performing results from
previous projects on the same dataset.
- How well the particular data mining methods used in this
project worked on this dataset.
- What combination of pre-processing, parameter values, and/or
post-processing yielded particularly good results?
- What lessons you learned in this project and how you would use
these lessons on the next project.
- Strengths and weaknesses of your project.