DUE DATE: Tuesday, April 28th, 2015.
- Slides: Submit by email to cs548-staff mailing list by 3:00 pm.
- Written report: Hand in a hardcopy at the beginning of class, by 5:59 pm.
Project Description
- Project Instructions:
- Work on this project individually. That is, no group work is allowed for
this project.
- THOROUGHLY READ AND FOLLOW THE
PROJECT GUIDELINES.
These guidelines contain detailed information about how to structure your
project, and how to prepare your written and oral reports.
- Your class presentation must be at most 5 minutes long.
- Data Mining Technique(s):
Choose ONE of the following topics for your project:
- anomaly detection
- web mining
- text mining
- sequence mining
- multimedia data mining
For your chosen topic, you must use data mining techniques studied in
this course to address that topic. For example, neural networks are not
allowed because we did not study them in this course.
- Dataset:
Choose a dataset appropriate for the data mining topic that you selected for this project and related to your own interests.
This dataset should contain enough instances and attributes to provide sufficient data for fruitful and interesting experiments.
Here are some possibilities:
- A dataset you are working with for your research or your job.
- A dataset from a data repository listed in
the online resources,
or from another online data repository.
- Another data source of your choice.
- Performance Metric(s):
Use performance metrics appropriate to the mining application that you
chose. If you are not aware of any,
propose a variety of approaches to measure how good the results
of your experiments are.
Consider using visualization of the constructed model or patterns
to evaluate your results.
The more creative/ingenious your approaches, the better.
You might want to extend the Weka code to provide the
evaluation/interpretation functionality you need.
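As one illustration of a performance metric, the sketch below computes precision, recall, and F1 for an anomaly detection experiment in which ground-truth labels happen to be available. The function name, the toy labels, and the convention that 1 marks an anomaly are all assumptions made for this example; they are not part of the project requirements, and other topics (or unlabeled data) will call for different metrics.

```python
def precision_recall_f1(true_labels, predicted_labels, positive=1):
    """Return (precision, recall, F1) for the class marked `positive`."""
    pairs = list(zip(true_labels, predicted_labels))
    # Count true positives, false positives, and false negatives.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against division by zero when a denominator is empty.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return precision, recall, f1


# Hypothetical toy labels (assumption): 1 = anomaly, 0 = normal.
true_labels = [0, 0, 1, 0, 1, 0, 0, 1, 0, 0]
predicted_labels = [0, 1, 1, 0, 0, 0, 0, 1, 0, 0]

precision, recall, f1 = precision_recall_f1(true_labels, predicted_labels)
print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
```

Because anomalies are usually rare, plain accuracy can look high even when every anomaly is missed, which is why this sketch reports precision and recall on the anomaly class instead.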
- General Comments
- You can run your experiments in Weka and/or in Python (ideally in both).
- Remember that you can only use data mining techniques studied in class
(in case of doubt, ask).
- Focus on experimenting with different ways of preprocessing
the data, adapting different techniques studied in this course
to tackle the problem at hand, and investigating on your own
other existing approaches.
The more creative/ingenious your work and/or the more research
into the related literature you do, the better.
- Extra credit will be given for particularly
creative and/or high-quality work, and/or for independently
researching the chosen data mining topic beyond what was covered in class.
- Project 5 Report Template