WPI Worcester Polytechnic Institute

Computer Science Department
------------------------------------------

CS 548 KNOWLEDGE DISCOVERY AND DATA MINING - Fall 2019  
Project 2: Decision Trees, Linear Regression, Model Trees, Regression Trees

PROF. CAROLINA RUIZ 

DUE DATE: Thursday October 10th, 2019 at 2:00 pm.
------------------------------------------

Project Assignment:

  1. Group Project: This is a group project. Students should work in groups of 2. Please do not split the project so that each student does only a portion of the work. Instead, each student is expected to work on the entire project individually and then meet with the group to clarify doubts, share findings, and combine the project solutions into one group report. Submit just one written report. Help or assistance from other groups, other people, or online resources is NOT allowed.

  2. Study Chapter 3, Sections 10.1, 10.3 and Appendix D (online) of the textbook in great detail.

  3. Study decision tree pruning using the Weka book: Section 6.1.

  4. Study linear regression, model trees and regression trees using the Weka book: Sections 3.2, 3.3, 4.6 and

  5. Study all the materials posted on the course Lecture Notes. In particular, you should know the algorithms to construct decision trees, regression trees, and model trees very well, and be able to use these algorithms to construct trees from data by hand during the test. See the examples provided in the Lecture Notes. (Note: for model and regression trees, a software tool will be used to obtain the necessary linear regressions.)

  6. THOROUGHLY READ AND FOLLOW THE PROJECT GUIDELINES. These guidelines contain detailed information about how to structure your project, how to prepare your written summary, and how to study for the test.

    *** You must use the Project 2 Template provided here for your written report. Do NOT change the structure of the report, do NOT exceed the page limits stated in the template, and do NOT decrease the font size ***. (If you prefer not to use Word, you can copy and paste this format in a different editor as long as you respect the stated page structure and page limit.)
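
As a study aid for item 5, the following is a minimal sketch of the attribute-selection step of decision tree construction, which you can use to check hand calculations. It assumes entropy-based information gain as the splitting criterion, which may differ from the criterion used in the Lecture Notes. The function names (`entropy`, `information_gain`) and the toy data are hypothetical and were invented only for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy of the target minus the weighted entropy after splitting on attribute."""
    parent = entropy([r[target] for r in rows])
    # Group the target labels by the value each row takes on the attribute.
    partitions = {}
    for r in rows:
        partitions.setdefault(r[attribute], []).append(r[target])
    remainder = sum(len(p) / len(rows) * entropy(p) for p in partitions.values())
    return parent - remainder


# Hypothetical toy data (assumption: not taken from the course materials).
rows = [
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "yes"},
]

# The attribute with the highest gain would be chosen for the root split.
best = max(["outlook", "windy"], key=lambda a: information_gain(rows, a, "play"))
```

In this toy data, `outlook` separates the classes perfectly while `windy` leaves each partition evenly mixed, so `outlook` is selected. Repeating the same computation on each resulting partition gives the recursive tree-building procedure.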