Worcester Polytechnic Institute (WPI)

Computer Science Department
------------------------------------------

CS539 Machine Learning - Spring 2009 
Project 2 - Decision Trees

PROF. CAROLINA RUIZ 

Due Date: Tuesday, Feb. 10th 2009. Slides are due at 2:00 pm (by email) and the written report is due at 3:30 pm (beginning of class).
------------------------------------------

  1. Read Chapter 3 of the textbook about decision trees in great detail.

  2. Homework Assignment:

    1. Calculate Gain(S,A1) and Gain(S,A2) for the dataset S and the attributes A1 and A2 on Slide 8 of the textbook slides (Chapter 3). Show each step of the calculation. Include your solution in your written report (and not in your oral report).

    2. Consider the Gain(S,A) formula (Equation 3.4, p. 58 of your textbook). Is it the case that for any dataset S and for any attribute A in dataset S, Gain(S,A) ≥ 0? If your answer is yes, provide a detailed proof. If your answer is no, provide a dataset S and an attribute A in that dataset such that Gain(S,A) < 0. Include your solution in your written report (and not in your oral report).
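
    For reference, the Gain(S,A) formula used in both problems is Entropy(S) minus the sum, over each value v of A, of (|S_v| / |S|) * Entropy(S_v). The following is a minimal Python sketch of that formula, offered only as a way to check hand calculations. The function names, the dictionary-based row layout, and the four-row toy dataset are assumptions made for illustration; the toy dataset is not the Slide 8 dataset, and the sketch does not answer problem 2.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def gain(rows, attribute, target):
    """Gain(S, A): Entropy(S) minus the weighted entropy of the subsets S_v."""
    labels = [row[target] for row in rows]
    # Group the target labels by the value each row takes on `attribute`.
    subsets = {}
    for row in rows:
        subsets.setdefault(row[attribute], []).append(row[target])
    remainder = sum(len(s) / len(rows) * entropy(s) for s in subsets.values())
    return entropy(labels) - remainder


# Hypothetical toy dataset (NOT the one on Slide 8): attribute "A" separates
# the two classes perfectly, while attribute "B" carries no information.
toy = [
    {"A": "t", "B": "t", "label": "+"},
    {"A": "t", "B": "f", "label": "+"},
    {"A": "f", "B": "t", "label": "-"},
    {"A": "f", "B": "f", "label": "-"},
]

gain_a = gain(toy, "A", "label")
gain_b = gain(toy, "B", "label")
print(gain_a, gain_b)
```

    To check a hand calculation, replace the toy rows with the dataset in question and compare each intermediate entropy value against the written steps.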

  3. Project Assignment: THOROUGHLY READ AND FOLLOW THE PROJECT GUIDELINES. These guidelines contain detailed information about how to structure your project, and how to prepare your written and oral reports.