Worcester Polytechnic Institute (WPI)

Computer Science Department
------------------------------------------

CS539 Machine Learning - Spring 2005 
Project 5 - Bayesian Learning

PROF. CAROLINA RUIZ 

Due Date: Tuesday, March 15, 2005. Slides are due at 3:00 pm and the written report is due at 4:00 pm.
------------------------------------------


PROJECT DESCRIPTION

Experiment with Naive Bayes and Bayesian Net classifiers for each of the following problems:

  1. Predicting the class attribute in the Covertype data available at the UCI Machine Learning Repository.

  2. Predicting whether the income of a given person is >50K or <=50K using the census-income dataset from the US Census Bureau, which is available at the Univ. of California Irvine Repository.
    The census-income dataset contains census information for 48,842 people. It has 14 attributes for each person (age, workclass, fnlwgt, education, education-num, marital-status, occupation, relationship, race, sex, capital-gain, capital-loss, hours-per-week, and native-country) and a binary class attribute that assigns the income of the person to one of two categories: >50K or <=50K.
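
To make the prediction task concrete, here is a minimal sketch of a Naive Bayes classifier over nominal attributes, written in plain Python rather than Weka. It is an illustration under stated assumptions, not the implementation expected for the project: the six training rows are hypothetical toy examples (not records from the census-income dataset), only two of the 14 attributes are used, and add-one (Laplace) smoothing is an assumed choice. The names train_naive_bayes and predict are made up for this sketch.

```python
from collections import Counter, defaultdict
import math


def train_naive_bayes(rows, labels):
    """Count class frequencies and per-attribute value frequencies."""
    class_counts = Counter(labels)
    # value_counts[(attribute_index, class_label)][value] -> count
    value_counts = defaultdict(Counter)
    # Distinct values seen per attribute, needed for Laplace smoothing.
    attribute_values = defaultdict(set)
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
            attribute_values[i].add(value)
    return class_counts, value_counts, attribute_values


def predict(model, row):
    """Return the class maximizing log P(class) + sum_i log P(value_i | class)."""
    class_counts, value_counts, attribute_values = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)  # log prior
        for i, value in enumerate(row):
            # Add-one smoothing (an assumption of this sketch) avoids zero
            # probabilities for values never seen with this class.
            numerator = value_counts[(i, label)][value] + 1
            denominator = count + len(attribute_values[i])
            score += math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data: (education, occupation) -> income class.
rows = [
    ("Bachelors", "Exec-managerial"),
    ("Bachelors", "Exec-managerial"),
    ("HS-grad", "Exec-managerial"),
    ("HS-grad", "Handlers-cleaners"),
    ("HS-grad", "Handlers-cleaners"),
    ("Bachelors", "Handlers-cleaners"),
]
labels = [">50K", ">50K", ">50K", "<=50K", "<=50K", "<=50K"]

model = train_naive_bayes(rows, labels)
print(predict(model, ("Bachelors", "Exec-managerial")))
print(predict(model, ("HS-grad", "Handlers-cleaners")))
```

The sketch handles nominal attributes only. Numeric attributes such as age or hours-per-week would first need to be discretized or modeled with a density estimate, which is left out here.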

PROJECT ASSIGNMENT

  1. Read Sections 6.1, 6.2, 6.7, 6.8, 6.9, 6.10, 6.11, 6.12, 6.13 of your textbook in great detail.

  2. Read the NaiveBayes and the BayesNets code in the Weka system.

  3. The following are guidelines for the construction of your Naive Bayes and Bayesian Net Classifiers:


REPORT AND DUE DATE