*** You must use the Project 2 Template provided for your written report. Do not exceed the page limits stated in the template or decrease the font size ***. (If you prefer not to use Word, you can copy and paste this format into a different editor as long as you respect the stated page structure and page limit.)
Additional information about this dataset from the authors (Prof. I-Cheng Yeh, Department of Civil Engineering, Tamkang University): This research employed a binary variable, default payment (Yes = 1, No = 0), as the response variable. This study reviewed the literature and used the following 23 variables as explanatory variables:
- X1: Amount of the given credit (NT dollar): it includes both the individual consumer credit and his/her family (supplementary) credit.
- X2: Gender (1 = male; 2 = female).
- X3: Education (1 = graduate school; 2 = university; 3 = high school; 0, 4, 5, 6 = others).
- X4: Marital status (1 = married; 2 = single; 3 = divorced; 0 = others).
- X5: Age (year).
- X6-X11: History of past payment. We tracked the past monthly payment records (from April to September, 2005) as follows: X6 = the repayment status in September, 2005; X7 = the repayment status in August, 2005; . . .; X11 = the repayment status in April, 2005. The measurement scale for the repayment status is: -2 = no consumption; -1 = paid in full; 0 = use of revolving credit; 1 = payment delay for one month; 2 = payment delay for two months; . . .; 8 = payment delay for eight months; 9 = payment delay for nine months and above.
- X12-X17: Amount of bill statement (NT dollar). X12 = amount of bill statement in September, 2005; X13 = amount of bill statement in August, 2005; . . .; X17 = amount of bill statement in April, 2005.
- X18-X23: Amount of previous payment (NT dollar). X18 = amount paid in September, 2005; X19 = amount paid in August, 2005; . . .;X23 = amount paid in April, 2005.
- Y: Client's behavior (Y = 0: no default; Y = 1: default).
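When inspecting records or reading tree splits on X6-X11, it can help to translate the numeric repayment codes back into the scale given above. The following is a minimal sketch of such a lookup; the function name `describe_repayment_status` and the label wording are illustrative assumptions, not part of the dataset or the template.

```python
# Labels follow the repayment-status scale listed above for X6-X11.
# The dictionary and function names are hypothetical helpers for illustration.
REPAYMENT_STATUS_LABELS = {
    -2: "no consumption",
    -1: "paid in full",
    0: "use of revolving credit",
}


def describe_repayment_status(code: int) -> str:
    """Return a readable label for one X6-X11 repayment-status code."""
    if code in REPAYMENT_STATUS_LABELS:
        return REPAYMENT_STATUS_LABELS[code]
    if 1 <= code <= 8:
        unit = "month" if code == 1 else "months"
        return f"payment delay for {code} {unit}"
    if code == 9:
        return "payment delay for nine months and above"
    # Anything outside -2..9 is not described by the scale above.
    raise ValueError(f"unexpected repayment status code: {code}")


for code in (-2, 0, 2):
    print(code, "->", describe_repayment_status(code))
```

Raising an error on codes outside the documented range is a design choice in this sketch; it makes unexpected values visible during pre-processing instead of silently passing them on.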
Use the following attributes as continuous:
X1, X5, X12, X13, X14, X15, X16, X17, X18, X19, X20, X21, X22, X23.
Run some experiments using the following attributes as nominal and some other experiments using them as continuous, and compare the results:
X2, X3, X4, X6, X7, X8, X9, X10, X11.
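One way to set up the nominal-versus-continuous comparison is to build two feature representations from the same records. The sketch below keeps the always-continuous attributes as numbers and either one-hot encodes the second attribute list (nominal) or passes it through as numbers (continuous). One-hot encoding is only an assumption here: many tree learners accept nominal attributes directly, in which case declaring the attribute type is enough. The names `encode_row` and `observed_categories` and the two sample records are made up for illustration.

```python
# Attribute groups as listed in the assignment text.
CONTINUOUS = ["X1", "X5"] + [f"X{i}" for i in range(12, 24)]
NOMINAL_OR_CONTINUOUS = ["X2", "X3", "X4"] + [f"X{i}" for i in range(6, 12)]


def observed_categories(rows):
    """Collect the distinct values seen for each switchable attribute."""
    return {
        name: sorted({row[name] for row in rows})
        for name in NOMINAL_OR_CONTINUOUS
    }


def encode_row(row, categories, as_nominal):
    """Turn one record (dict of attribute -> number) into a feature dict.

    With as_nominal=True each switchable attribute becomes one 0/1 feature
    per observed value; otherwise it is kept as a single numeric feature.
    """
    features = {name: float(row[name]) for name in CONTINUOUS}
    for name in NOMINAL_OR_CONTINUOUS:
        if as_nominal:
            for value in categories[name]:
                features[f"{name}={value}"] = 1.0 if row[name] == value else 0.0
        else:
            features[name] = float(row[name])
    return features


# Two made-up records, for illustration only (not taken from the dataset).
base = {f"X{i}": 0 for i in range(1, 24)}
rows = [dict(base, X2=1, X3=2), dict(base, X2=2, X3=1)]

cats = observed_categories(rows)
nominal = encode_row(rows[0], cats, as_nominal=True)
numeric = encode_row(rows[0], cats, as_nominal=False)
print(len(nominal), "features as nominal;", len(numeric), "as continuous")
```

Running the same learner on both representations, with everything else held fixed, isolates the effect of the attribute type on the resulting models.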
Run experiments with and without discretizing the predicting attributes; with attributes removed that are too closely related to the target or that make the trees too large; and with any other pre-processing and post-processing that produces useful and meaningful models.
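As one possible form of discretization, a continuous attribute can be split into a fixed number of equal-width intervals. The sketch below is an assumption about how this might be done, not a required method; equal-frequency or supervised binning are equally valid choices, and the function name `equal_width_bins`, the bin count, and the sample ages are all made up for illustration.

```python
def equal_width_bins(values, n_bins):
    """Map each numeric value to a bin index in 0..n_bins-1.

    Bins are equal-width intervals spanning min(values) to max(values).
    """
    if n_bins < 1:
        raise ValueError("n_bins must be at least 1")
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so a single bin is enough.
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # The maximum value would fall just past the last bin, so clamp it.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


ages = [21, 25, 30, 41, 58, 60]  # made-up X5 (age) values
print(equal_width_bins(ages, 3))  # prints [0, 0, 0, 1, 2, 2]
```

Comparing models built on the binned attribute against models built on the raw values gives the "with and without discretizing" pair of experiments.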