Tutorial (9): Exercise 4
Exercise 4: Using Gain Ratio as a Splitting Criterion
| Date | District | House Type | Income | Previous Customer | Outcome |
|---|---|---|---|---|---|
| 3/10/03 | Suburban | Detached | High | No | Nothing |
| 14/9/03 | Suburban | Detached | High | Responded | Nothing |
| 2/4/02 | Rural | Detached | High | No | Responded |
| 18/1/03 | Urban | Semi-detached | High | No | Responded |
| 3/4/03 | Urban | Semi-detached | Low | No | Responded |
| 15/10/02 | Urban | Semi-detached | Low | Responded | Nothing |
| 15/10/02 | Rural | Semi-detached | Low | Responded | Responded |
| 2/3/01 | Suburban | Terrace | High | No | Nothing |
| 4/5/03 | Suburban | Semi-detached | Low | No | Responded |
| 2/1/03 | Urban | Terrace | Low | No | Responded |
| 3/10/03 | Suburban | Terrace | Low | Responded | Responded |
| 3/10/03 | Rural | Terrace | High | Responded | Responded |
| 8/4/03 | Rural | Detached | Low | No | Responded |
| 6/5/02 | Urban | Terrace | High | Responded | Nothing |

- Click on the root node below and start building the tree.
- Non-leaf nodes can be "pruned" once they have been chosen (by clicking on the node and selecting "prune node completely").
- The ratios on the branches indicate how well the chosen attribute at a node splits the remaining data with respect to the target attribute ('Outcome').
- Click on any node to highlight the rows in the data table that are covered by the rule leading down to that node.
- At each node, the entropy of the data at that point in the tree will be given.
- Information gain (entropy reduction) is specified for each attribute.
- One way to build the decision tree here is to keep splitting until the entropy at every leaf is zero. When no more nodes can be expanded, the tree has classified all the training data.
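- Notice that the Date attribute is calculated as having a high information gain. Date takes many distinct values, so it splits the rows into small, mostly pure subsets, which inflates its information gain regardless of how useful it is for prediction.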
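
The quantities shown at each node can also be reproduced by hand. The sketch below is a minimal illustration, not the tutorial's own implementation: it assumes the usual definitions (entropy in bits, information gain as the reduction in entropy after a split, and gain ratio as information gain divided by the split information of the attribute, as in C4.5). The function names `entropy`, `information_gain`, `split_information` and `gain_ratio` are hypothetical helpers introduced here for illustration; the rows are transcribed from the table above.

```python
from collections import Counter
from math import log2

# Training data transcribed from the table in this exercise.
COLUMNS = ["Date", "District", "House Type", "Income",
           "Previous Customer", "Outcome"]
ROWS = [
    ("3/10/03", "Suburban", "Detached", "High", "No", "Nothing"),
    ("14/9/03", "Suburban", "Detached", "High", "Responded", "Nothing"),
    ("2/4/02", "Rural", "Detached", "High", "No", "Responded"),
    ("18/1/03", "Urban", "Semi-detached", "High", "No", "Responded"),
    ("3/4/03", "Urban", "Semi-detached", "Low", "No", "Responded"),
    ("15/10/02", "Urban", "Semi-detached", "Low", "Responded", "Nothing"),
    ("15/10/02", "Rural", "Semi-detached", "Low", "Responded", "Responded"),
    ("2/3/01", "Suburban", "Terrace", "High", "No", "Nothing"),
    ("4/5/03", "Suburban", "Semi-detached", "Low", "No", "Responded"),
    ("2/1/03", "Urban", "Terrace", "Low", "No", "Responded"),
    ("3/10/03", "Suburban", "Terrace", "Low", "Responded", "Responded"),
    ("3/10/03", "Rural", "Terrace", "High", "Responded", "Responded"),
    ("8/4/03", "Rural", "Detached", "Low", "No", "Responded"),
    ("6/5/02", "Urban", "Terrace", "High", "Responded", "Nothing"),
]


def column(name):
    """Return all values of one named column, in row order."""
    i = COLUMNS.index(name)
    return [row[i] for row in ROWS]


def entropy(labels):
    """Shannon entropy (in bits) of a list of labels."""
    total = len(labels)
    return sum(-(n / total) * log2(n / total)
               for n in Counter(labels).values())


def information_gain(attribute, target="Outcome"):
    """Entropy of the target minus the weighted entropy after splitting."""
    values, labels = column(attribute), column(target)
    remainder = 0.0
    for v in set(values):
        subset = [lab for val, lab in zip(values, labels) if val == v]
        remainder += len(subset) / len(labels) * entropy(subset)
    return entropy(labels) - remainder


def split_information(attribute):
    """Entropy of the attribute's own value distribution."""
    return entropy(column(attribute))


def gain_ratio(attribute, target="Outcome"):
    """Information gain normalised by split information (assumed C4.5 form)."""
    si = split_information(attribute)
    return information_gain(attribute, target) / si if si > 0 else 0.0


if __name__ == "__main__":
    print(f"Entropy of Outcome: {entropy(column('Outcome')):.3f}")
    for name in COLUMNS[:-1]:
        print(f"{name:18s} gain={information_gain(name):.3f} "
              f"split_info={split_information(name):.3f} "
              f"gain_ratio={gain_ratio(name):.3f}")
```

With 9 "Responded" and 5 "Nothing" rows, the entropy at the root comes out at roughly 0.940 bits. Running the script prints one line per attribute, so the information gain and gain ratio of Date can be compared directly with those of the other attributes.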