Tutorial (7): Exercise 3
Limitations of Information Gain
The dataset:
| Date | District | House Type | Income | Previous Customer | Outcome |
| 3/10/03 | Suburban | Detached | High | No | Nothing |
| 14/9/03 | Suburban | Detached | High | Responded | Nothing |
| 2/4/02 | Rural | Detached | High | No | Responded |
| 18/1/03 | Urban | Semi-detached | High | No | Responded |
| 3/4/03 | Urban | Semi-detached | Low | No | Responded |
| 15/10/02 | Urban | Semi-detached | Low | Responded | Nothing |
| 15/10/02 | Rural | Semi-detached | Low | Responded | Responded |
| 2/3/01 | Suburban | Terrace | High | No | Nothing |
| 4/5/03 | Suburban | Semi-detached | Low | No | Responded |
| 2/1/03 | Urban | Terrace | Low | No | Responded |
| 3/10/03 | Suburban | Terrace | Low | Responded | Responded |
| 3/10/03 | Rural | Terrace | High | Responded | Responded |
| 8/4/03 | Rural | Detached | Low | No | Responded |
| 6/5/02 | Urban | Terrace | High | Responded | Nothing |
The Decision Tree: Interactively build it
- Click on the root node below and start building the tree.
- Non-leaf nodes can be "pruned" once they have been chosen (by clicking on the node and selecting "prune node completely").
- The ratios on the branches indicate how well the chosen attribute at a node splits the remaining data based on the target attribute ('outcome').
- Click on any node to highlight the rows in the data table that are covered by the rule leading down to that node.
- At each node, the entropy of the data at that point in the tree will be given.
- Information gain (entropy reduction) is specified for each attribute.
- Reducing entropy to zero is a way of building a decision tree here. When no more nodes can be expanded, the tree has classified all the training data.
- Notice that the date attribute is calculated as having a high information gain.
- This would be used as the root node in algorithms such as ID3. It splits the data effectively, but is it a good classifier? What would happen if we tried to use such a tree for prediction?
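The entropy and information gain figures shown at each node can be reproduced by hand. The following is a minimal Python sketch, not the simulator's own code: it assumes the standard definitions (entropy in bits, gain as the entropy before a split minus the weighted entropy after it), and the names `ROWS`, `entropy` and `information_gain` are made up for this illustration.

```python
from collections import Counter
from math import log2

COLUMNS = ["Date", "District", "House Type", "Income",
           "Previous Customer", "Outcome"]

# The 14 training rows, copied from the data table above.
ROWS = [
    ("3/10/03", "Suburban", "Detached", "High", "No", "Nothing"),
    ("14/9/03", "Suburban", "Detached", "High", "Responded", "Nothing"),
    ("2/4/02", "Rural", "Detached", "High", "No", "Responded"),
    ("18/1/03", "Urban", "Semi-detached", "High", "No", "Responded"),
    ("3/4/03", "Urban", "Semi-detached", "Low", "No", "Responded"),
    ("15/10/02", "Urban", "Semi-detached", "Low", "Responded", "Nothing"),
    ("15/10/02", "Rural", "Semi-detached", "Low", "Responded", "Responded"),
    ("2/3/01", "Suburban", "Terrace", "High", "No", "Nothing"),
    ("4/5/03", "Suburban", "Semi-detached", "Low", "No", "Responded"),
    ("2/1/03", "Urban", "Terrace", "Low", "No", "Responded"),
    ("3/10/03", "Suburban", "Terrace", "Low", "Responded", "Responded"),
    ("3/10/03", "Rural", "Terrace", "High", "Responded", "Responded"),
    ("8/4/03", "Rural", "Detached", "Low", "No", "Responded"),
    ("6/5/02", "Urban", "Terrace", "High", "Responded", "Nothing"),
]


def entropy(labels):
    """Entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, attr_index, target_index=-1):
    """Entropy of the target minus its weighted entropy after splitting."""
    labels = [row[target_index] for row in rows]
    # Group the target labels by the value of the chosen attribute.
    groups = {}
    for row in rows:
        groups.setdefault(row[attr_index], []).append(row[target_index])
    remainder = sum(len(g) / len(rows) * entropy(g)
                    for g in groups.values())
    return entropy(labels) - remainder


# Gain of every attribute at the root node (Outcome is the target).
gains = {name: information_gain(ROWS, i)
         for i, name in enumerate(COLUMNS[:-1])}

for name, gain in sorted(gains.items(), key=lambda kv: -kv[1]):
    print(f"{name:18s} {gain:.3f}")

# How many different values does Date take across the 14 rows?
distinct_dates = len({row[0] for row in ROWS})
print("distinct dates:", distinct_dates)
```

On this table, Date comes out on top at the root (about 0.60 bits, against about 0.25 for District), while taking 11 distinct values across only 14 rows. Most Date branches therefore hold a single row, which is worth keeping in mind when answering the question above.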
| root node |