Exercise 4: Using Gain Ratio as a Splitting Criterion
The dataset:
Date  District  House Type  Income  Previous Customer  Outcome 
3/10/03  Suburban  Detached  High  No  Nothing 
14/9/03  Suburban  Detached  High  Responded  Nothing 
2/4/02  Rural  Detached  High  No  Responded 
18/1/03  Urban  Semidetached  High  No  Responded 
3/4/03  Urban  Semidetached  Low  No  Responded 
15/10/02  Urban  Semidetached  Low  Responded  Nothing 
15/10/02  Rural  Semidetached  Low  Responded  Responded 
2/3/01  Suburban  Terrace  High  No  Nothing 
4/5/03  Suburban  Semidetached  Low  No  Responded 
2/1/03  Urban  Terrace  Low  No  Responded 
3/10/03  Suburban  Terrace  Low  Responded  Responded 
3/10/03  Rural  Terrace  High  Responded  Responded 
8/4/03  Rural  Detached  Low  No  Responded 
6/5/02  Urban  Terrace  High  Responded  Nothing 
The Decision Tree: Interactively build it
 Click on the root node below and start building the tree.
 Non-leaf nodes can be "pruned" once they have been chosen (by clicking on the node and selecting "prune node completely").
 The ratios on the branches indicate how well the chosen attribute at a node splits the remaining data based on the target attribute (‘outcome’).
 Click on any node to highlight the rows in the data table that are covered by the rule leading down to that node.
 At each node, the entropy of the data at that point in the tree will be given.
 Information gain (entropy reduction) is specified for each attribute.

Here, the tree is built by splitting the data until the entropy at every leaf is zero.
When no more nodes can be expanded, the tree has classified all the training data. Notice that the Date attribute is calculated as having a high information gain.
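
To see where these figures come from, here is a minimal Python sketch (not the tutorial's own code) that recomputes the entropy of Outcome and the information gain of each attribute from the table above. The names DATA, entropy and information_gain are illustrative assumptions, and Date is treated as a plain categorical value.

```python
from collections import Counter
from math import log2

COLUMNS = ["Date", "District", "House Type", "Income", "Previous Customer"]

# The 14 rows of the table above; the last field of each row is the Outcome.
DATA = """\
3/10/03 Suburban Detached High No Nothing
14/9/03 Suburban Detached High Responded Nothing
2/4/02 Rural Detached High No Responded
18/1/03 Urban Semidetached High No Responded
3/4/03 Urban Semidetached Low No Responded
15/10/02 Urban Semidetached Low Responded Nothing
15/10/02 Rural Semidetached Low Responded Responded
2/3/01 Suburban Terrace High No Nothing
4/5/03 Suburban Semidetached Low No Responded
2/1/03 Urban Terrace Low No Responded
3/10/03 Suburban Terrace Low Responded Responded
3/10/03 Rural Terrace High Responded Responded
8/4/03 Rural Detached Low No Responded
6/5/02 Urban Terrace High Responded Nothing
"""
ROWS = [line.split() for line in DATA.strip().splitlines()]


def entropy(values):
    """Shannon entropy, in bits, of a list of labels."""
    total = len(values)
    return -sum(n / total * log2(n / total) for n in Counter(values).values())


def information_gain(rows, attr_index):
    """Entropy of Outcome minus the weighted entropy after splitting on one attribute."""
    groups = {}
    for row in rows:
        groups.setdefault(row[attr_index], []).append(row[-1])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy([row[-1] for row in rows]) - remainder


root_entropy = entropy([row[-1] for row in ROWS])
gains = {name: information_gain(ROWS, i) for i, name in enumerate(COLUMNS)}

print(f"entropy at root = {root_entropy:.3f}")
for name, gain in sorted(gains.items(), key=lambda kv: -kv[1]):
    print(f"{name:18s} gain = {gain:.3f}")
```

On this table Date comes out with the largest gain simply because almost every date is unique, so each branch covers only one or two rows.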
 The gain ratio of an attribute is now also shown each time a node is constructed, after the Information Gain value.
 See how the two differ and explore the types of trees that each produces.
 If we are to assume that the Date has no bearing on the Outcome, then which method produces the smaller trees?
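
The comparison can also be reproduced offline. The sketch below assumes the usual C4.5-style definition, gain ratio = information gain / split information, where the split information is the entropy of the attribute's own value distribution; whether the interactive tree uses exactly this formula is an assumption. The helper name gain_and_ratio is hypothetical.

```python
from collections import Counter
from math import log2

COLUMNS = ["Date", "District", "House Type", "Income", "Previous Customer"]

# The 14 rows of the table above; the last field of each row is the Outcome.
DATA = """\
3/10/03 Suburban Detached High No Nothing
14/9/03 Suburban Detached High Responded Nothing
2/4/02 Rural Detached High No Responded
18/1/03 Urban Semidetached High No Responded
3/4/03 Urban Semidetached Low No Responded
15/10/02 Urban Semidetached Low Responded Nothing
15/10/02 Rural Semidetached Low Responded Responded
2/3/01 Suburban Terrace High No Nothing
4/5/03 Suburban Semidetached Low No Responded
2/1/03 Urban Terrace Low No Responded
3/10/03 Suburban Terrace Low Responded Responded
3/10/03 Rural Terrace High Responded Responded
8/4/03 Rural Detached Low No Responded
6/5/02 Urban Terrace High Responded Nothing
"""
ROWS = [line.split() for line in DATA.strip().splitlines()]


def entropy(values):
    """Shannon entropy, in bits, of a list of labels."""
    total = len(values)
    return -sum(n / total * log2(n / total) for n in Counter(values).values())


def gain_and_ratio(rows, attr_index):
    """Return (information gain, gain ratio) for splitting rows on one attribute."""
    groups = {}
    for row in rows:
        groups.setdefault(row[attr_index], []).append(row[-1])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    gain = entropy([row[-1] for row in rows]) - remainder
    # Split information: entropy of the attribute values themselves.
    split_info = entropy([row[attr_index] for row in rows])
    ratio = gain / split_info if split_info > 0 else 0.0
    return gain, ratio


results = {name: gain_and_ratio(ROWS, i) for i, name in enumerate(COLUMNS)}
for name, (gain, ratio) in results.items():
    print(f"{name:18s} gain = {gain:.3f}  gain ratio = {ratio:.3f}")
```

Dividing by the split information penalises attributes with many distinct values, so the ratio for Date is far lower than its raw gain.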
First off: Thanks a lot for this tutorial…
But I think there is a mistake in the example above:
When calculating the intrinsic split info for the attribute Previous Customer, you seem to find exactly 1 (since the gain ratio is equal to the gain)… But that doesn't seem possible: -(8/14*log2(8/14) + 6/14*log2(6/14)) != 1… Even allowing for rounding, the value of the Gain Ratio for 'Previous Customer' should be 0.049, not 0.048…
Also, the definition of VI in the previous page is rather obscure… shouldn't it simply be: -sum(pi * log2(pi))?
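
The arithmetic in this comment can be checked with a few lines of Python. This is only a sketch: the counts are read off the table above (8 rows with Previous Customer = No, of which 6 Responded and 2 Nothing; 6 rows with Previous Customer = Responded, of which 3 Responded and 3 Nothing), and the helper name bits is hypothetical.

```python
from math import log2


def bits(*counts):
    """Entropy, in bits, of a distribution given as raw counts."""
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts if c)


# Split info for Previous Customer: 8 rows "No", 6 rows "Responded".
split_info = bits(8, 6)

# Information gain: entropy of Outcome (9 Responded, 5 Nothing)
# minus the weighted entropy of the two branches.
gain = bits(9, 5) - (8 / 14 * bits(6, 2) + 6 / 14 * bits(3, 3))

print(round(split_info, 3), round(gain, 3), round(gain / split_info, 3))
# prints: 0.985 0.048 0.049
```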