Gain Ratio In Decision Tree
Table Of Contents:
- What Is Gain Ratio In Decision Tree?
- Example Of Gain Ratio
- Interpreting Split Information
- What Is The Range Of Gain Ratio?
- What We Want?
- Balanced, Unbalanced & Moderate Split
- Which Split Information Is Better: Balanced, Unbalanced & Moderate Split
- How Does Gain Ratio Penalize Lower Split Information?
- Gain Ratio Of An ID Attribute Having 10 Values
- Advantages Of Gain Ratio
- Disadvantages Of Gain Ratio
(1) What Is Gain Ratio In Decision Tree?
- In decision tree learning, the Gain Ratio is a refinement of Information Gain used to evaluate candidate splits.
- Information Gain measures how well a feature separates the classes, but it tends to favor features with many distinct values (such as a unique identifier).
- The Gain Ratio corrects this bias by dividing Information Gain by the Split Information of the feature, so splits that scatter the data across many branches are penalized.
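The idea can be sketched in a few lines of Python. This is a minimal illustration, assuming the usual C4.5-style definition (Gain Ratio = Information Gain / Split Information, entropies in bits). The function names and the toy data (`labels`, `record_id`, `weather`) are hypothetical and chosen only for the example.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((n / total) * log2(n / total) for n in Counter(values).values())


def information_gain(feature, labels):
    """Parent entropy minus the size-weighted entropy of each branch."""
    total = len(labels)
    branches = {}
    for f, y in zip(feature, labels):
        branches.setdefault(f, []).append(y)
    remainder = sum(len(b) / total * entropy(b) for b in branches.values())
    return entropy(labels) - remainder


def split_information(feature):
    """Entropy of the branch sizes produced by splitting on the feature."""
    return entropy(feature)


def gain_ratio(feature, labels):
    """Information Gain normalised by Split Information (0.0 if no split)."""
    si = split_information(feature)
    return information_gain(feature, labels) / si if si > 0 else 0.0


# Hypothetical toy data: 8 records with a binary class label.
labels = ["yes", "yes", "yes", "yes", "no", "no", "no", "no"]
record_id = [1, 2, 3, 4, 5, 6, 7, 8]  # unique value per record
weather = ["sun", "sun", "sun", "sun", "sun", "rain", "rain", "rain"]

for name, feature in [("record_id", record_id), ("weather", weather)]:
    ig = information_gain(feature, labels)
    gr = gain_ratio(feature, labels)
    print(f"{name}: IG = {ig:.3f}, GR = {gr:.3f}")
```

On this toy data, Information Gain ranks `record_id` above `weather`, while Gain Ratio reverses the order, which is the bias correction described above.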
(2) Example Of Gain Ratio
(3) Interpreting Split Information
- Split Information measures how “uniformly” the data is distributed across the subsets.
- Split Information is a key component of the Gain Ratio in decision trees.
- Its purpose is to measure how the data is divided among the branches resulting from a split.
- Understanding its impact requires exploring its relationship with Information Gain and the decision tree’s splitting process.
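To make the "uniformity" reading concrete, the sketch below computes Split Information directly from branch sizes. It assumes the standard entropy-of-branch-proportions definition; the helper name and the three 100-record splits are hypothetical.

```python
from math import log2


def split_information(branch_sizes):
    """Split Information (bits) from the number of records in each branch."""
    total = sum(branch_sizes)
    return -sum((n / total) * log2(n / total) for n in branch_sizes if n > 0)


# Hypothetical two-way splits of 100 records.
splits = {
    "balanced (50/50)": [50, 50],
    "moderate (70/30)": [70, 30],
    "unbalanced (99/1)": [99, 1],
}

for name, sizes in splits.items():
    print(f"{name}: SI = {split_information(sizes):.3f}")
```

The more evenly the records are spread across the branches, the higher the Split Information; a split that sends almost everything down one branch has a value close to 0.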
(4) What Is The Range Of Gain Ratio?
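- The Gain Ratio ranges from 0 to 1 (and is undefined when Split Information is 0, i.e. when every record falls into a single branch). Information Gain is never negative, and it can never exceed the Split Information of the attribute it is computed on, so the ratio stays within these bounds.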
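The bound can be written out as a short derivation. The symbols are assumptions introduced here for illustration: $Y$ is the class label, $A$ is the attribute being split on, $H$ is Shannon entropy, and $I(Y;A)$ is mutual information.

```latex
\begin{aligned}
\mathrm{IG}(A) &= H(Y) - H(Y \mid A) = I(Y; A), \\
\mathrm{SI}(A) &= H(A), \\
0 \;\le\; I(Y; A) &\;\le\; \min\bigl(H(Y),\, H(A)\bigr)
\quad\Longrightarrow\quad
0 \;\le\; \mathrm{GR}(A) = \frac{\mathrm{IG}(A)}{\mathrm{SI}(A)} \;\le\; 1
\qquad (\mathrm{SI}(A) > 0).
\end{aligned}
```

The upper bound is reached when the class label fully determines the branch, for example a two-class problem split perfectly into two pure branches.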
(5) What We Want?
- A split can achieve high Information Gain (IG) and high Split Information (SI) at the same time when it divides the data both meaningfully (the subsets are purer in the target variable) and evenly (the subsets are similar in size).
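As a rough illustration, the sketch below compares a split that is both even and pure with one that is uneven and mixed. The helper `ig_and_si` and both example splits are hypothetical; each branch is given as the list of class labels that fall into it.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((n / total) * log2(n / total) for n in Counter(values).values())


def ig_and_si(branches):
    """Return (Information Gain, Split Information) for a list of branches."""
    parent = [y for branch in branches for y in branch]
    total = len(parent)
    remainder = sum(len(b) / total * entropy(b) for b in branches)
    # One branch index per record, so its entropy is the Split Information.
    branch_of_record = [i for i, b in enumerate(branches) for _ in b]
    return entropy(parent) - remainder, entropy(branch_of_record)


# Hypothetical splits of 10 records (5 "yes", 5 "no").
even_and_pure = [["yes"] * 5, ["no"] * 5]
uneven_and_mixed = [["yes"] * 5 + ["no"] * 4, ["no"]]

for name, branches in [("even and pure", even_and_pure),
                       ("uneven and mixed", uneven_and_mixed)]:
    ig, si = ig_and_si(branches)
    print(f"{name}: IG = {ig:.3f}, SI = {si:.3f}")
```

The even, pure split scores high on both measures, while the uneven, mixed split scores low on both.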
(6) Balanced, Unbalanced & Moderate Split
(7) Which Split Information Is Better: Balanced, Unbalanced & Moderate Split
(8) How Does Gain Ratio Penalize Lower Split Information?
(9) Gain Ratio Of An ID Attribute Having 10 Values
- The ID attribute is a unique identifier, meaning each record has a unique value.
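- When splitting on this attribute, each of the 10 subsets contains exactly one record. Every subset is therefore pure (entropy 0), so Information Gain equals the full entropy of the dataset, while Split Information reaches log2(10) ≈ 3.32, which shrinks the Gain Ratio accordingly.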
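The arithmetic for this case can be checked with a short sketch. The class balance is an assumption made for illustration (5 "yes" and 5 "no" records); the article does not state the label distribution.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((n / total) * log2(n / total) for n in Counter(values).values())


# Assumption: 10 records with a hypothetical 5/5 class balance.
labels = ["yes"] * 5 + ["no"] * 5
ids = list(range(1, 11))  # unique identifier, one value per record

# Every branch holds a single record, so each branch entropy is 0
# and Information Gain equals the entropy of the whole dataset.
info_gain = entropy(labels)
split_info = entropy(ids)  # log2(10) for ten equal-sized branches
gain_ratio = info_gain / split_info

print(f"IG = {info_gain:.3f}, SI = {split_info:.3f}, GR = {gain_ratio:.3f}")
```

Under this assumption the ID attribute has the maximum possible Information Gain (1 bit), but dividing by a Split Information of about 3.32 bits brings its Gain Ratio down to roughly 0.30.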
(10) Advantages Of Gain Ratio
(11) Disadvantages Of Gain Ratio
