In the previous article, we learned how Information Gain uses Entropy to help Decision Trees select the best feature for splitting data.

However, Entropy is not the only way to measure impurity.

Many Decision Tree algorithms use another metric called:

Gini Index

In fact, one of the most widely used Decision Tree algorithms, CART (Classification and Regression Trees), uses the Gini Index by default.

The purpose of the Gini Index is similar to Entropy:

Measure Impurity
Measure Uncertainty
Measure Class Mixing

The lower the impurity, the better the split.

In this article, we will understand the intuition behind the Gini Index, learn how it is calculated, compare it with Entropy, and see how Decision Trees use it to build classification models.

What is the Gini Index?

The Gini Index is a measure of impurity in a dataset.

It tells us:

How Mixed
The Classes Are

A pure dataset has a low Gini Index.

A highly mixed dataset has a high Gini Index.

Intuition Behind the Gini Index

Imagine a bag containing colored balls.

Case 1: Pure Bag

Red
Red
Red
Red
Red

If you randomly pick a ball:

You are always correct in guessing:

Red

Impurity:

Zero

Case 2: Mixed Bag

Red
Blue
Red
Blue
Red

Now there is uncertainty.

Impurity increases.

The Gini Index measures this uncertainty.

Understanding Purity

Pure Dataset:

Pass
Pass
Pass
Pass

Impurity:

Zero

Mixed Dataset:

Pass
Fail
Pass
Fail

Impurity:

High

Why Decision Trees Need the Gini Index

Decision Trees try to create groups that are as pure as possible.

Example:

Before Split:

Pass
Fail
Pass
Fail

After Split:

Pass
Pass

Fail
Fail

The classes become cleaner.

Impurity decreases.

The Gini Index helps identify such splits.

Gini Index Formula

For a classification problem:

$$Gini = 1 - \sum_i p_i^2$$

Where:

  • $p_i$ = probability of class $i$

The formula measures class impurity.
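The formula above can be sketched in a few lines of Python. This is a minimal illustration, not a library implementation: the function name `gini_impurity` is hypothetical, and treating an empty list as pure is an assumption made here for convenience.

```python
from collections import Counter

def gini_impurity(labels):
    """Return 1 - sum(p_i^2) over the class proportions found in `labels`."""
    n = len(labels)
    if n == 0:
        return 0.0  # assumption: an empty node is treated as pure
    counts = Counter(labels)
    return 1.0 - sum((count / n) ** 2 for count in counts.values())

pure = ["Pass", "Pass", "Pass", "Pass"]
balanced = ["Pass", "Fail", "Pass", "Fail"]
mostly_pass = ["Pass", "Pass", "Pass", "Pass", "Fail"]

print(gini_impurity(pure))                   # 0.0
print(gini_impurity(balanced))               # 0.5
print(round(gini_impurity(mostly_pass), 2))  # 0.32
```

The three inputs are a pure, a balanced, and a mostly-one-class dataset, so the printed values range from no impurity to the binary maximum.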

Understanding the Formula

Focus on intuition first.

When one class dominates:

90%
10%

Impurity decreases.

When classes are balanced:

50%
50%

Impurity increases.

Example 1: Pure Dataset

Dataset:

Pass
Pass
Pass
Pass

Probabilities:

$P(Pass) = 1$

Gini:

$$Gini = 1 - (1)^2 = 0$$

Interpretation

Gini:

0

means:

Perfect purity.

Example 2: Balanced Dataset

Dataset:

Pass
Fail
Pass
Fail

Probabilities:

$P(Pass) = 0.5$, $P(Fail) = 0.5$

Gini:

$$Gini = 1 - (0.5^2 + 0.5^2) = 1 - (0.25 + 0.25) = 0.5$$

Interpretation

Gini:

0.5

Maximum impurity for binary classification.

Example 3: Mostly One Class

Dataset:

Pass
Pass
Pass
Pass
Fail

Probabilities:

$P(Pass) = 0.8$

and

$P(Fail) = 0.2$

Gini:

$$Gini = 1 - (0.8^2 + 0.2^2) = 1 - (0.64 + 0.04) = 0.32$$

This is lower than 0.5 because the dataset is purer.

Gini Index Range

For binary classification:

| Gini Value | Meaning |
|---|---|
| 0 | Completely Pure |
| 0.1 | Very Low Impurity |
| 0.3 | Moderate Impurity |
| 0.5 | Maximum Impurity |

Visualizing Impurity

100%-0% → Gini = 0

90%-10% → Gini ≈ 0.18

80%-20% → Gini ≈ 0.32

50%-50% → Gini = 0.5

As classes become balanced, impurity increases.
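The values above follow from a closed form of the Gini formula for two classes. As a short derivation, write $p$ for the proportion of one class (a symbol introduced here for illustration), so the other class has proportion $1 - p$:

```latex
\begin{aligned}
Gini(p) &= 1 - p^2 - (1 - p)^2 \\
        &= 1 - p^2 - (1 - 2p + p^2) \\
        &= 2p - 2p^2 \\
        &= 2p(1 - p)
\end{aligned}
```

With $p = 0.9$ this gives $2(0.9)(0.1) = 0.18$, with $p = 0.8$ it gives $0.32$, and with $p = 0.5$ it gives $0.5$. The product $p(1 - p)$ is largest at $p = 0.5$, which is why a 50%-50% split is the point of maximum impurity for binary classification.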

How Decision Trees Use Gini Index

Suppose we have:

Age
Income
Credit Score

as candidate features.

For each feature:

  1. Perform a split.
  2. Calculate Gini impurity.
  3. Compute weighted impurity.
  4. Select the feature whose split gives the lowest weighted impurity.

Weighted Gini Formula

After splitting:

$$Weighted\ Gini = \sum \frac{|Child|}{|Parent|} \times Gini(Child)$$

The split producing the lowest weighted Gini is selected.

Example

Parent Dataset:

100 samples.

Split:

Child A:

80 samples

Gini:

0.2

Child B:

20 samples

Gini:

0.1

Weighted Gini:

$$\frac{80}{100}(0.2) + \frac{20}{100}(0.1) = 0.18$$

The Decision Tree compares this value with other possible splits.
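The weighted calculation from this example can be checked with a small helper. This is a sketch only: the name `weighted_gini` and the choice to pass each child as a `(sample count, Gini)` pair are assumptions made for illustration.

```python
def weighted_gini(children):
    """Combine child impurities, weighting each by its share of the samples.

    `children` is a list of (n_samples, gini) pairs, one per child node.
    """
    total = sum(n for n, _ in children)
    return sum((n / total) * g for n, g in children)

# Child A: 80 samples with Gini 0.2, Child B: 20 samples with Gini 0.1
result = weighted_gini([(80, 0.2), (20, 0.1)])
print(round(result, 2))  # 0.18
```

Because Child A holds 80% of the samples, its impurity dominates the result.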

Gini Reduction

Just as Entropy uses Information Gain,

Gini uses:

Impurity Reduction

Impurity reduction is the parent's Gini minus the weighted Gini of its children.

Good splits reduce impurity significantly.

Example

Before Split:

Pass
Fail
Pass
Fail

Gini:

0.5 (two Pass and two Fail, the binary maximum).

After Split:

Pass
Pass

Fail
Fail

Gini:

0

Excellent split.
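To make the reduction concrete, the sketch below compares the split above with a poor split of the same four samples. The function names are hypothetical, and each child is assumed to be non-empty.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def gini_reduction(parent, children):
    """Parent impurity minus the size-weighted impurity of the children."""
    n = len(parent)
    weighted = sum(len(child) / n * gini(child) for child in children)
    return gini(parent) - weighted

parent = ["Pass", "Fail", "Pass", "Fail"]
good_split = [["Pass", "Pass"], ["Fail", "Fail"]]  # children are pure
poor_split = [["Pass", "Fail"], ["Pass", "Fail"]]  # children as mixed as the parent

print(gini_reduction(parent, good_split))  # 0.5
print(gini_reduction(parent, poor_split))  # 0.0
```

The good split removes all of the parent's impurity, while the poor split removes none, so a tree would prefer the first.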

Gini Index vs Entropy

Both measure impurity.

However, they use different formulas.

Entropy Formula

$$Entropy = -\sum_i p_i \log_2(p_i)$$

Gini Formula

$$Gini = 1 - \sum_i p_i^2$$

Comparison

| Property | Entropy | Gini |
|---|---|---|
| Measures Impurity | Yes | Yes |
| Uses Logarithms | Yes | No |
| Computational Cost | Higher | Lower |
| Common in ID3/C4.5 | Yes | No |
| Common in CART | No | Yes |
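Evaluating both formulas on the same class distributions shows how closely they track each other. This is an illustrative sketch: the helper names are hypothetical, and the distributions are the binary examples used earlier in the article.

```python
import math

def gini(probs):
    """Gini impurity from a sequence of class probabilities."""
    return 1.0 - sum(p * p for p in probs)

def entropy(probs):
    """Entropy in bits; terms with p = 0 are skipped (0 * log2(0) is taken as 0)."""
    return sum(-p * math.log2(p) for p in probs if p > 0)

for p in (1.0, 0.9, 0.8, 0.5):
    probs = (p, 1.0 - p)
    print(f"{p:.1f}/{1.0 - p:.1f}  gini={gini(probs):.3f}  entropy={entropy(probs):.3f}")
```

Both measures are zero for a pure node and peak at the 50%-50% distribution. They differ in scale, since binary entropy reaches 1 where Gini reaches 0.5.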

Why CART Uses Gini

Gini is computationally simpler.

No logarithms are required.

This makes training slightly faster.

As a result:

Many modern implementations, including scikit-learn's DecisionTreeClassifier, use:

criterion = "gini"

as the default.

Does Gini Perform Better Than Entropy?

In practice:

Performance differences are usually small.

Most datasets produce very similar trees.

Therefore:

There is rarely a universally better choice.

Example: Loan Approval

Features:

  • Income
  • Credit Score
  • Age

Target:

Approved
Rejected

The Decision Tree evaluates Gini impurity for each split and chooses the feature producing the purest groups.

Example: Spam Detection

Features:

  • Number of Links
  • Sender Reputation
  • Email Length

The split reducing impurity the most becomes the next node.
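For a numeric feature such as Number of Links, one common way to find a split is to try thresholds between the observed values and keep the one with the lowest weighted Gini. The sketch below assumes that approach. The data is a made-up toy sample and `best_threshold` is a hypothetical helper, not part of any library.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (an empty list counts as pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_threshold(values, labels):
    """Try midpoints between sorted distinct values.

    Returns (threshold, weighted_gini) for the split with the lowest weighted Gini.
    """
    n = len(values)
    distinct = sorted(set(values))
    best = (None, float("inf"))
    for lo, hi in zip(distinct, distinct[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        w = len(left) / n * gini(left) + len(right) / n * gini(right)
        if w < best[1]:
            best = (t, w)
    return best

# Hypothetical toy data: number of links in each email and its label
links = [0, 1, 1, 2, 5, 6, 8, 9]
labels = ["ham", "ham", "ham", "ham", "spam", "spam", "spam", "spam"]

threshold, score = best_threshold(links, labels)
print(threshold, score)  # 3.5 0.0
```

In this toy sample every email with more than 3.5 links is spam, so the split at 3.5 leaves both groups pure and would become the node.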

Multi-Class Gini Index

The Gini Index naturally extends to multiple classes.

Example:

Cat
Dog
Horse
Bird

Formula remains identical.

Simply include all class probabilities.
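A quick check with four classes, as a sketch using a hypothetical `gini` helper:

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a non-empty list of class labels, for any number of classes."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

animals = ["Cat", "Dog", "Horse", "Bird"]
print(gini(animals))  # 0.75
```

With four equally likely classes each probability is 0.25, so the Gini is 1 - 4(0.25^2) = 0.75. In general the maximum for k classes is 1 - 1/k, so the 0.5 ceiling applies only to binary problems.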

Python Example

Using Scikit-Learn:

from sklearn.tree import DecisionTreeClassifier

model = DecisionTreeClassifier(
    criterion="gini"
)

Train:

model.fit(X_train, y_train)

Using Entropy Instead

model = DecisionTreeClassifier(
    criterion="entropy"
)

Both approaches are supported.
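The snippets above assume that `X_train` and `y_train` already exist. A self-contained sketch might look like the following. The Iris dataset, the 70/30 split, and the fixed `random_state` are choices made here for illustration and are not part of the original example.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Load a small built-in dataset and hold out 30% of it for testing
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

# Train one tree per criterion and record its accuracy on the held-out data
scores = {}
for criterion in ("gini", "entropy"):
    model = DecisionTreeClassifier(criterion=criterion, random_state=42)
    model.fit(X_train, y_train)
    scores[criterion] = model.score(X_test, y_test)
    print(criterion, round(scores[criterion], 3))
```

Running both criteria on the same split is a simple way to compare them on your own data.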

Real-World Applications

Medical Diagnosis

Creating disease prediction trees.

Credit Scoring

Loan approval systems.

Customer Churn Prediction

Identifying customers likely to leave.

Fraud Detection

Classifying suspicious transactions.

Marketing

Customer segmentation and targeting.

Common Mistakes

Assuming Gini Measures Accuracy

Gini measures impurity, not prediction accuracy.

Thinking Lower Gini Means Better Model

Lower Gini at a node is good.

Overall model quality still requires validation.

Assuming Entropy and Gini Produce Completely Different Trees

In practice, differences are often minor.

Best Practices

  • Understand impurity before calculations
  • Compare Gini and Entropy experimentally
  • Monitor overfitting
  • Use pruning when necessary
  • Validate on unseen data

Gini Index Summary

| Dataset Type | Gini |
|---|---|
| Completely Pure | 0 |
| Mostly One Class | Low |
| Balanced Classes | High |
| Maximum Binary Impurity | 0.5 |

Decision Tree Splitting Workflow

  1. Calculate Gini impurity
  2. Evaluate possible splits
  3. Compute weighted impurity
  4. Select lowest impurity split
  5. Create child nodes
  6. Repeat recursively
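The workflow above can be sketched as a tiny recursive tree builder. This is a simplified illustration under stated assumptions, not how CART or scikit-learn is actually implemented: it handles only numeric features, tries midpoints as thresholds, stops at a hypothetical `max_depth`, and uses made-up loan data.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (an empty list counts as pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Steps 1-4: return (feature, threshold, weighted_gini) of the best split, or None."""
    n, best = len(rows), None
    for f in range(len(rows[0])):
        values = sorted({r[f] for r in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            w = len(left) / n * gini(left) + len(right) / n * gini(right)
            if best is None or w < best[2]:
                best = (f, t, w)
    return best

def build_tree(rows, labels, depth=0, max_depth=3):
    """Steps 5-6: create child nodes and recurse; leaves hold the majority class."""
    majority = Counter(labels).most_common(1)[0][0]
    # Stop when the node is pure or the depth limit is reached
    if gini(labels) == 0.0 or depth == max_depth:
        return majority
    split = best_split(rows, labels)
    # Stop when no split exists or the best one does not reduce impurity
    if split is None or split[2] >= gini(labels):
        return majority
    f, t, _ = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return {
        "feature": f,
        "threshold": t,
        "left": build_tree([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
        "right": build_tree([r for r, _ in right], [y for _, y in right], depth + 1, max_depth),
    }

def predict(tree, row):
    """Follow the thresholds down to a leaf and return its class."""
    while isinstance(tree, dict):
        tree = tree["left"] if row[tree["feature"]] <= tree["threshold"] else tree["right"]
    return tree

# Hypothetical toy data: (income in thousands, credit score)
rows = [(30, 600), (45, 650), (50, 700), (80, 720), (90, 580), (120, 750)]
labels = ["Rejected", "Rejected", "Approved", "Approved", "Rejected", "Approved"]

tree = build_tree(rows, labels)
print(predict(tree, (60, 710)))  # Approved
```

In this toy data the credit score separates the classes perfectly at 675, so the tree needs only one split. Real implementations add further stopping rules and pruning to control overfitting.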

Why the Gini Index is Important

The Gini Index is one of the most widely used impurity measures in Machine Learning and serves as the default splitting criterion in many Decision Tree implementations. It provides a simple yet effective way to measure how mixed a dataset is and helps Decision Trees create increasingly pure groups through recursive splitting.

Understanding the Gini Index is essential because it forms the foundation of CART Decision Trees, Random Forests, and many ensemble learning methods. By measuring impurity efficiently, it enables Decision Trees to build accurate and interpretable classification models.