Random Forest in Machine Learning

Last updated: Jun 14, 2026

Author: Christy Harshitha Dakarapu

In the previous article, we learned about Bagging (Bootstrap Aggregating), where multiple models are trained on different bootstrap samples and their predictions are combined.

We also discovered that Decision Trees benefit greatly from Bagging because they tend to have high variance.

This naturally leads to one of the most widely used Machine Learning algorithms:

Random Forest

Random Forest builds upon Bagging and introduces an additional idea:


Random Feature Selection

This simple improvement makes the trees in the forest more diverse, which makes Random Forest more robust and often more accurate than a standard bagged collection of Decision Trees.

Because it performs well with relatively little tuning, Random Forest is widely used today in:

Finance
Healthcare
Fraud Detection
Recommendation Systems
Customer Analytics

What is Random Forest?

Random Forest is an ensemble learning algorithm that combines multiple Decision Trees and aggregates their predictions.

The name comes from:


Many Decision Trees
         ↓
Forest

Instead of relying on a single tree:


One Tree
      ↓
Prediction

Random Forest uses:


Tree 1
Tree 2
Tree 3
Tree 4
...
Tree N
      ↓
Combined Prediction

Why Not Use One Decision Tree?

Decision Trees have a major weakness:


High Variance

Small changes in data can produce very different trees.

Example:

Dataset A:


Tree A

Dataset B:


Tree B

Predictions may differ significantly.

Random Forest reduces this instability.

Core Idea Behind Random Forest

Random Forest combines:


Bagging
      +
Random Feature Selection

This creates a collection of diverse trees.

Diverse trees make different mistakes.

Combining them improves overall performance.

Step 1: Bootstrap Sampling

Random Forest starts by creating multiple bootstrap datasets.

Original Dataset:


A B C D E

Bootstrap Sample 1:


A C C D E

Bootstrap Sample 2:


B B C E D

Bootstrap Sample 3:


A A D E C

Each tree receives a different dataset.
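The sampling step above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the helper name `bootstrap_sample` and the seed value are assumptions made for this example, not part of any library.

```python
import random


def bootstrap_sample(data, rng):
    """Draw len(data) items from data with replacement."""
    return [rng.choice(data) for _ in data]


rng = random.Random(42)  # fixed seed so the run is repeatable
original = ["A", "B", "C", "D", "E"]

# One bootstrap dataset per tree
samples = [bootstrap_sample(original, rng) for _ in range(3)]
for i, sample in enumerate(samples, start=1):
    print(f"Bootstrap Sample {i}: {' '.join(sample)}")
```

Because the draws are made with replacement, some items usually appear more than once in a sample and others do not appear at all.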

Step 2: Train Multiple Trees

Each bootstrap dataset trains a separate Decision Tree.

Workflow:


Dataset 1 → Tree 1

Dataset 2 → Tree 2

Dataset 3 → Tree 3

So far, this is standard Bagging.
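A rough sketch of these first two steps with scikit-learn's `DecisionTreeClassifier` might look like the following. The synthetic dataset from `make_classification`, the choice of three trees, and the seeds are assumptions for illustration only.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a real dataset (assumption for this sketch)
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
rng = np.random.default_rng(0)

trees = []
for _ in range(3):
    # Row indices drawn with replacement: one bootstrap dataset
    idx = rng.integers(0, len(X), size=len(X))
    tree = DecisionTreeClassifier(random_state=0).fit(X[idx], y[idx])
    trees.append(tree)

print(f"Trained {len(trees)} trees, each on its own bootstrap dataset")
```

Each tree here still considers every feature at every split, so this is plain Bagging rather than a Random Forest.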

Step 3: Random Feature Selection

This is the key innovation.

Suppose we have:


Age
Salary
Experience
Education
Credit Score

Five features.

When building a split:

A normal Decision Tree considers:


All Features

Random Forest considers:


Random Subset

Example:


Salary
Education

Only these features compete for the split.
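The idea of drawing a random subset of features for one split can be sketched as below. This is an illustration rather than how any particular library implements it; the square-root rule for the subset size is a common heuristic for classification, and the seed is arbitrary.

```python
import math
import random

features = ["Age", "Salary", "Experience", "Education", "Credit Score"]
rng = random.Random(0)

# Common heuristic: consider about sqrt(p) of the p features at each split.
# sqrt(5) is about 2.24, which rounds to 2.
k = max(1, round(math.sqrt(len(features))))

# A fresh subset would be drawn for every split in every tree
candidates = rng.sample(features, k)
print(candidates)
```

In scikit-learn, the size of this subset is controlled by the `max_features` parameter of `RandomForestClassifier`.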

Why Random Feature Selection Helps

Without randomness:

Many trees choose the same feature repeatedly.

Example:


Credit Score

becomes the root node in every tree.

Trees become highly similar.

Random feature selection forces diversity.

Example

Tree 1:


Root = Credit Score

Tree 2:


Root = Salary

Tree 3:


Root = Education

Trees become less correlated.

Step 4: Prediction

Each tree generates a prediction.

Example:


Tree 1 → Fraud

Tree 2 → Fraud

Tree 3 → Genuine

Tree 4 → Fraud

Tree 5 → Fraud

Majority Voting

Votes:


Fraud = 4

Genuine = 1

Final Prediction:


Fraud
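The vote count above can be reproduced with a short sketch using Python's `collections.Counter`. The variable names are chosen for this example only.

```python
from collections import Counter

# One predicted label per tree, taken from the example above
tree_votes = ["Fraud", "Fraud", "Genuine", "Fraud", "Fraud"]

counts = Counter(tree_votes)
final_prediction, n_votes = counts.most_common(1)[0]

print(counts)
print(final_prediction)  # Fraud
```

Some libraries combine trees slightly differently. For instance, scikit-learn's `RandomForestClassifier` averages the class probabilities predicted by the trees instead of counting hard votes, which usually leads to the same winner.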

Regression in Random Forest

For regression:

Predictions are averaged.

Example:


Tree 1 → 45

Tree 2 → 50

Tree 3 → 55

Prediction:

\frac{45+50+55}{3} = 50

Random Forest Workflow


Original Dataset
       ↓
Bootstrap Samples
       ↓
Multiple Trees
       ↓
Random Features
       ↓
Predictions
       ↓
Voting / Averaging
       ↓
Final Output

Why Random Forest Works

Each tree sees:

Different data (its own bootstrap sample)
Different features (a random subset at each split)

Therefore:


Different Errors

Combining predictions reduces overall error.

Variance Reduction

Single Tree:


High Variance

Random Forest:


Lower Variance

because many trees are averaged together.
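A standard result makes this more precise. Assume, for illustration, that the N tree predictions T_1, ..., T_N each have variance σ² and that every pair of trees has correlation ρ (these symbols are introduced here and do not appear elsewhere in the article). The variance of their average is then:

```latex
\operatorname{Var}\!\left(\frac{1}{N}\sum_{i=1}^{N} T_i\right)
  = \rho\,\sigma^{2} + \frac{1-\rho}{N}\,\sigma^{2}
```

Adding more trees shrinks only the second term. The first term depends on how correlated the trees are, which is why Random Feature Selection matters: it lowers ρ.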

Example: Exam Prediction (illustrative numbers)

Single Tree:


Accuracy = 78%

Random Forest:


Accuracy = 88%

The improvement comes from averaging: the errors of individual trees partly cancel out when their predictions are combined.

Out-of-Bag (OOB) Samples

Remember:

Bootstrap sampling leaves out some observations.

Example:

Original:


A B C D E

Bootstrap Sample:


A C C D E

Sample:


B

is excluded.

This becomes an:


Out-of-Bag Sample
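The example above can be checked with a short sketch, which also estimates how often a row is left out in general. The dataset size of 1000 is an arbitrary assumption for this illustration.

```python
original = ["A", "B", "C", "D", "E"]
bootstrap = ["A", "C", "C", "D", "E"]

# Out-of-bag items: in the original data but never drawn into the sample
oob = [item for item in original if item not in bootstrap]
print(oob)  # ['B']

# Probability that a given row is missed by all n draws with replacement
n = 1000
p_left_out = (1 - 1 / n) ** n
print(round(p_left_out, 3))  # 0.368
```

For large datasets this probability approaches 1/e, so each tree leaves out roughly 37% of the observations.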

OOB Evaluation

Out-of-Bag samples can estimate model performance: each observation is predicted using only the trees that did not see it during training.

Benefits:

No separate validation set required
Efficient evaluation
Built into Random Forest

Feature Importance

One major advantage of Random Forest:


Feature Importance

The algorithm estimates how useful each feature is.

Example:

Feature	Importance
Credit Score	0.42
Income	0.30
Age	0.18
Location	0.10

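Higher importance means the feature had greater influence on the forest's predictions. In this example the scores sum to 1, so each value can be read as a share of the total.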

Classification Example

Predict:


Spam

Not Spam

Trees vote.

Majority class wins.

Regression Example

Predict:


House Price

Trees estimate prices.

Average prediction becomes final output.
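As a sketch of this averaging with scikit-learn's `RandomForestRegressor`, the example below compares the forest's prediction with a manual average over its individual trees. The synthetic data from `make_regression` stands in for house prices and is an assumption for illustration.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in for a house-price dataset (assumption for this sketch)
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)
forest = RandomForestRegressor(n_estimators=10, random_state=0).fit(X, y)

# Predictions of each individual tree for the first three rows
per_tree = np.array([tree.predict(X[:3]) for tree in forest.estimators_])
manual_average = per_tree.mean(axis=0)

forest_prediction = forest.predict(X[:3])
print(np.allclose(manual_average, forest_prediction))
```

The fitted trees are available through the `estimators_` attribute, which makes it possible to inspect how much individual trees disagree.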

Advantages of Random Forest

High Accuracy

Often performs well without extensive tuning.

Reduced Overfitting

More robust than a single Decision Tree.

Handles Non-Linear Relationships

Captures complex patterns.

Feature Importance

Provides useful insights.

Works with Large Datasets

Scales reasonably well.

Handles Missing Values Better

Some implementations can handle missing values directly, while others expect them to be imputed first.

Limitations of Random Forest

Reduced Interpretability

A forest of hundreds of trees is difficult to explain.

Increased Computational Cost

Training many trees requires more resources.

Larger Memory Usage

Many trees must be stored.

Slower Predictions

Compared to a single tree.

Random Forest vs Decision Tree

Decision Tree	Random Forest
Single Tree	Many Trees
High Variance	Lower Variance
Easier to Interpret	Harder to Interpret
Faster	Slower
More Overfitting Risk	Less Overfitting Risk

Random Forest vs Bagging

Bagging	Random Forest
Bootstrap Sampling	Bootstrap Sampling
Multiple Trees	Multiple Trees
Considers All Features at Each Split	Considers a Random Feature Subset at Each Split
Less Diversity	More Diversity

Random Forest is essentially an improved version of Bagging.

Choosing Number of Trees

Parameter:


n_estimators

Example:


100 Trees

More trees generally improve stability but increase computation.
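One way to see this trade-off is to fit forests of different sizes and compare their Out-of-Bag scores. The sketch below uses synthetic data, and the tree counts are arbitrary choices for illustration; actual scores depend on the dataset.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for a real dataset (assumption for this sketch)
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

oob_scores = {}
for n_trees in (25, 100, 200):
    model = RandomForestClassifier(
        n_estimators=n_trees, oob_score=True, random_state=0
    )
    model.fit(X, y)
    oob_scores[n_trees] = model.oob_score_
    print(n_trees, round(model.oob_score_, 3))
```

The score typically levels off after a certain number of trees, while training time keeps growing roughly in proportion to the tree count.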

Choosing Maximum Depth

Parameter:


max_depth

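Limits how deep each tree can grow, which controls tree complexity.

Shallower trees help prevent overfitting, but trees that are too shallow may underfit.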
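The effect of `max_depth` can be explored with a sketch like the one below, which compares an unrestricted forest with one limited to depth 3. The synthetic data, the depth values, and the 50 trees are assumptions for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a real dataset (assumption for this sketch)
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

results = {}
for depth in (None, 3):  # None places no limit on tree depth
    model = RandomForestClassifier(
        n_estimators=50, max_depth=depth, random_state=0
    )
    model.fit(X_train, y_train)
    deepest = max(tree.get_depth() for tree in model.estimators_)
    train_acc = model.score(X_train, y_train)
    test_acc = model.score(X_test, y_test)
    results[depth] = (train_acc, test_acc, deepest)
    print(f"max_depth={depth}: train={train_acc:.3f}, "
          f"test={test_acc:.3f}, deepest tree={deepest}")
```

A large gap between training and test accuracy is a sign of overfitting, and limiting depth is one way to narrow it.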

Python Implementation

Import:


from sklearn.ensemble import RandomForestClassifier

Create Model:


model = RandomForestClassifier(
    n_estimators=100,
    random_state=42
)

Train:


model.fit(X_train, y_train)

Predict:


predictions = model.predict(X_test)
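The snippets above assume that `X_train`, `y_train`, and `X_test` already exist. A self-contained sketch that creates them from synthetic data might look like this; the dataset and the 80/20 split are assumptions for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a real dataset (assumption for this sketch)
X, y = make_classification(n_samples=500, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

predictions = model.predict(X_test)
print("Test accuracy:", accuracy_score(y_test, predictions))
```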

Out-of-Bag Evaluation


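model = RandomForestClassifier(
    n_estimators=100,
    oob_score=True,
    random_state=42
)

# oob_score_ is only available after the model has been fitted
model.fit(X_train, y_train)

View Score:


print(model.oob_score_)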

Feature Importance


print(model.feature_importances_)
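The raw array is easier to read when each score is paired with its feature name. The sketch below does this on synthetic data; the feature names are hypothetical labels attached for illustration and have no real relationship to the generated columns.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Hypothetical feature names attached to synthetic data
feature_names = ["credit_score", "income", "age", "location"]
X, y = make_classification(
    n_samples=300, n_features=4, n_informative=2, n_redundant=0, random_state=0
)

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Pair each name with its importance and sort from most to least important
ranked = sorted(
    zip(feature_names, model.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, score in ranked:
    print(f"{name:<14}{score:.3f}")
```

The values in `feature_importances_` are normalized, so they can be read as shares of the total.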

Real-World Applications

Healthcare

Disease diagnosis.

Finance

Credit scoring.

Fraud Detection

Transaction monitoring.

E-Commerce

Purchase prediction.

Marketing

Customer churn prediction.

Manufacturing

Equipment failure prediction.

Common Mistakes

Using Too Few Trees

With too few trees, predictions and OOB estimates may vary noticeably from one run to the next.

Ignoring Hyperparameter Tuning

Depth and tree count matter.

Assuming Feature Importance Means Causation

Importance indicates usefulness, not causality.

Using Random Forest When Explainability is Critical

A single Decision Tree, which can be read rule by rule, may be preferable.

Best Practices

Use sufficient trees
Monitor OOB score
Tune maximum depth
Analyze feature importance
Validate performance on unseen data

Random Forest Summary

Component	Purpose
Bootstrap Sampling	Dataset Diversity
Multiple Trees	Reduce Variance
Random Features	Tree Diversity
Voting	Classification
Averaging	Regression
OOB Samples	Validation

Random Forest Workflow Summary

Create bootstrap samples
Train multiple trees
Randomly select features at each split
Generate predictions
Vote or average
Produce final prediction
Evaluate performance

Why Random Forest is Important

Random Forest is one of the most widely used Machine Learning algorithms because it combines the simplicity of Decision Trees with the power of ensemble learning. By training many diverse trees and aggregating their predictions, it achieves strong performance while reducing overfitting and improving robustness.

Its ability to handle classification and regression tasks, provide feature importance estimates, and perform well with minimal tuning has made it a standard tool in both industry and research.

In the next article, we will study Boosting Intuition, a fundamentally different ensemble technique where models are trained sequentially and each new model focuses on correcting the mistakes made by previous models.