After building a regression model, one important question remains:
How good is the model?
Suppose two models predict house prices.
Model A:
| Actual Price | Predicted Price |
|---|---|
| 50 | 49 |
| 80 | 81 |
| 100 | 98 |
Model B:
| Actual Price | Predicted Price |
|---|---|
| 50 | 30 |
| 80 | 120 |
| 100 | 70 |
Clearly, Model A performs better.
However, computers need numerical measures to compare models objectively.
These measures are called Evaluation Metrics.
Evaluation Metrics help us:
- Measure prediction quality
- Compare different models
- Detect poor performance
- Select the best model
- Improve future predictions
In this article, we will explore the most important regression evaluation metrics used in Machine Learning and understand when to use each one.
Why Do We Need Evaluation Metrics?
Machine Learning models make predictions.
Predictions are rarely perfect.
Example:
| Actual | Predicted |
|---|---|
| 100 | 90 |
Prediction Error:
$$\text{Error} = \text{Actual} - \text{Predicted} = 100 - 90 = 10$$
A single prediction error is useful.
However, real datasets contain thousands of predictions.
We need a systematic way to summarize overall performance.
Evaluation metrics provide this summary.
Understanding Residuals
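Residuals represent prediction errors: the difference between each actual value and the model's prediction for it.
Formula:
$$e_i = y_i - \hat{y}_i$$
where $y_i$ is the actual value and $\hat{y}_i$ is the predicted value for observation $i$.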
Example (the prediction shown earlier):
Actual: 100
Predicted: 90
Residual: $100 - 90 = 10$
Most regression metrics are built using residuals.
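As a minimal sketch, residuals can be computed one row at a time. The `actual` and `predicted` lists below are illustrative values (the same three rows used in the MAE example later in this article), not output from a real model.

```python
# Illustrative data: three actual values and the model's predictions for them.
actual = [10, 20, 30]
predicted = [8, 18, 35]

# Residual = actual - predicted, computed row by row.
residuals = [y - y_hat for y, y_hat in zip(actual, predicted)]

print(residuals)  # [2, 2, -5]
```

A positive residual means the model predicted too low; a negative residual means it predicted too high.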
Desired Characteristics of a Good Metric
A good evaluation metric should:
- Reflect prediction quality
- Penalize mistakes appropriately
- Be easy to interpret
- Support model comparison
Mean Absolute Error (MAE)
MAE is one of the simplest regression metrics.
It calculates the average absolute prediction error.
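Formula:
$$\text{MAE} = \frac{1}{n}\sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|$$
where $y_i$ is the actual value, $\hat{y}_i$ is the predicted value, and $n$ is the number of predictions.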
Example Calculation
Dataset:
| Actual | Predicted |
|---|---|
| 10 | 8 |
| 20 | 18 |
| 30 | 35 |
Absolute Errors:
| Absolute Error |
|---|
| 2 |
| 2 |
| 5 |
MAE:
$$\text{MAE} = \frac{2 + 2 + 5}{3} = 3$$
Interpreting MAE
MAE = 3
means:
The model is off by approximately 3 units on average.
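The calculation above can be sketched in a few lines of plain Python. The helper name `mae_score` is a hypothetical name chosen for this illustration; it is not a library function.

```python
def mae_score(y_true, y_pred):
    """Mean absolute error: the average of |actual - predicted|."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    total = sum(abs(y - y_hat) for y, y_hat in zip(y_true, y_pred))
    return total / len(y_true)

# Same three rows as the example table.
mae = mae_score([10, 20, 30], [8, 18, 35])
print(mae)  # 3.0
```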
Advantages of MAE
- Easy to understand
- Same units as target variable
- Less sensitive to outliers than MSE or RMSE
Disadvantages of MAE
- Does not penalize large errors more than proportionally
- Every unit of error counts the same, whether it comes from one large miss or several small ones
Mean Squared Error (MSE)
MSE is one of the most commonly used regression metrics.
Instead of absolute errors, it uses squared errors.
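Formula:
$$\text{MSE} = \frac{1}{n}\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2$$
where $y_i$ is the actual value, $\hat{y}_i$ is the predicted value, and $n$ is the number of predictions.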
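Example Calculation
Using the same dataset as in the MAE example:
Errors (Actual - Predicted): 2, 2, -5
Squared Errors: 4, 4, 25
MSE:
$$\text{MSE} = \frac{4 + 4 + 25}{3} = 11$$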
Why Square Errors?
Squaring:
- Eliminates negative signs
- Penalizes large errors heavily
Example:
| Error | Squared Error |
|---|---|
| 2 | 4 |
| 10 | 100 |
Large mistakes receive significantly larger penalties.
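To make the effect of squaring concrete, here is a small sketch that computes MSE for the three-row dataset from the MAE example and checks how much of the total the largest error contributes. `mse_score` is a hypothetical helper name used only for this illustration.

```python
def mse_score(y_true, y_pred):
    """Mean squared error: the average of (actual - predicted) ** 2."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    return sum((y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred)) / len(y_true)

y_true = [10, 20, 30]
y_pred = [8, 18, 35]

mse = mse_score(y_true, y_pred)
print(mse)  # 11.0

# Share of the total penalty that comes from the single largest error.
abs_errors = [abs(y - y_hat) for y, y_hat in zip(y_true, y_pred)]
sq_errors = [e ** 2 for e in abs_errors]
abs_share = max(abs_errors) / sum(abs_errors)  # 5 / 9
sq_share = max(sq_errors) / sum(sq_errors)     # 25 / 33
print(round(abs_share, 2), round(sq_share, 2))  # 0.56 0.76
```

The error of 5 accounts for a little over half of the absolute-error total but roughly three quarters of the squared-error total.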
Advantages of MSE
- Strongly penalizes large errors
- Smooth mathematical properties
- Works well with optimization algorithms
Disadvantages of MSE
- Sensitive to outliers
- Harder to interpret due to squared units
Root Mean Squared Error (RMSE)
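RMSE solves one major problem of MSE: its squared units. Taking the square root brings the error back to the units of the target variable.
Formula:
$$\text{RMSE} = \sqrt{\text{MSE}} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}$$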
Example
Suppose:
$$\text{MSE} = 11$$
Then:
$$\text{RMSE} = \sqrt{11} \approx 3.32$$
Why RMSE is Popular
| Metric | Units |
|---|---|
| MAE | Original target units |
| RMSE | Original target units |
| MSE | Squared target units |
RMSE is easier to interpret than MSE.
RMSE Interpretation
Example:
House Prices:
RMSE = ₹2 Lakhs
Interpretation:
Predictions are typically off by around ₹2 Lakhs.
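A minimal sketch of RMSE in plain Python, using illustrative values (the three-row dataset from the MAE example). `rmse_score` is a hypothetical helper name, not a library function.

```python
import math

def rmse_score(y_true, y_pred):
    """Root mean squared error: the square root of the mean squared error."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    mse = sum((y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)

rmse = rmse_score([10, 20, 30], [8, 18, 35])
print(round(rmse, 2))  # 3.32
```

Because of the square root, the result is in the same units as the target, so it can be read directly against the values being predicted.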
Comparing MAE and RMSE
| Metric | Large Error Penalty |
|---|---|
| MAE | Moderate: grows in proportion to the error |
| RMSE | Strong: errors are squared before averaging |
If large mistakes are costly:
RMSE is often preferred.
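The difference shows up clearly with a single large mistake. The sketch below uses two made-up sets of errors, identical except for one value; the helper names are hypothetical.

```python
import math

def mae_from_errors(errors):
    """Average absolute error."""
    return sum(abs(e) for e in errors) / len(errors)

def rmse_from_errors(errors):
    """Square root of the average squared error."""
    return math.sqrt(sum(e ** 2 for e in errors) / len(errors))

steady_errors = [2, 2, 2, 2]   # every prediction misses by 2
one_big_miss = [2, 2, 2, 20]   # same, except for one large mistake

for name, errors in [("steady", steady_errors), ("one big miss", one_big_miss)]:
    print(name, mae_from_errors(errors), round(rmse_from_errors(errors), 2))
```

With steady errors both metrics equal 2. With one large miss, MAE rises to 6.5 while RMSE rises to about 10.15, so the gap between the two is itself a hint that a few large errors are present.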
R² Score (Coefficient of Determination)
R² is one of the most important regression metrics.
R² measures:
How much variance in the target variable is explained by the model.
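Formula:
$$R^2 = 1 - \frac{SS_{res}}{SS_{tot}}$$
Where:
- $SS_{res} = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$ = Residual Sum of Squares
- $SS_{tot} = \sum_{i=1}^{n} (y_i - \bar{y})^2$ = Total Sum of Squares, where $\bar{y}$ is the mean of the actual values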
Understanding R² Intuitively
Suppose:
House prices vary significantly.
If the model explains most of this variation:
R² becomes high.
If the model explains little:
R² becomes low.
R² Range
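Typically:
$$0 \le R^2 \le 1$$
R² can also be negative, which happens when the model fits worse than always predicting the mean of the target.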
Interpretation:
| R² Value | Meaning |
|---|---|
| 0 | Explains nothing |
| 0.5 | Explains 50% variance |
| 0.8 | Explains 80% variance |
| 1 | Perfect prediction |
Example
$$R^2 = 0.85$$
Interpretation:
The model explains 85% of the variability in the target variable.
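As a sketch of how the score is built from the residual and total sums of squares, the function below computes R² by hand. `r2_manual` is a hypothetical helper name, and the data are illustrative values rather than real model output.

```python
def r2_manual(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all actual values are identical")
    return 1 - ss_res / ss_tot

y_true = [10, 20, 30]
print(round(r2_manual(y_true, [8, 18, 35]), 3))  # 0.835
print(r2_manual(y_true, [20, 20, 20]))           # 0.0, always predicting the mean
print(r2_manual(y_true, [30, 20, 10]))           # -3.0, worse than predicting the mean
```

The second and third calls show the two reference points: a model that always predicts the mean scores 0, and a model that does worse than that scores below 0.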
Why R² is Popular
Advantages:
- Easy interpretation
- Scale-independent
- Useful for comparison
Limitation of R²
On the training data, R² never decreases when a new feature is added to a linear regression model.
Even useless features can increase R² slightly.
This creates a problem.
Adjusted R²
Adjusted R² solves this issue.
Formula:
$$R^2_{adj} = 1 - \frac{(1 - R^2)(n - 1)}{n - p - 1}$$
Where:
- $n$ = Number of observations
- $p$ = Number of features
Why Adjusted R² Matters
Adjusted R² penalizes unnecessary features.
Useful for:
- Multiple Linear Regression
- Feature selection
Example
Model A:
5 Features
Adjusted R² = 0.82
Model B:
20 Features
Adjusted R² = 0.78
Even though Model B has more features, Model A may actually be better.
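The penalty can be seen by holding R² fixed and varying only the number of features. In the sketch below, `adjusted_r2` is a hypothetical helper, and the sample size of 100 and the shared R² of 0.85 are assumed values chosen for illustration.

```python
def adjusted_r2(r2, n_samples, n_features):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    if n_samples - n_features - 1 <= 0:
        raise ValueError("need more observations than features + 1")
    return 1 - (1 - r2) * (n_samples - 1) / (n_samples - n_features - 1)

# Assumed setup: both models reach the same plain R^2 on 100 observations.
few_features = adjusted_r2(0.85, n_samples=100, n_features=5)
many_features = adjusted_r2(0.85, n_samples=100, n_features=20)
print(round(few_features, 3), round(many_features, 3))  # 0.842 0.812
```

With the same R², the model that needs 20 features to get there receives the lower adjusted score.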
Mean Absolute Percentage Error (MAPE)
MAPE expresses error as a percentage.
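Formula:
$$\text{MAPE} = \frac{100\%}{n}\sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right|$$
where $y_i$ is the actual value, $\hat{y}_i$ is the predicted value, and $n$ is the number of predictions.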
Example
Actual:
100
Predicted:
90
Percentage Error:
$$\frac{|100 - 90|}{100} \times 100\% = 10\%$$
Why Businesses Like MAPE
Easy interpretation.
Example:
MAPE = 5%
means:
Predictions are off by about 5% on average.
Limitation of MAPE
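Problems occur when:
$$\text{Actual} = 0$$
The percentage error then requires division by zero and is undefined. Actual values close to zero cause a related problem: they produce very large percentage errors that dominate the average.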
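A minimal sketch of MAPE that makes the zero problem explicit. `mape_score` is a hypothetical helper, and raising an error on zero actual values is one possible design choice assumed here; other implementations handle this case differently.

```python
def mape_score(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    if any(y == 0 for y in y_true):
        raise ValueError("MAPE is undefined when an actual value is zero")
    ratios = [abs((y - y_hat) / y) for y, y_hat in zip(y_true, y_pred)]
    return 100 * sum(ratios) / len(ratios)

print(round(mape_score([100], [90]), 2))            # 10.0
print(round(mape_score([100, 200], [90, 210]), 2))  # 7.5
```

Failing loudly on a zero actual value keeps an undefined result from silently turning into `inf` or `nan` in a report.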
Explained Variance Score
Measures how much variance is captured by the model.
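Formula:
$$\text{Explained Variance} = 1 - \frac{\mathrm{Var}(y - \hat{y})}{\mathrm{Var}(y)}$$
where $y$ denotes the actual values and $\hat{y}$ the predicted values.
It focuses on the variance of the errors rather than on exact prediction error, so a constant offset in the predictions does not lower the score.
Values closer to 1 are better.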
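The difference from R² is easiest to see with predictions that are consistently off by the same amount. The sketch below uses made-up data and hypothetical helper names; it computes both scores by hand so they can be compared.

```python
def variance(values):
    """Population variance (divide by n)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def explained_variance(y_true, y_pred):
    """1 - Var(actual - predicted) / Var(actual)."""
    residuals = [y - y_hat for y, y_hat in zip(y_true, y_pred)]
    return 1 - variance(residuals) / variance(y_true)

def r2_manual(y_true, y_pred):
    """1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1 - ss_res / ss_tot

y_true = [10, 20, 30]
biased_pred = [15, 25, 35]  # every prediction is exactly 5 too high

ev = explained_variance(y_true, biased_pred)
r2 = r2_manual(y_true, biased_pred)
print(ev)  # 1.0
print(r2)  # 0.625
```

The constant bias leaves the explained variance at 1.0 because the residuals do not vary, while R² drops because every prediction is still wrong. A gap between the two scores therefore points to systematic over- or under-prediction.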
Python Implementation
MAE
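```python
from sklearn.metrics import mean_absolute_error

# y_true: actual target values, y_pred: model predictions
# (array-likes of equal length, defined elsewhere)
mae = mean_absolute_error(y_true, y_pred)
```
MSE
```python
from sklearn.metrics import mean_squared_error

mse = mean_squared_error(y_true, y_pred)
```
RMSE
```python
import numpy as np
from sklearn.metrics import mean_squared_error

# RMSE is the square root of MSE
rmse = np.sqrt(mean_squared_error(y_true, y_pred))
```
R² Score
```python
from sklearn.metrics import r2_score

r2 = r2_score(y_true, y_pred)
```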
Example Comparison
Suppose:
Model A:
| Metric | Value |
|---|---|
| MAE | 3 |
| RMSE | 5 |
| R² | 0.85 |
Model B:
| Metric | Value |
|---|---|
| MAE | 7 |
| RMSE | 10 |
| R² | 0.60 |
Model A is clearly superior.
When to Use MAE
Choose MAE when:
- Interpretability matters
- Outliers should not dominate
- Average error is important
Examples:
- Sales Forecasting
- Demand Prediction
When to Use RMSE
Choose RMSE when:
- Large errors are costly
- Outliers matter
Examples:
- Medical Predictions
- Financial Forecasting
When to Use R²
Choose R² when:
- Comparing regression models
- Understanding explanatory power
Real-World Example
House Price Prediction:
Actual:
₹50 Lakhs
Predicted:
₹48 Lakhs
Error:
₹2 Lakhs
After evaluating thousands of houses:
| Metric | Value |
|---|---|
| MAE | ₹1.8 Lakhs |
| RMSE | ₹2.5 Lakhs |
| R² | 0.89 |
Interpretation:
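The model performs quite well: typical errors of ₹1.8 to ₹2.5 Lakhs are small relative to a ₹50 Lakh house, and the model explains 89% of the variance in prices.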
Common Mistakes
Using Only One Metric
No single metric tells the entire story.
Always evaluate multiple metrics.
Comparing Metrics Across Different Datasets
MAE and RMSE are expressed in the units of the target, so their values depend on its scale.
Compare them only between models evaluated on the same dataset.
Ignoring Business Context
A 5% error may be acceptable in sales forecasting but unacceptable in medical applications.
Best Practices
- Report MAE and RMSE together
- Use R² for model comparison
- Use Adjusted R² for multiple regression
- Understand business requirements
- Evaluate on unseen test data
Regression Evaluation Workflow
A typical workflow is:
- Train model
- Generate predictions
- Calculate residuals
- Compute MAE
- Compute RMSE
- Compute R²
- Compare models
- Select the best-performing model
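The evaluation part of this workflow can be sketched in plain Python. The example assumes the models are already trained and their predictions are available; it reuses the Model A and Model B house-price tables from the start of this article. The `evaluate` helper and the rule "lowest RMSE wins" are assumptions made for illustration, and in a real project the predictions would come from unseen test data.

```python
import math

def evaluate(y_true, y_pred):
    """Return MAE, RMSE and R^2 for one set of predictions."""
    n = len(y_true)
    residuals = [y - y_hat for y, y_hat in zip(y_true, y_pred)]
    mean_y = sum(y_true) / n
    ss_res = sum(r ** 2 for r in residuals)
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return {
        "mae": sum(abs(r) for r in residuals) / n,
        "rmse": math.sqrt(ss_res / n),
        "r2": 1 - ss_res / ss_tot,
    }

# Actual prices and each model's predictions, from the opening tables.
actual = [50, 80, 100]
predictions = {
    "Model A": [49, 81, 98],
    "Model B": [30, 120, 70],
}

results = {name: evaluate(actual, preds) for name, preds in predictions.items()}
for name, metrics in results.items():
    print(name, {key: round(value, 3) for key, value in metrics.items()})

# Selection rule assumed here: the model with the lowest RMSE wins.
best = min(results, key=lambda name: results[name]["rmse"])
print("Best model:", best)  # Best model: Model A
```

Model B ends up with a negative R², which confirms numerically what the tables suggested: its predictions are worse than simply guessing the average price.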
Summary of Regression Metrics
| Metric | Lower is Better? | Higher is Better? |
|---|---|---|
| MAE | Yes | No |
| MSE | Yes | No |
| RMSE | Yes | No |
| R² | No | Yes |
| Adjusted R² | No | Yes |
| MAPE | Yes | No |
Why Evaluation Metrics Matter
Building a regression model is only half the task. The other half is determining whether the model is actually useful. Evaluation metrics provide objective ways to measure performance, compare models, and identify areas for improvement.
Understanding MAE, MSE, RMSE, R², Adjusted R², and MAPE is essential because these metrics are used in nearly every real-world regression project. They help transform model predictions into meaningful performance insights and guide the development of better Machine Learning solutions.
In the next article, we will explore Polynomial Regression, which extends Linear Regression to handle non-linear relationships that cannot be captured by a simple straight line.