t-SNE (t-Distributed Stochastic Neighbor Embedding)

Last updated: Jun 16, 2026

Author: Christy Harshitha Dakarapu

In the previous article, we learned about PCA (Principal Component Analysis) and how it reduces dimensionality by preserving maximum variance.

However, PCA has a limitation:


Linear Relationships Only

Many real-world datasets contain:


Complex Non-Linear Patterns

that PCA may fail to capture.

To solve this problem, we use:


t-SNE

which is one of the most popular techniques for visualizing high-dimensional data.

What is t-SNE?

t-SNE (t-Distributed Stochastic Neighbor Embedding) is a nonlinear dimensionality reduction technique primarily used for visualizing high-dimensional data in two or three dimensions.

Its goal is:


Keep Similar Points Close

Keep Dissimilar Points Apart

while reducing dimensions.

Why Was t-SNE Created?

Imagine a dataset with:


100 Features

or even:


1000 Features

Humans cannot directly visualize data with more than three dimensions.

t-SNE helps transform:


100 Dimensions
       ↓
2 Dimensions

while preserving neighborhood relationships.

Real-Life Analogy

Imagine a group of friends.

People with similar interests naturally form groups.

Example:


Sports Fans

Movie Fans

Gamers

If we place everyone on a map:

Friends should stay close together.

Different groups should remain separated.

This is exactly what t-SNE tries to do.

The Core Idea

t-SNE focuses on:


Local Relationships

rather than global structure.

It tries to preserve:


Nearest Neighbors

from the original dataset.

Example

Suppose:


Student A

is most similar to:


Student B

and


Student C

In the lower-dimensional representation:


A, B, C

should remain close together.

Why PCA Sometimes Fails

Consider a dataset shaped like:


Spiral


Curved Manifold

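PCA projects the data onto straight-line directions (the principal components).

When the data lies along a curve, this projection can flatten the shape or place unrelated parts on top of each other, so important structure may be lost.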

Example

Original data:


Curved Shape

PCA:


Flattened Representation

t-SNE:


Preserves Groups

more effectively.
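
To make the comparison concrete, here is a minimal sketch that reduces a curved dataset with both methods and scores how well each map keeps the original neighbors. The Swiss roll dataset, the sample size, and the use of scikit-learn's trustworthiness score are illustrative assumptions, not part of the original example. On curved data like this t-SNE often scores higher, but that is not guaranteed for every dataset or setting.

```python
from sklearn.datasets import make_swiss_roll
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE, trustworthiness

# A 3D "Swiss roll": points lie on a flat sheet that has been rolled up.
X, _ = make_swiss_roll(n_samples=300, noise=0.05, random_state=0)

# Reduce the same data to 2D with both methods.
X_pca = PCA(n_components=2).fit_transform(X)
X_tsne = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)

# Trustworthiness lies in [0, 1]; higher means that points shown as
# neighbors in the 2D map were also neighbors in the original data.
score_pca = trustworthiness(X, X_pca, n_neighbors=10)
score_tsne = trustworthiness(X, X_tsne, n_neighbors=10)

print(f"PCA trustworthiness:   {score_pca:.3f}")
print(f"t-SNE trustworthiness: {score_tsne:.3f}")
```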

Understanding Similarity

t-SNE begins by measuring similarity between points.

Example:


Point A
Point B

Very close:


High Similarity

Far apart:


Low Similarity

Step 1: High-Dimensional Similarities

In the original space, t-SNE computes probabilities representing:


How Likely
Points Are Neighbors

Example

Pair	Similarity
A-B	0.80
A-C	0.15
A-D	0.05

A and B are highly similar.
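
The idea behind such a table can be sketched with NumPy. This is a simplified illustration, not the full algorithm: it uses one fixed Gaussian width (`sigma`) for every point, whereas t-SNE chooses a width per point from the perplexity setting. The function name `conditional_probabilities` and the four sample points are made up for this example, so the numbers will not match the table exactly.

```python
import numpy as np

def conditional_probabilities(X, sigma=1.0):
    """Return P where P[i, j] is the probability that i picks j as a neighbor."""
    # Squared Euclidean distance between every pair of rows.
    diff = X[:, None, :] - X[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    # Gaussian affinity: close points get values near 1, far points near 0.
    affinity = np.exp(-sq_dist / (2.0 * sigma ** 2))
    # A point is never counted as its own neighbor.
    np.fill_diagonal(affinity, 0.0)
    # Normalize each row so that its probabilities sum to 1.
    return affinity / affinity.sum(axis=1, keepdims=True)

# Four hypothetical points: B is near A, C is farther, D is farthest.
X = np.array([[0.0, 0.0],   # A
              [0.5, 0.0],   # B
              [1.5, 0.0],   # C
              [2.0, 0.0]])  # D

P = conditional_probabilities(X, sigma=1.0)
print(np.round(P[0], 2))  # similarities of A, B, C, D as seen from A
```

The closer a point is to A, the larger its share of A's probability, which is the pattern the table shows.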

Step 2: Low-Dimensional Mapping

Points are first placed at random positions in:


2D Space

or


3D Space

Step 3: Compare Similarities

t-SNE computes similarities between the points in the low-dimensional map and checks whether they match the neighbor relationships from the original space.

If not:


Move Points

closer or farther apart.
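
t-SNE measures the mismatch between the two sets of similarities with the Kullback-Leibler (KL) divergence, and moving points is how it lowers that number. The sketch below shows the measurement only. The helper `kl_divergence` and the two low-dimensional layouts (`q_good`, `q_bad`) are hypothetical values chosen for illustration.

```python
import numpy as np

def kl_divergence(P, Q, eps=1e-12):
    """KL(P || Q) for two probability arrays of the same shape."""
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    mask = P > 0  # terms where P is 0 contribute nothing
    return float(np.sum(P[mask] * np.log(P[mask] / np.maximum(Q[mask], eps))))

# High-dimensional similarities from A to B, C, D (the table in Step 1).
p = [0.80, 0.15, 0.05]

# Two hypothetical low-dimensional layouts.
q_good = [0.75, 0.18, 0.07]  # B is still A's closest neighbor
q_bad = [0.10, 0.30, 0.60]   # B has drifted far away from A

print("good layout:", kl_divergence(p, q_good))
print("bad layout: ", kl_divergence(p, q_bad))
```

The layout that keeps B next to A gives a much smaller divergence, so the algorithm would prefer it.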

Step 4: Repeat

The algorithm repeats this adjustment for many iterations, moving the points a little each time.

Goal:


Match Neighborhood Structure

between high-dimensional and low-dimensional spaces.
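
Steps 2 to 4 can be put together in a small NumPy sketch. It is deliberately simplified and is not how scikit-learn implements t-SNE: it uses one fixed Gaussian width instead of a perplexity search, plain gradient descent without momentum or early exaggeration, and hypothetical names (`joint_p`, `joint_q`, `tiny_tsne`) and settings (`learning_rate=10.0`, 300 iterations) chosen only for this toy data.

```python
import numpy as np

def joint_p(X, sigma=1.0):
    """Symmetric high-dimensional similarities (one fixed sigma, for simplicity)."""
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    affinity = np.exp(-sq / (2.0 * sigma ** 2))
    np.fill_diagonal(affinity, 0.0)
    cond = affinity / affinity.sum(axis=1, keepdims=True)  # p(j | i)
    return (cond + cond.T) / (2.0 * len(X))                # p_ij, sums to 1

def joint_q(Y):
    """Low-dimensional similarities from a Student-t kernel, plus the raw kernel."""
    sq = np.sum((Y[:, None, :] - Y[None, :, :]) ** 2, axis=-1)
    kernel = 1.0 / (1.0 + sq)
    np.fill_diagonal(kernel, 0.0)
    return kernel / kernel.sum(), kernel

def kl_divergence(P, Q):
    mask = P > 0
    return float(np.sum(P[mask] * np.log(P[mask] / Q[mask])))

def tiny_tsne(X, n_iter=300, learning_rate=10.0, seed=0):
    rng = np.random.default_rng(seed)
    P = joint_p(X)
    Y = rng.normal(scale=1e-2, size=(len(X), 2))   # Step 2: random 2D start
    history = []
    for _ in range(n_iter):                         # Step 4: repeat
        Q, kernel = joint_q(Y)                      # Step 3: compare P with Q
        history.append(kl_divergence(P, Q))
        # Gradient of KL(P || Q): 4 * sum_j (p_ij - q_ij) * kernel_ij * (y_i - y_j)
        coeff = (P - Q) * kernel
        grad = 4.0 * (coeff.sum(axis=1)[:, None] * Y - coeff @ Y)
        Y = Y - learning_rate * grad                # move points
    return Y, history

# Two hypothetical groups of 10 points each in 3D.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.3, size=(10, 3)),
               rng.normal(3.0, 0.3, size=(10, 3))])

Y, history = tiny_tsne(X)
print(f"KL at start: {history[0]:.3f}  KL at end: {history[-1]:.3f}")
```

The recorded divergence should fall as the loop runs, which is the "match neighborhood structure" goal expressed as a number.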

Why "t" in t-SNE?

The "t" refers to:


Student's t-Distribution

used in the low-dimensional space.

Why Use a t-Distribution?

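The original SNE algorithm, the predecessor of t-SNE, suffered from:


Crowding Problem

A 2D map does not have enough room to keep all moderately distant points at suitable distances, so points became compressed together.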

The t-distribution helps:


Spread Clusters Apart

making visualization clearer.
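
The effect comes from the heavy tail of the t-distribution, which a few numbers make visible. This sketch compares the Gaussian kernel that SNE used in the map with the Student-t kernel (one degree of freedom) that t-SNE uses. The sample distances are arbitrary.

```python
import numpy as np

distances = np.array([0.5, 1.0, 2.0, 4.0])

gaussian = np.exp(-distances ** 2)         # map kernel in the original SNE
student_t = 1.0 / (1.0 + distances ** 2)   # map kernel in t-SNE

for d, g, t in zip(distances, gaussian, student_t):
    print(f"distance={d:.1f}  gaussian={g:.6f}  student_t={t:.6f}")
```

At larger distances the Student-t value stays far above the Gaussian one. A pair of moderately similar points can therefore sit much farther apart in the map and still keep the same similarity, which frees up space between clusters.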

What Does t-SNE Output Look Like?

Suppose we have 28 × 28 pixel images of handwritten digits.

Original dimensions:


784 Features

After t-SNE:


2 Features

Visualization may show:


Cluster of 0s

Cluster of 1s

Cluster of 2s

and so on.
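
A small runnable version of this idea is sketched below. As an assumption for convenience, it uses scikit-learn's built-in digits dataset, whose 8 × 8 images have 64 features rather than 784, and only the first 500 samples to keep it fast.

```python
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE

# 8x8 digit images flattened to 64 features (a small stand-in for 784-pixel images).
digits = load_digits()
X, y = digits.data[:500], digits.target[:500]

# Reduce 64 features to 2 features per image.
X_2d = TSNE(n_components=2, perplexity=30, random_state=42).fit_transform(X)

print(X.shape, "->", X_2d.shape)  # (500, 64) -> (500, 2)
```

Passing the labels as colors, for example `plt.scatter(X_2d[:, 0], X_2d[:, 1], c=y)`, makes it easy to see whether images of the same digit end up together.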

Example: Customer Segmentation

Features:

Age
Income
Spending Score
Purchase History

t-SNE can reveal:


Natural Customer Groups

visually.

Example: Face Recognition

Thousands of pixel features become:


2D Visualization

showing similar faces grouped together.

Example: Bioinformatics

Gene expression datasets often contain:


Thousands of Dimensions

t-SNE helps visualize biological clusters.

Important Parameter: Perplexity

Perplexity controls:


Neighborhood Size

It can be read as roughly the number of close neighbors each point takes into account.

Typical values:


5 to 50

Small Perplexity

Focuses on:


Very Local Structure

Large Perplexity

Captures:


Broader Relationships
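
Perplexity has a precise meaning: it is 2 raised to the entropy of a point's neighbor distribution. The sketch below computes it for one point under different Gaussian widths. The helper names and the eight squared distances are hypothetical. In t-SNE the direction is reversed: you give the perplexity, and the algorithm searches for the width that produces it for each point.

```python
import numpy as np

def perplexity(p):
    """Perplexity 2**H(p) of a discrete probability distribution p."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    entropy = -np.sum(p * np.log2(p))
    return float(2.0 ** entropy)

def neighbor_distribution(sq_dist, sigma):
    """Gaussian neighbor probabilities for one point, given squared distances."""
    w = np.exp(-sq_dist / (2.0 * sigma ** 2))
    return w / w.sum()

# Squared distances from one point to 8 hypothetical neighbors.
sq_dist = np.array([0.1, 0.2, 0.5, 1.0, 2.0, 4.0, 8.0, 16.0])

for sigma in (0.3, 1.0, 3.0):
    p = neighbor_distribution(sq_dist, sigma)
    print(f"sigma={sigma}: perplexity={perplexity(p):.2f}")
```

A narrow width concentrates the probability on the nearest one or two points (small perplexity), while a wide one spreads it over many points (large perplexity).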

Advantages of t-SNE

Excellent Visualization

One of its biggest strengths.

Captures Nonlinear Structure

Handles complex patterns.

Preserves Local Relationships

Keeps neighbors together.

Reveals Hidden Clusters

Useful for exploratory analysis.

Limitations of t-SNE

Computationally Expensive

Slow on very large datasets.

Primarily a Visualization Tool

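Not usually used as a preprocessing step for predictive models. For example, scikit-learn's TSNE has no transform method for mapping new, unseen data into an existing embedding.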

Results Can Vary

Different runs may produce different layouts.
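
The sketch below shows this behavior on a small synthetic dataset, and how fixing `random_state` makes a run repeatable. The dataset, the seeds, and the choice of `init="random"` and `method="exact"` are assumptions made to keep the example small and the effect easy to see.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.manifold import TSNE

X, _ = make_blobs(n_samples=100, n_features=10, centers=3, random_state=0)

def embed(seed):
    # init="random" makes the layout depend directly on the seed;
    # method="exact" avoids approximations on this small dataset.
    tsne = TSNE(n_components=2, perplexity=20, init="random",
                method="exact", random_state=seed)
    return tsne.fit_transform(X)

same_a, same_b = embed(seed=0), embed(seed=0)
other = embed(seed=1)

print("same seed, same layout:     ", np.allclose(same_a, same_b))
print("different seed, same layout:", np.allclose(same_a, other))
```

Even when the coordinates differ between seeds, the same groups usually appear. What changes is mostly their position and orientation in the plot.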

Difficult to Interpret Distances

Distances between clusters, and the sizes of clusters in the plot, are not always meaningful.

Important Warning

Many beginners assume:


Large Gap Between Clusters
=
Large Real Difference

This is not always true.

t-SNE prioritizes local neighborhoods, not exact global distances.

PCA vs t-SNE

Feature	PCA	t-SNE
Type	Linear	Nonlinear
Speed	Faster	Slower
Visualization	Good	Excellent
Local Structure	Moderate	Excellent
Large Datasets	Better	More Expensive
Interpretability	Higher	Lower

Example

Dataset:


100 Features

PCA:


Captures Variance

t-SNE:


Captures Neighborhoods
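
The difference in goals also shows in what each scikit-learn model reports after fitting. This is a sketch on synthetic data: `make_classification` and its settings are stand-ins for a real 100-feature dataset.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

# Synthetic stand-in for a dataset with 100 features.
X, _ = make_classification(n_samples=300, n_features=100,
                           n_informative=10, random_state=0)

# PCA reports the share of total variance kept by each component.
pca = PCA(n_components=2).fit(X)
print("PCA explained variance ratio:", pca.explained_variance_ratio_)

# t-SNE reports the final KL divergence between the neighbor
# distributions of the original data and of the 2D map.
tsne = TSNE(n_components=2, perplexity=30, random_state=0)
X_tsne = tsne.fit_transform(X)
print("t-SNE KL divergence:", tsne.kl_divergence_)
```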

Python Implementation

Import:


from sklearn.manifold import TSNE

Create Model:


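tsne = TSNE(
    n_components=2,   # number of output dimensions (2 for a flat scatter plot)
    perplexity=30,    # neighborhood size; must be smaller than the number of samples
    random_state=42   # fixed seed so repeated runs give the same layout
)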

Transform Data:


# X is your data as an array of shape (n_samples, n_features)
X_tsne = tsne.fit_transform(X)

Visualize:


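import matplotlib.pyplot as plt

# Each row of X_tsne holds the 2D coordinates of one sample
plt.scatter(
    X_tsne[:, 0],
    X_tsne[:, 1]
)
plt.xlabel("t-SNE dimension 1")
plt.ylabel("t-SNE dimension 2")
plt.show()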

Common Mistakes

Using t-SNE for Feature Selection

t-SNE is mainly for visualization. Its output axes are not the original features, so it cannot tell you which features matter.

Interpreting Global Distances

The gap between far-apart clusters in the plot may not reflect their true distance in the original data.

Using Default Parameters Blindly

Perplexity significantly affects results.

Applying to Massive Datasets

Running time and memory use grow quickly with the number of samples, so computation can become slow.

Best Practices

Standardize data before t-SNE
Use PCA first for very high-dimensional data
Experiment with perplexity values
Focus on cluster structure rather than exact distances
Use t-SNE mainly for visualization
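
The first three practices can be combined in a short scikit-learn sketch. The random placeholder data, the choice of 50 PCA components, and the three perplexity values are assumptions for illustration. Replace them with your own data and settings.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 200))  # placeholder data: 300 samples, 200 features

# 1. Standardize so every feature has mean 0 and unit variance.
X_scaled = StandardScaler().fit_transform(X)

# 2. Use PCA first to bring very high-dimensional data down to 50 dimensions.
X_reduced = PCA(n_components=50, random_state=0).fit_transform(X_scaled)

# 3. Try several perplexity values and keep each embedding for comparison.
embeddings = {}
for perplexity in (5, 30, 50):
    tsne = TSNE(n_components=2, perplexity=perplexity, random_state=0)
    embeddings[perplexity] = tsne.fit_transform(X_reduced)

for perplexity, emb in embeddings.items():
    print(perplexity, emb.shape)
```

Plotting the embeddings side by side shows which groupings are stable across perplexity values and which appear only at one setting.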

t-SNE Workflow


High-Dimensional Data
          ↓
Compute Similarities
          ↓
Map to 2D/3D
          ↓
Preserve Neighbors
          ↓
Visualization

t-SNE Summary

Concept	Meaning
t-SNE	Nonlinear Dimensionality Reduction
Goal	Preserve Neighborhoods
Output	2D or 3D Visualization
Perplexity	Neighborhood Size
Strength	Cluster Visualization

Why t-SNE is Important

t-SNE revolutionized the visualization of high-dimensional data by allowing complex structures to be displayed in two or three dimensions while preserving local relationships. It is widely used in machine learning, bioinformatics, image analysis, and exploratory data analysis to uncover hidden patterns and clusters.

Unlike PCA, which focuses on preserving variance, t-SNE focuses on preserving neighborhood relationships, making it one of the most effective tools for understanding complex datasets visually.