Self Join in MySQL
Introduction
Tables containing parent-child relationships need special join syntax.
Self Join joins a table to itself using table aliases.
It is widely used for:
- Hierarchical data (org charts)
- Employee-manager relationships
- Bill of materials
- Category trees
What is Self Join
Self Join treats same table as two logical tables using aliases.
Requires table aliases (cannot join table to itself without).
Basic Syntax
sql
SELECT t1.column, t2.column
FROM table AS t1
INNER JOIN table AS t2 ON t1.key = t2.key;
Example
sql
SELECT e1.name as employee, e2.name as manager
FROM Employees e1
INNER JOIN Employees e2 ON e1.manager_id = e2.id;
Shows each employee with their manager's name.
Self Join with WHERE
Filter hierarchical relationships.
sql
SELECT e1.name, e1.salary, e2.name as manager
FROM Employees e1
INNER JOIN Employees e2 ON e1.manager_id = e2.id
WHERE e1.salary > 50000;
Multiple Self Joins
Multi-level hierarchy.
sql
SELECT
e1.name as employee,
e2.name as manager,
e3.name as director
FROM Employees e1
INNER JOIN Employees e2 ON e1.manager_id = e2.id
INNER JOIN Employees e3 ON e2.manager_id = e3.id;
Self Join with LEFT JOIN
Include employees without managers.
sql
SELECT e1.name, COALESCE(e2.name, 'No Manager') as manager
FROM Employees e1
LEFT JOIN Employees e2 ON e1.manager_id = e2.id;
Self Join with ORDER BY
Hierarchical sorting.
sql
SELECT e1.name, e2.name as manager
FROM Employees e1
INNER JOIN Employees e2 ON e1.manager_id = e2.id
ORDER BY e2.name, e1.name;
Self Join with GROUP BY
Manager headcount reports.
sql
SELECT e2.name as manager, COUNT(e1.id) as team_size
FROM Employees e1
INNER JOIN Employees e2 ON e1.manager_id = e2.id
GROUP BY e2.id, e2.name
ORDER BY team_size DESC;
Category Tree Self Join
Parent-child categories.
sql
SELECT p.category_name as parent, c.category_name as child
FROM Categories c
INNER JOIN Categories p ON c.parent_id = p.id
WHERE p.status = 'Active';
Bill of Materials Self-join
Assembly hierarchies.
sql
SELECT parent.part_name, child.part_name, child.quantity
FROM Parts child
INNER JOIN Parts parent ON child.parent_part_id = parent.id;
Self Join Multiple Conditions
Complex relationship matching.
sql
SELECT e1.name, e1.department, e2.name
FROM Employees e1
INNER JOIN Employees e2 ON e1.manager_id = e2.id
AND e1.department = e2.department;
Performance with Self Join
Index foreign key columns.
sql
-- Critical indexes
CREATE INDEX idx_manager_id ON Employees(manager_id);
CREATE INDEX idx_id ON Employees(id);
Self Join Execution Order
FROM → SELF JOIN → ON → WHERE → GROUP BY → HAVING → SELECT → ORDER BY → LIMIT
Performance Considerations
- Index self-join columns (both sides)
- LEFT JOIN for optional parents
- Avoid functions on join columns
- Large hierarchies need careful planning
Important Notes
- Table aliases MANDATORY
- Same table joined multiple times needs unique aliases
- Cycles create infinite loops (use CTEs for deep hierarchies)
- Works with INNER, LEFT, RIGHT joins
- Common for normalized hierarchical data
Example Scenario
Complete employee hierarchy report:
sql
SELECT
e1.name as employee,
e1.position,
e2.name as manager,
e2.position as manager_position,
COUNT(e3.id) as direct_reports
FROM Employees e1
INNER JOIN Employees e2 ON e1.manager_id = e2.id
LEFT JOIN Employees e3 ON e1.id = e3.manager_id
GROUP BY e1.id, e1.name, e1.position, e2.name, e2.position
ORDER BY e2.name, e1.name;
Common Mistakes
- Forgetting table aliases
- Non-indexed foreign key columns
- Circular references (infinite loops)
- Functions preventing index usage
- Wrong join type for hierarchy
Key Points to Remember
- Self Join requires table aliases
- Perfect for hierarchical relationships
- Index foreign key columns always
- LEFT JOIN for optional parents
- Multiple self joins for deep hierarchies
- Common patterns: employee-manager, categories