UNION vs UNION ALL in MySQL
Introduction
Combining query results has two flavors with major performance differences.
UNION removes duplicates while UNION ALL keeps all rows.
They are widely used for:
- Data merging with/without deduplication
- Performance optimization
- Report generation
What is UNION vs UNION ALL
UNION = Combine + Remove duplicates
UNION ALL = Combine + Keep all rows (faster)
Basic Syntax Comparison
sql
-- UNION (removes duplicates)
SELECT name FROM Students
UNION
SELECT name FROM Graduates;
-- UNION ALL (keeps duplicates)
SELECT name FROM Students
UNION ALL
SELECT name FROM Graduates;
Example
sql
-- Students table: Amit, Rahul, Amit
-- Graduates table: Rahul, Priya
-- UNION result: Amit, Rahul, Priya (3 rows)
-- UNION ALL result: Amit, Rahul, Amit, Rahul, Priya (5 rows)
Performance Difference
UNION ALL is 2-10x faster than UNION.
sql
-- UNION: Sort + Deduplicate (expensive)
SELECT * FROM Table1 UNION SELECT * FROM Table2;
-- UNION ALL: Direct concatenate (fast)
SELECT * FROM Table1 UNION ALL SELECT * FROM Table2;
When to Use UNION
Deduplication required.
sql
SELECT DISTINCT city FROM Students
UNION
SELECT DISTINCT city FROM Teachers;
When to Use UNION ALL
Duplicates OK or performance critical.
sql
-- Log analysis (duplicates expected)
SELECT event, timestamp FROM TodayLogs
UNION ALL
SELECT event, timestamp FROM YesterdayLogs;
UNION ALL with ORDER BY
Sort applies to the combined result.
sql
SELECT name, marks FROM Students WHERE marks > 90
UNION ALL
SELECT name, score FROM Graduates WHERE score > 90
ORDER BY marks DESC;
Memory Usage Comparison
| Operator | Sort Buffer | Memory | Speed |
| UNION | Yes | High | Slow |
| UNION ALL | No | Low | Fast |
UNION with Multiple Queries
Mixing behaviours.
sql
SELECT name FROM Students
UNION ALL
SELECT name FROM Graduates
UNION
SELECT name FROM Alumni;
UNION in Subqueries
Performance matters in nested queries.
sql
SELECT * FROM Reports
WHERE category IN (
SELECT category FROM ActiveCats
UNION ALL
SELECT category FROM SpecialCats
);
UNION with LIMIT
Limit the final combined result.
sql
SELECT * FROM Students WHERE marks > 90
UNION ALL
SELECT * FROM Graduates WHERE score > 90
ORDER BY marks DESC
LIMIT 10;
Execution Order
UNION: Each SELECT → Sort/Dedupe → Combine → ORDER BY → LIMIT
UNION ALL: Each SELECT → Direct Combine → ORDER BY → LIMIT
Performance Considerations
- 90% cases: Use UNION ALL unless deduplication is needed
- Large datasets: UNION may spill to disk
- Indexes work on individual SELECTs
- UNION ALL avoids sort buffer usage
Important Notes
- UNION = UNION DISTINCT
- UNION ALL never removes duplicates
- ORDER BY applies to the final result only
- Column count/types must match
- UNION is slower even with no duplicates
Example Scenario
Daily sales report (duplicates expected):
sql
SELECT 'Online' as source, product, quantity, sale_date
FROM OnlineSales WHERE sale_date = CURDATE()
UNION ALL
SELECT 'Store' as source, product, quantity, sale_date
FROM StoreSales WHERE sale_date = CURDATE()
ORDER BY product, quantity DESC;
Common Mistakes
- Using UNION when UNION ALL is sufficient (slow)
- Expecting UNION ALL to deduplicate
- ORDER BY in individual SELECTs (ignored)
- Large tables with unnecessary UNION
- Mismatched column types
Key Points to Remember
- UNION ALL faster (no deduplication)
- UNION removes duplicates (sort overhead)
- Use UNION ALL unless duplicates are harmful
- ORDER BY on the final result only
- Same column requirements for both
- Performance difference grows with data size