Advanced Subqueries in MySQL
In the previous article, you learned the basics of subqueries:
- Single-row subqueries
- Multi-row subqueries
- IN and EXISTS
- Subqueries in WHERE
Now we move to advanced usage.
These concepts are slightly more complex but very important for real-world SQL.
All examples work in MySQL.
1. Subqueries in SELECT Clause
A subquery can be written inside the SELECT statement to create calculated columns.
Example
Display each student along with the overall average age.
SELECT name,
(SELECT AVG(age) FROM Students) AS average_age
FROM Students;
What happens:
- The inner query calculates the average age.
- The outer query shows each student.
- The calculated value appears in every row.
This type of subquery is useful for adding calculated values to results.
2. Subqueries in FROM Clause (Derived Tables)
A subquery inside the FROM clause creates a temporary result set.
This is called a derived table.
Important rule:
An alias is mandatory.
Example
Find departments where the average age is greater than 20.
SELECT dept_id, avg_age
FROM
(SELECT dept_id, AVG(age) AS avg_age
FROM Students
GROUP BY dept_id) AS dept_average
WHERE avg_age > 20;
Execution flow:
- Inner query groups students by department.
- Outer query filters the grouped results.
Derived tables are useful when:
- You need to filter aggregated results.
- You want to break complex logic into steps.
3. Correlated Subqueries
A correlated subquery depends on values from the outer query.
It runs once for each row of the outer query.
Example
Find students whose age is greater than the average age of their department.
SELECT s.name, s.age, s.dept_id
FROM Students s
WHERE s.age >
(SELECT AVG(age)
FROM Students
WHERE dept_id = s.dept_id);
Explanation:
- For each student row,
- The inner query calculates the average age for that student’s department.
- The comparison is performed.
Because it runs repeatedly, correlated subqueries can be slower.
4. EXISTS vs IN
You already learned the basic difference.
Now let’s look at practical behavior.
Using IN
SELECT name
FROM Students
WHERE dept_id IN
(SELECT dept_id FROM Departments);
IN:
- Retrieves all dept_id values first.
- Then compares.
Using EXISTS
SELECT name
FROM Students s
WHERE EXISTS
(SELECT 1
FROM Departments d
WHERE d.dept_id = s.dept_id);
EXISTS:
- Checks row by row.
- Stops when a match is found.
General guideline:
- Use IN for small datasets.
- Use EXISTS for large datasets.
- EXISTS is often better with correlated subqueries.
5. Join vs Subquery
Both can produce similar results.
Using Subquery
SELECT name
FROM Students
WHERE dept_id IN
(SELECT dept_id
FROM Departments
WHERE dept_name = 'Computer');
Using JOIN
SELECT Students. name
FROM Students
INNER JOIN Departments
ON Students.dept_id = Departments.dept_id
WHERE Departments.dept_name = 'Computer';
Comparison:
| Subquery | Join |
| Easier for logical thinking | Better for relational queries |
| Can be slower | Usually faster |
| Good for filtering | Good for combining tables |
In most real-world relational cases, JOIN is preferred.
6. When NOT to Use Subqueries
Avoid subqueries when:
- A JOIN can perform the same task more efficiently.
- The query becomes deeply nested and hard to read.
- Performance is critical,l anthe d data size is large.
- Correlated subqueries cause repeated execution.
Overusing subqueries makes code difficult to maintain.
7. Common Advanced Mistakes
- Forgetting the alias in derived tables.
- Writing correlated subqueries unnecessarily.
- Nesting too many levels of subqueries.
- Ignoring performance impact.
- Confusing correlated and non-correlated subqueries.
Final Summary
In this article, you learned:
- Subqueries in SELECT
- Subqueries in FROM (Derived Tables)
- Correlated subqueries
- EXISTS vs IN (performance view)
- Join vs Subquery
- When not to use subqueries