Advanced Subqueries in MySQL

In the previous article, you learned the basics of subqueries:

  • Single-row subqueries
  • Multi-row subqueries
  • IN and EXISTS
  • Subqueries in WHERE

Now we move to advanced usage.

These concepts are slightly more complex but very important for real-world SQL.

All examples work in MySQL.


1. Subqueries in SELECT Clause

A subquery can be written inside the SELECT statement to create calculated columns.

Example

Display each student along with the overall average age.

SELECT name,

       (SELECT AVG(age) FROM Students) AS average_age

FROM Students;

What happens:

  1. The inner query calculates the average age.
  2. The outer query shows each student.
  3. The calculated value appears in every row.

This type of subquery is useful for adding calculated values to results.


2. Subqueries in FROM Clause (Derived Tables)

A subquery inside the FROM clause creates a temporary result set.

This is called a derived table.

Important rule:
An alias is mandatory.


Example

Find departments where the average age is greater than 20.

SELECT dept_id, avg_age

FROM

    (SELECT dept_id, AVG(age) AS avg_age

     FROM Students

     GROUP BY dept_id) AS dept_average

WHERE avg_age > 20;

Execution flow:

  1. Inner query groups students by department.
  2. Outer query filters the grouped results.

Derived tables are useful when:

  • You need to filter aggregated results.
  • You want to break complex logic into steps.

3. Correlated Subqueries

A correlated subquery depends on values from the outer query.

It runs once for each row of the outer query.


Example

Find students whose age is greater than the average age of their department.

SELECT s.name, s.age, s.dept_id

FROM Students s

WHERE s.age >

    (SELECT AVG(age)

     FROM Students

     WHERE dept_id = s.dept_id);

Explanation:

  • For each student row,
  • The inner query calculates the average age for that student’s department.
  • The comparison is performed.

Because it runs repeatedly, correlated subqueries can be slower.


4. EXISTS vs IN

You already learned the basic difference.

Now let’s look at practical behavior.

Using IN

SELECT name

FROM Students

WHERE dept_id IN

    (SELECT dept_id FROM Departments);

IN:

  • Retrieves all dept_id values first.
  • Then compares.

Using EXISTS

SELECT name

FROM Students s

WHERE EXISTS

    (SELECT 1

     FROM Departments d

     WHERE d.dept_id = s.dept_id);

EXISTS:

  • Checks row by row.
  • Stops when a match is found.

General guideline:

  • Use IN for small datasets.
  • Use EXISTS for large datasets.
  • EXISTS is often better with correlated subqueries.

5. Join vs Subquery

Both can produce similar results.

Using Subquery

SELECT name

FROM Students

WHERE dept_id IN

    (SELECT dept_id

     FROM Departments

     WHERE dept_name = 'Computer');

Using JOIN

SELECT Students. name

FROM Students

INNER JOIN Departments

ON Students.dept_id = Departments.dept_id

WHERE Departments.dept_name = 'Computer';

Comparison:

Subquery

Join

Easier for logical thinking

Better for relational queries

Can be slower

Usually faster

Good for filtering

Good for combining tables

In most real-world relational cases, JOIN is preferred.


6. When NOT to Use Subqueries

Avoid subqueries when:

  1. A JOIN can perform the same task more efficiently.
  2. The query becomes deeply nested and hard to read.
  3. Performance is critical,l anthe d data size is large.
  4. Correlated subqueries cause repeated execution.

Overusing subqueries makes code difficult to maintain.


7. Common Advanced Mistakes

  1. Forgetting the alias in derived tables.
  2. Writing correlated subqueries unnecessarily.
  3. Nesting too many levels of subqueries.
  4. Ignoring performance impact.
  5. Confusing correlated and non-correlated subqueries.

Final Summary

In this article, you learned:

  • Subqueries in SELECT
  • Subqueries in FROM (Derived Tables)
  • Correlated subqueries
  • EXISTS vs IN (performance view)
  • Join vs Subquery
  • When not to use subqueries