GROUP BY Clause in MySQL

Introduction
Grouping data helps analyse patterns and summaries from large datasets.
The GROUP BY clause in MySQL groups rows with identical values into summary rows.
It is widely used for:

  • Aggregate calculations (COUNT, SUM, AVG, MAX, MIN)
  • Data analysis and reporting
  • Dashboard statistics

What is the GROUP BY Clause
The GROUP BY clause groups rows that have the same values in specified columns into summary rows.
Aggregate functions like COUNT, SUM, and AVG are applied to each group.

Basic Syntax

sql

SELECT column_name, AGGREGATE_FUNCTION(column

FROM table_name 

GROUP BY column_name; 

Example

sql

SELECT city, COUNT(*) as student_count 

FROM Students 

GROUP BY city; 

This returns the number of students per city.

GROUP BY with Multiple Columns
Groups by a combination of columns.

sql

SELECT city, department, AVG(marks) as avg_marks 

FROM Students 

GROUP BY city, department; 

Groups by city AND department combination.

Common Aggregate Functions

  • COUNT(*): Total rows in group
  • SUM(column): Total of numeric column
  • AVG(column): Average of numeric column
  • MAX(column): Highest value
  • MIN(column): Lowest value

Example with Multiple Aggregates

sql

SELECT city, 

       COUNT(*) as total_students, 

       AVG(marks) as avg_marks, 

       MAX(marks) as highest_mark 

FROM Students 

GROUP BY city; 

GROUP BY with WHERE Clause
WHERE filters before grouping.

sql

SELECT city, COUNT(*) as count 

FROM Students 

WHERE marks > 80 

GROUP BY city; 

GROUP BY with ORDER BY
Sort grouped results.

sql

SELECT city, AVG(marks) as avg_marks 

FROM Students 

GROUP BY city 

ORDER BY avg_marks DESC

GROUP BY with JOIN
Group joined table results.

sql

SELECT Courses.course_name, COUNT(Students.id) as enrollment 

FROM Students 

JOIN Courses ON Students.course_id = Courses.id 

GROUP BY Courses.course_name; 

GROUP BY with Subqueries
Use grouped results in a subquery.

sql

SELECT * FROM

    SELECT city, AVG(marks) as avg_marks 

    FROM Students 

    GROUP BY city 

) AS city_stats 

WHERE avg_marks > 75; 

GROUP BY with ROLLUP
ROLLUP adds subtotals and totals.

sql

SELECT city, department, AVG(marks) 

FROM Students 

GROUP BY city, department WITH ROLLUP

Shows city subtotals, department subtotals, and total.

Execution Order with GROUP BY
GROUP BY is applied after:
FROM
WHERE
GROUP BY
HAVING
SELECT
ORDER BY
LIMIT

Performance Considerations

  • Index GROUP BY columns for faster grouping
  • Avoid GROUP BY on large text columns
  • Use WHERE to reduce rows before grouping
  • Temporary tables created for complex GROUP BY

Important Notes

  • All non-aggregate columns must appear in GROUP BY
  • HAVING filter groups (after GROUP BY)
  • ORDER BY can reference aliases
  • NULL values form their own group
  • MySQL allows SELECT * with GROUP BY (uses arbitrary row)

Example Scenario
Get average marks per city with total students:

sql

SELECT city, 

       COUNT(*) as total_students, 

       ROUND(AVG(marks), 2) as avg_marks 

FROM Students 

GROUP BY city 

ORDER BY avg_marks DESC

Common Mistakes

  • Forgetting GROUP BY for aggregate functions
  • Using WHERE instead of HAVING for aggregates
  • Non-grouped columns in SELECT without aggregate
  • GROUP BY before WHERE (wrong execution order)
  • Large datasets without proper indexing

Key Points to Remember

  • GROUP BY groups identical values into summary rows
  • Aggregate functions work on each group
  • WHERE filters before grouping, HAVING filters after
  • Multiple columns: groups by complete combination
  • ROLLUP adds subtotals and grand totals
  • Index GROUP BY columns for performance