← Back to Lessons
INTERMEDIATE ⏱ 32 minutes

Aggregate Functions - Analyzing Your Data

Up until now, you've been working with individual rows: "Show me Alice's GPA" or "Find students in Biology." But what if you want to answer bigger questions like "What's the average GPA across all students?" or "How many students are enrolled?" That's where aggregate functions come in! They take many rows and collapse them into a single summary value. Think of them as your data analysis powerhouse!


πŸ“š What You'll Learn

1. What Are Aggregate Functions?

Aggregate functions perform calculations across multiple rows and return a single result. They "aggregate" (combine) data into summaries.

πŸ’‘ Real-World Example: Imagine you're a teacher with 100 students. Instead of looking at each student's grade individually, you want to know: "What's the class average?" or "What's the highest score?" or "How many students passed?" Aggregate functions answer these "big picture" questions by processing all rows at once and giving you one answer.

Visual Example: From Many Rows to One Summary

Input: Individual Rows

name gpa
Alice3.8
Bob3.5
Carol3.9
Diana3.2

Output: Single Summary

SELECT AVG(gpa) 
FROM students;
AVG(gpa)
3.6

See the magic? Four rows collapsed into one number: the average GPA!

2. The Big Five Aggregate Functions

SQL has five essential aggregate functions you'll use constantly:

Function What It Does Example Question
COUNT() Counts rows "How many students?"
SUM() Adds numbers "What's total sales?"
AVG() Calculates average "What's average GPA?"
MIN() Finds smallest value "What's lowest price?"
MAX() Finds largest value "What's highest score?"

3. COUNT() - How Many Rows?

COUNT() is the most commonly used aggregate function. It counts rows.

COUNT(*) - Count All Rows

-- How many students are in the database?
SELECT COUNT(*) AS total_students
FROM students;

Result: If there are 150 students, you get: total_students: 150

πŸ’‘ What COUNT(*) Means: The asterisk (*) means "count every row, regardless of content." Even if a row has NULL values in every column, COUNT(*) still counts it because the row exists.

COUNT(column) - Count Non-NULL Values

-- How many students provided a phone number?
SELECT COUNT(phone) AS students_with_phone
FROM students;

This counts only rows where phone is NOT NULL.

The Critical Difference

COUNT(*)

SELECT COUNT(*) 
FROM students;

Result: 4 (all rows)

COUNT(column)

SELECT COUNT(phone) 
FROM students;

Result: 3 (Bob has NULL phone)

⚠️ Critical Point: COUNT(*) and COUNT(column) give DIFFERENT results! This is one of the most common interview questions. COUNT(*) counts rows, COUNT(column) counts non-NULL values in that column.

COUNT(DISTINCT column) - Count Unique Values

-- How many different majors do we have?
SELECT COUNT(DISTINCT major) AS unique_majors
FROM students;

If 100 students are divided among 5 majors, this returns 5 (not 100).

4. SUM() - What's the Total?

SUM() adds up all values in a numeric column.

-- What are our total sales?
SELECT SUM(sales_amount) AS total_sales
FROM orders;

Example with Sample Data:

Order Data

order_id amount
1$250
2$175
3$500
4$325

SUM Result

SELECT SUM(amount) 
FROM orders;
SUM(amount)
$1,250

250 + 175 + 500 + 325 = 1,250

⚠️ Important: SUM() only works with numeric columns! Trying to SUM() text like names will cause an error.

5. AVG() - What's the Average?

AVG() calculates the average (mean) of numeric values.

-- What's the average GPA?
SELECT AVG(gpa) AS average_gpa
FROM students;

Example Calculation:

GPAs: 3.8, 3.5, 3.9, 3.2
Calculation: (3.8 + 3.5 + 3.9 + 3.2) Γ· 4 = 14.4 Γ· 4 = 3.6

How NULL Affects AVG()

AVG() ignores NULL values completely:

With NULL

GPAs: 3.8, 3.5, NULL, 3.9

AVG(): (3.8 + 3.5 + 3.9) Γ· 3 = 3.73

NULL is excluded from both sum and count!

Without NULL

GPAs: 3.8, 3.5, 3.2, 3.9

AVG(): (3.8 + 3.5 + 3.2 + 3.9) Γ· 4 = 3.6

All values included

6. MIN() and MAX() - Finding Extremes

MIN() finds the smallest value, MAX() finds the largest value.

Finding Minimum Values

-- What's the lowest GPA?
SELECT MIN(gpa) AS lowest_gpa
FROM students;

Finding Maximum Values

-- What's the highest price?
SELECT MAX(price) AS highest_price
FROM products;

MIN/MAX Work with Text Too!

-- First student alphabetically
SELECT MIN(name) AS first_name_alphabetically
FROM students;

-- Last student alphabetically
SELECT MAX(name) AS last_name_alphabetically
FROM students;

With names like Alice, Bob, Carol, Diana:
MIN(name) = "Alice" (first alphabetically)
MAX(name) = "Diana" (last alphabetically)

πŸ’‘ Pro Tip: MIN() and MAX() also ignore NULL values, just like SUM() and AVG().

7. Combining Multiple Aggregates in One Query

You can use multiple aggregate functions in a single SELECT statement:

-- Get complete GPA statistics
SELECT 
    COUNT(*) AS total_students,
    AVG(gpa) AS average_gpa,
    MIN(gpa) AS lowest_gpa,
    MAX(gpa) AS highest_gpa
FROM students;

Result:

total_students average_gpa lowest_gpa highest_gpa
150 3.45 2.1 4.0

One query, complete statistical summary! This is incredibly powerful for dashboards and reports.

8. Using WHERE to Filter Before Aggregating

WHERE filters rows before the aggregate function runs:

-- Average GPA of Biology majors only
SELECT AVG(gpa) AS bio_average_gpa
FROM students
WHERE major = 'Biology';

Execution Order:

  1. FROM: Get all students
  2. WHERE: Filter to Biology majors only
  3. SELECT AVG(): Calculate average of filtered rows

Practical Example: Sales Analysis

-- Total sales for 2024 only
SELECT 
    COUNT(*) AS orders_in_2024,
    SUM(amount) AS total_revenue
FROM orders
WHERE YEAR(order_date) = 2024;

9. The Mixing Problem (Without GROUP BY)

This is one of the most confusing things for beginners:

-- ❌ ERROR: Can't mix aggregate and non-aggregate columns!
SELECT name, AVG(gpa)
FROM students;

❌ Why This Fails: AVG(gpa) returns ONE number (the average). But name has 150 different values (one per student). SQL can't put 150 names next to 1 numberβ€”they're incompatible!

Think About It Logically:

Question: "Show me each student's name and the average GPA."
Problem: The average GPA is 3.6 for ALL students. Which name should appear next to it? Alice? Bob? All 150 names?
Answer: SQL doesn't know, so it throws an error!

The Solution: GROUP BY (Next Lesson!)

To mix aggregates with regular columns, you need GROUP BY. That's coming in the next lesson! For now, remember:

10. Common Beginner Mistakes

Mistake #1: Confusing COUNT(*) and COUNT(column)

-- These give DIFFERENT results!
SELECT COUNT(*) FROM students;     -- Counts all rows (4)
SELECT COUNT(phone) FROM students; -- Counts non-NULL phones (3)

Mistake #2: Using AVG() on Text

-- ❌ ERROR: Can't average text!
SELECT AVG(name) FROM students;

-- βœ… CORRECT: Average numeric values
SELECT AVG(gpa) FROM students;

Mistake #3: Forgetting NULL Affects Results

-- If there are 100 students but 20 have NULL GPA:
SELECT AVG(gpa) FROM students;
-- Averages only the 80 non-NULL GPAs (not 100!)

Mistake #4: Mixing Aggregates with Regular Columns

-- ❌ WRONG: Can't mix without GROUP BY
SELECT name, AVG(gpa) FROM students;

-- βœ… CORRECT: Just the aggregate
SELECT AVG(gpa) FROM students;

Mistake #5: Using Aggregate in WHERE

-- ❌ WRONG: Can't use aggregate in WHERE
SELECT * FROM students WHERE gpa > AVG(gpa);

-- βœ… CORRECT: Use HAVING (covered in next lesson)
-- Or calculate separately in subquery

11. 🎯 Hands-On Practice Exercises

Exercise 1: Count All Students

Task: Write a query to count the total number of students.

Show Answer
SELECT COUNT(*) AS total_students
FROM students;

Exercise 2: Calculate Average GPA

Task: Find the average GPA across all students.

Show Answer
SELECT AVG(gpa) AS average_gpa
FROM students;

Exercise 3: Find Highest and Lowest GPA

Task: Find both the highest and lowest GPAs in one query.

Show Answer
SELECT 
    MAX(gpa) AS highest_gpa,
    MIN(gpa) AS lowest_gpa
FROM students;

Exercise 4: Count Students with Email

Task: Count how many students have provided an email address.

Show Answer
SELECT COUNT(email) AS students_with_email
FROM students;

Remember: COUNT(email) excludes NULL values!

Exercise 5: Complete Statistics

Task: Get complete GPA statistics: count, average, min, and max in one query.

Show Answer
SELECT 
    COUNT(*) AS total_students,
    AVG(gpa) AS average_gpa,
    MIN(gpa) AS lowest_gpa,
    MAX(gpa) AS highest_gpa
FROM students;

Exercise 6: Filtered Aggregate

Task: Find the average GPA of Computer Science majors only.

Show Answer
SELECT AVG(gpa) AS cs_average_gpa
FROM students
WHERE major = 'Computer Science';

Challenge Exercise: COUNT Comparison

Task: Write a query that shows both total students and students with phone numbers.

Show Answer
SELECT 
    COUNT(*) AS total_students,
    COUNT(phone) AS students_with_phone,
    COUNT(*) - COUNT(phone) AS students_without_phone
FROM students;

Bonus: We calculated students WITHOUT phone by subtracting!

πŸ“ Key Takeaways

πŸš€ What's Next?

Excellent work! You now know how to summarize entire datasets into meaningful statistics. But what if you want to break down those summaries by category? For example, "average GPA per major" or "total sales per region"? That's where GROUP BY comes inβ€”it's one of the most powerful SQL features!

Up next: GROUP BY - Grouping Data for Analysis

← Back to Lessons
Practice Now β†’