Filtering Groups with HAVING
You've learned how WHERE filters individual rows and how GROUP BY creates groups. But what if you want to filter based on aggregate results? For example: "Show me only majors with an average GPA above 3.5" or "Find regions where total sales exceed $1 million." You can't use WHERE for this because WHERE happens before grouping! That's where HAVING comes inβit filters groups after aggregation.
π What You'll Learn
- What HAVING does and why it's essential
- The critical difference between WHERE and HAVING
- When to use WHERE vs HAVING (and when to use both!)
- How HAVING filters groups, not rows
- Using aggregate functions in HAVING conditions
- Combining WHERE and HAVING in the same query
- Multiple conditions in HAVING (AND, OR)
- The execution order: WHERE β GROUP BY β HAVING
- Common beginner mistakes and how to avoid them
- Real-world business analytics scenarios
- Hands-on practice exercises
1. What Does HAVING Do?
HAVING filters groups based on aggregate results. It answers questions like "Which groups meet this aggregate condition?"
π‘ Real-World Example: Imagine you're analyzing sales by region. You calculate total sales per region, but you only want to see regions that made over $500,000. You can't filter individual sales (that's WHERE), you need to filter the grouped totals (that's HAVING). It's like saying: "Show me the summary, but only for high performers."
Basic Syntax
SELECT column, AGGREGATE_FUNCTION(column)
FROM table_name
GROUP BY column
HAVING AGGREGATE_FUNCTION(column) condition;
Simple Example
-- Show only majors with average GPA above 3.5
SELECT
major,
AVG(gpa) AS avg_gpa
FROM students
GROUP BY major
HAVING AVG(gpa) > 3.5;
2. WHERE vs HAVING - The Critical Difference
This is the most important concept to understand:
WHERE
- Filters: Individual rows
- When: Before grouping
- Can use: Regular columns
- Cannot use: Aggregates
- Example:
WHERE gpa > 3.0
Question it answers:
"Which individual students qualify?"
HAVING
- Filters: Groups
- When: After grouping
- Can use: Aggregates
- Can use: Grouped columns
- Example:
HAVING AVG(gpa) > 3.5
Question it answers:
"Which groups meet the aggregate condition?"
β οΈ Key Rule: If your condition involves an aggregate function (COUNT, SUM, AVG, MIN, MAX), you MUST use HAVING, not WHERE. If it's just a regular column comparison, use WHERE.
3. Visual Example: WHERE vs HAVING in Action
Scenario: Student GPA Analysis
Original Data (6 Students)
| name | major | gpa |
|---|---|---|
| Alice | CS | 3.8 |
| Bob | CS | 2.9 |
| Carol | Bio | 3.9 |
| Diana | Bio | 3.7 |
| Eve | Math | 3.2 |
| Frank | Math | 3.4 |
Query 1: Using WHERE (Filter Rows)
SELECT major, AVG(gpa) AS avg_gpa
FROM students
WHERE gpa > 3.0 -- Filter individual students FIRST
GROUP BY major;
Step 1 (WHERE): Remove Bob (2.9) and Eve (3.2) because their individual GPA β€ 3.0
Step 2 (GROUP BY): Group remaining 4 students
Result:
| major | avg_gpa |
|---|---|
| CS | 3.8 (only Alice) |
| Bio | 3.8 (Carol + Diana) |
| Math | 3.4 (only Frank) |
Query 2: Using HAVING (Filter Groups)
SELECT major, AVG(gpa) AS avg_gpa
FROM students
GROUP BY major
HAVING AVG(gpa) > 3.5; -- Filter groups AFTER calculating average
Step 1 (GROUP BY): Group all 6 students by major
Step 2 (Calculate): CS avg = 3.35, Bio avg = 3.8, Math avg = 3.3
Step 3 (HAVING): Keep only groups where avg > 3.5
Result:
| major | avg_gpa |
|---|---|
| Bio | 3.8 |
CS and Math are excluded because their group averages don't meet the condition!
4. Common HAVING Conditions
COUNT: Filter by Group Size
-- Show majors with more than 50 students
SELECT major, COUNT(*) AS student_count
FROM students
GROUP BY major
HAVING COUNT(*) > 50;
SUM: Filter by Total
-- Show regions with total sales over $1 million
SELECT region, SUM(sales) AS total_sales
FROM orders
GROUP BY region
HAVING SUM(sales) > 1000000;
AVG: Filter by Average
-- Show products with average rating above 4.5
SELECT product_name, AVG(rating) AS avg_rating
FROM reviews
GROUP BY product_name
HAVING AVG(rating) > 4.5;
MIN/MAX: Filter by Extremes
-- Show categories where the cheapest product costs over $100
SELECT category, MIN(price) AS lowest_price
FROM products
GROUP BY category
HAVING MIN(price) > 100;
5. The Power Combo: WHERE + HAVING
You can (and often should) use WHERE and HAVING together! They work at different stages:
-- Find high-performing majors, but only look at recent students
SELECT
major,
AVG(gpa) AS avg_gpa,
COUNT(*) AS student_count
FROM students
WHERE year >= 2023 -- 1. Filter rows: only recent students
GROUP BY major -- 2. Group by major
HAVING AVG(gpa) > 3.5 -- 3. Filter groups: only high averages
AND COUNT(*) >= 10; -- 4. Filter groups: only substantial size
Execution Flow:
- FROM students β Get all student records
- WHERE year >= 2023 β Filter to students from 2023 onward (individual rows)
- GROUP BY major β Group filtered students by major
- Calculate AVG(gpa), COUNT(*) β Compute aggregates per group
- HAVING AVG(gpa) > 3.5 AND COUNT(*) >= 10 β Keep only groups meeting both conditions
- SELECT β Return final results
π‘ Pro Tip: Use WHERE to reduce data early (improves performance), then use HAVING to filter the grouped results. This is the most efficient approach!
6. Multiple Conditions in HAVING
Just like WHERE, you can use AND, OR in HAVING:
Using AND (Both Conditions Must Be True)
-- Majors with high average AND large enrollment
SELECT major, AVG(gpa) AS avg_gpa, COUNT(*) AS count
FROM students
GROUP BY major
HAVING AVG(gpa) > 3.5 AND COUNT(*) > 30;
Using OR (Either Condition Can Be True)
-- Majors that are either high-performing OR very popular
SELECT major, AVG(gpa) AS avg_gpa, COUNT(*) AS count
FROM students
GROUP BY major
HAVING AVG(gpa) > 3.8 OR COUNT(*) > 100;
Complex Conditions with Parentheses
-- High GPA with good size, OR very large regardless of GPA
SELECT major, AVG(gpa) AS avg_gpa, COUNT(*) AS count
FROM students
GROUP BY major
HAVING (AVG(gpa) > 3.5 AND COUNT(*) > 20)
OR COUNT(*) > 100;
7. Can HAVING Filter Non-Aggregate Columns?
Yes! HAVING can filter on columns that appear in GROUP BY:
-- These both work, but WHERE is more efficient
SELECT major, AVG(gpa)
FROM students
GROUP BY major
HAVING major LIKE 'C%'; -- Filter groups starting with 'C'
-- Better approach: Use WHERE
SELECT major, AVG(gpa)
FROM students
WHERE major LIKE 'C%' -- Filter rows earlier (more efficient)
GROUP BY major;
π‘ Best Practice: If a condition doesn't involve an aggregate, use WHERE instead of HAVING. It's more efficient because WHERE filters before grouping happens!
8. Common Mistakes with HAVING
Mistake #1: Using Aggregate in WHERE
-- β WRONG: Can't use aggregates in WHERE
SELECT major, AVG(gpa)
FROM students
WHERE AVG(gpa) > 3.5
GROUP BY major;
-- β
CORRECT: Use HAVING for aggregates
SELECT major, AVG(gpa)
FROM students
GROUP BY major
HAVING AVG(gpa) > 3.5;
Mistake #2: Using HAVING Without GROUP BY
-- β WRONG: HAVING requires GROUP BY
SELECT AVG(gpa)
FROM students
HAVING AVG(gpa) > 3.5;
-- β
CORRECT: Use WHERE or add GROUP BY
SELECT AVG(gpa)
FROM students
-- No HAVING needed; just calculate the aggregate
Mistake #3: Filtering Rows with HAVING Instead of WHERE
-- β INEFFICIENT: HAVING on non-aggregate
SELECT major, AVG(gpa)
FROM students
GROUP BY major
HAVING major = 'Computer Science';
-- β
BETTER: Use WHERE (filters before grouping)
SELECT major, AVG(gpa)
FROM students
WHERE major = 'Computer Science'
GROUP BY major;
Mistake #4: Forgetting to Repeat Aggregate in HAVING
-- β WRONG: Can't use alias in HAVING (in most databases)
SELECT major, AVG(gpa) AS avg_gpa
FROM students
GROUP BY major
HAVING avg_gpa > 3.5; -- Doesn't work in MySQL strict mode!
-- β
CORRECT: Repeat the aggregate function
SELECT major, AVG(gpa) AS avg_gpa
FROM students
GROUP BY major
HAVING AVG(gpa) > 3.5;
β οΈ Note: Some databases (like PostgreSQL) allow using the alias in HAVING, but MySQL and SQL Server require the full aggregate function. For portability, always use the full function!
Mistake #5: Wrong Execution Order Assumption
-- β WRONG: Thinking HAVING happens before GROUP BY
-- This doesn't make sense conceptually
SELECT major, AVG(gpa)
FROM students
HAVING AVG(gpa) > 3.5 -- Can't filter groups before they exist!
GROUP BY major;
-- β
CORRECT: HAVING always comes AFTER GROUP BY
SELECT major, AVG(gpa)
FROM students
GROUP BY major
HAVING AVG(gpa) > 3.5;
9. π― Hands-On Practice Exercises
Exercise 1: Basic HAVING
Task: Find majors where the average GPA is above 3.6.
Show Answer
SELECT
major,
AVG(gpa) AS avg_gpa
FROM students
GROUP BY major
HAVING AVG(gpa) > 3.6;
Exercise 2: COUNT in HAVING
Task: Show majors that have at least 25 students.
Show Answer
SELECT
major,
COUNT(*) AS student_count
FROM students
GROUP BY major
HAVING COUNT(*) >= 25;
Exercise 3: WHERE + HAVING
Task: Find the average GPA per major for students enrolled in 2023 or later, but only show majors with an average above 3.4.
Show Answer
SELECT
major,
AVG(gpa) AS avg_gpa
FROM students
WHERE year >= 2023
GROUP BY major
HAVING AVG(gpa) > 3.4;
Exercise 4: Multiple Conditions with AND
Task: Find majors where the average GPA is above 3.5 AND they have more than 30 students.
Show Answer
SELECT
major,
AVG(gpa) AS avg_gpa,
COUNT(*) AS student_count
FROM students
GROUP BY major
HAVING AVG(gpa) > 3.5 AND COUNT(*) > 30;
Exercise 5: MIN/MAX in HAVING
Task: Find majors where the lowest GPA in that major is above 3.0 (meaning even the weakest student has a good GPA).
Show Answer
SELECT
major,
MIN(gpa) AS lowest_gpa,
AVG(gpa) AS avg_gpa
FROM students
GROUP BY major
HAVING MIN(gpa) > 3.0;
Challenge Exercise: Complete Analysis
Task: Find majors for students with GPA β₯ 3.0, show count and average, but only include majors with at least 15 qualifying students AND average above 3.5. Sort by average GPA descending.
Show Answer
SELECT
major,
COUNT(*) AS student_count,
AVG(gpa) AS avg_gpa
FROM students
WHERE gpa >= 3.0
GROUP BY major
HAVING COUNT(*) >= 15 AND AVG(gpa) > 3.5
ORDER BY avg_gpa DESC;
Breakdown: WHERE filters individual students β GROUP BY creates groups β HAVING filters groups β ORDER BY sorts results
π Key Takeaways
- HAVING filters groups, WHERE filters rows
- Use HAVING when your condition involves aggregate functions
- HAVING comes after GROUP BY in execution order
- You can use WHERE and HAVING together (and often should!)
- WHERE runs first (more efficient), HAVING runs after grouping
- HAVING can use AND, OR for multiple conditions
- Can filter by COUNT, SUM, AVG, MIN, MAX in HAVING
- For non-aggregate conditions, use WHERE instead of HAVING for better performance
- Most databases require repeating the aggregate function in HAVING (can't use alias)
- Essential for business analytics: "Show me only high performers"
π Quick Decision Guide: WHERE or HAVING?
- β Use WHERE if: Filtering individual rows, no aggregates involved
- β Use HAVING if: Filtering based on aggregate results (COUNT, SUM, AVG, etc.)
- β Use BOTH if: Filter rows first, then filter grouped results
π What's Next?
Excellent work! You now understand the complete data filtering and grouping pipeline: WHERE filters rows, GROUP BY creates groups, and HAVING filters those groups. This combination is incredibly powerful for business analytics and reporting. Next, you'll learn about JOINsβhow to combine data from multiple tables. This is where SQL becomes truly enterprise-level!
Up next: Introduction to JOINs - Combining Data from Multiple Tables