Aggregations with Joins: Combining GROUP BY with Joins
The most powerful SQL queries combine joins and aggregationsβconnecting related tables while simultaneously calculating totals, averages, counts, and other summary statistics. This is how you answer business-critical questions: "What's our total revenue per customer?" "Which products have the highest average ratings?" "How many orders does each salesperson handle monthly?" Understanding how GROUP BY interacts with joins, how to avoid common aggregation pitfalls, and how to structure complex analytical queries is the culmination of your SQL journey. This is where data analysis truly begins.
π What You'll Learn
- How to combine JOIN with GROUP BY
- How to summarize data across relationships
- How to count related rows with aggregation
- How to avoid common grouping mistakes
- How joins affect aggregate results
- How to use HAVING with joined data
- How to create analytical reports from multiple tables
- How to handle NULL in aggregations
- Performance optimization for aggregate queries
1. The Basic Pattern: JOIN + GROUP BY
Pattern: JOIN tables β GROUP BY to aggregate β Calculate summary statistics. The aggregation happens after the join creates the full dataset.
Simple Example: Count Orders per Customer
-- How many orders does each customer have?
SELECT
c.customer_id,
c.name,
COUNT(o.order_id) AS order_count
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id
GROUP BY c.customer_id, c.name
ORDER BY order_count DESC;
-- Process:
-- 1. JOIN customers with orders
-- 2. GROUP BY customer
-- 3. COUNT orders per customer
-- 4. Result: Each customer with their order count
Why LEFT JOIN vs INNER JOIN Matters
-- INNER JOIN: Only customers who HAVE orders
SELECT
c.name,
COUNT(o.order_id) AS order_count
FROM customers c
INNER JOIN orders o ON c.customer_id = o.customer_id
GROUP BY c.name;
-- Excludes customers with 0 orders!
-- LEFT JOIN: ALL customers, including those with 0 orders
SELECT
c.name,
COUNT(o.order_id) AS order_count
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id
GROUP BY c.name;
-- Includes customers with 0 orders (COUNT returns 0)
2. Aggregate Functions with Joins
Common Aggregate Functions
-- Customer spending analysis
SELECT
c.customer_id,
c.name,
COUNT(o.order_id) AS order_count, -- How many orders
SUM(o.total) AS total_spent, -- Total spending
AVG(o.total) AS avg_order_value, -- Average per order
MIN(o.total) AS smallest_order, -- Lowest order
MAX(o.total) AS largest_order, -- Highest order
MIN(o.order_date) AS first_order_date, -- When first ordered
MAX(o.order_date) AS last_order_date -- Most recent order
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id
GROUP BY c.customer_id, c.name
ORDER BY total_spent DESC;
3. Counting Related Rows Correctly
β οΈ Important: COUNT(*) counts all rows including NULLs, while COUNT(column) ignores NULLs. Use COUNT(column) with LEFT JOIN to get correct zero counts.
COUNT() Variations
-- Different COUNT behaviors
SELECT
c.name,
COUNT(*) AS total_rows, -- Counts all rows (even NULLs)
COUNT(o.order_id) AS order_count, -- Counts non-NULL orders
COUNT(o.total) AS orders_with_total -- Counts orders with total
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id
GROUP BY c.name;
-- For customers without orders:
-- total_rows = 1 (the customer row exists)
-- order_count = 0 (no order_id values)
-- orders_with_total = 0 (no total values)
COUNT DISTINCT
-- Count unique products ordered per customer
SELECT
c.name,
COUNT(DISTINCT oi.product_id) AS unique_products_ordered,
COUNT(oi.order_item_id) AS total_items_ordered
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id
LEFT JOIN order_items oi ON o.order_id = oi.order_id
GROUP BY c.name
ORDER BY unique_products_ordered DESC;
-- DISTINCT prevents counting same product multiple times
4. GROUP BY Multiple Dimensions
Two-Dimensional Grouping
-- Sales by category by month
SELECT
cat.name AS category,
DATE_FORMAT(o.order_date, '%Y-%m') AS month,
COUNT(DISTINCT o.order_id) AS order_count,
SUM(oi.quantity * oi.price) AS revenue
FROM categories cat
INNER JOIN products p ON cat.category_id = p.category_id
INNER JOIN order_items oi ON p.product_id = oi.product_id
INNER JOIN orders o ON oi.order_id = o.order_id
WHERE o.order_date >= '2024-01-01'
GROUP BY cat.name, DATE_FORMAT(o.order_date, '%Y-%m')
ORDER BY month, category;
-- Creates a matrix: categories Γ months
Three-Dimensional Analysis
-- Sales by country, category, and quarter
SELECT
c.country,
cat.name AS category,
CONCAT('Q', QUARTER(o.order_date)) AS quarter,
COUNT(DISTINCT o.order_id) AS orders,
SUM(oi.quantity * oi.price) AS revenue,
COUNT(DISTINCT c.customer_id) AS unique_customers
FROM customers c
INNER JOIN orders o ON c.customer_id = o.customer_id
INNER JOIN order_items oi ON o.order_id = oi.order_id
INNER JOIN products p ON oi.product_id = p.product_id
INNER JOIN categories cat ON p.category_id = cat.category_id
WHERE YEAR(o.order_date) = 2024
GROUP BY c.country, cat.name, QUARTER(o.order_date)
ORDER BY c.country, quarter, revenue DESC;
5. HAVING: Filtering Aggregated Results
WHERE filters before aggregation, HAVING filters after aggregation. Use HAVING to filter on aggregate values (COUNT, SUM, AVG, etc.).
WHERE vs HAVING
-- Find high-value customers (spent over $1000)
SELECT
c.customer_id,
c.name,
COUNT(o.order_id) AS order_count,
SUM(o.total) AS total_spent
FROM customers c
INNER JOIN orders o ON c.customer_id = o.customer_id
WHERE o.order_date >= '2024-01-01' -- WHERE: Filter rows before aggregation
GROUP BY c.customer_id, c.name
HAVING SUM(o.total) > 1000 -- HAVING: Filter groups after aggregation
ORDER BY total_spent DESC;
-- Process:
-- 1. WHERE filters to 2024 orders
-- 2. GROUP BY aggregates per customer
-- 3. HAVING keeps only customers with total > $1000
Multiple HAVING Conditions
-- Find frequent high-value customers
SELECT
c.customer_id,
c.name,
COUNT(o.order_id) AS order_count,
SUM(o.total) AS total_spent,
AVG(o.total) AS avg_order_value
FROM customers c
INNER JOIN orders o ON c.customer_id = o.customer_id
GROUP BY c.customer_id, c.name
HAVING COUNT(o.order_id) >= 5 -- At least 5 orders
AND SUM(o.total) > 1000 -- Spent over $1000
AND AVG(o.total) > 100 -- Average order over $100
ORDER BY total_spent DESC;
6. Common Aggregation Mistakes with Joins
Mistake 1: Forgetting GROUP BY Columns
β ERROR: Column not in GROUP BY
-- β WRONG: name is in SELECT but not in GROUP BY
SELECT
c.name,
COUNT(o.order_id)
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id
GROUP BY c.customer_id; -- ERROR! name is missing
-- Error: "c.name must be in GROUP BY or be an aggregate"
β CORRECT: Include all non-aggregate columns
-- β
CORRECT: Both customer_id and name in GROUP BY
SELECT
c.customer_id,
c.name,
COUNT(o.order_id)
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id
GROUP BY c.customer_id, c.name;
Mistake 2: Row Multiplication in Joins
β PROBLEM: Double-counting due to multiple joins
-- β WRONG: Can double-count if customer has multiple addresses
SELECT
c.name,
COUNT(o.order_id) AS order_count -- May be inflated!
FROM customers c
LEFT JOIN addresses a ON c.customer_id = a.customer_id
LEFT JOIN orders o ON c.customer_id = o.customer_id
GROUP BY c.name;
-- If customer has 2 addresses and 3 orders:
-- Join creates 2 Γ 3 = 6 rows
-- COUNT returns 6 instead of 3!
β SOLUTION: Use COUNT DISTINCT or subqueries
-- β
SOLUTION 1: COUNT DISTINCT
SELECT
c.name,
COUNT(DISTINCT o.order_id) AS order_count
FROM customers c
LEFT JOIN addresses a ON c.customer_id = a.customer_id
LEFT JOIN orders o ON c.customer_id = o.customer_id
GROUP BY c.name;
-- β
SOLUTION 2: Separate subqueries
SELECT
c.name,
(SELECT COUNT(*) FROM orders WHERE customer_id = c.customer_id) AS order_count,
(SELECT COUNT(*) FROM addresses WHERE customer_id = c.customer_id) AS address_count
FROM customers c;
7. Advanced Aggregation Patterns
Conditional Aggregation with CASE
-- Customer segmentation: count by order status
SELECT
c.customer_id,
c.name,
COUNT(o.order_id) AS total_orders,
SUM(CASE WHEN o.status = 'completed' THEN 1 ELSE 0 END) AS completed_orders,
SUM(CASE WHEN o.status = 'pending' THEN 1 ELSE 0 END) AS pending_orders,
SUM(CASE WHEN o.status = 'cancelled' THEN 1 ELSE 0 END) AS cancelled_orders,
SUM(CASE WHEN o.status = 'completed' THEN o.total ELSE 0 END) AS revenue
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id
GROUP BY c.customer_id, c.name
ORDER BY revenue DESC;
Percentage Calculations
-- Category contribution to total sales
SELECT
cat.name AS category,
COUNT(DISTINCT o.order_id) AS orders,
SUM(oi.quantity * oi.price) AS category_revenue,
ROUND(
SUM(oi.quantity * oi.price) * 100.0 / (
SELECT SUM(quantity * price)
FROM order_items oi2
JOIN orders o2 ON oi2.order_id = o2.order_id
WHERE o2.order_date >= '2024-01-01'
),
2
) AS percent_of_total
FROM categories cat
INNER JOIN products p ON cat.category_id = p.category_id
INNER JOIN order_items oi ON p.product_id = oi.product_id
INNER JOIN orders o ON oi.order_id = o.order_id
WHERE o.order_date >= '2024-01-01'
GROUP BY cat.name
ORDER BY category_revenue DESC;
Running Totals and Comparisons
-- Compare current period to previous period
SELECT
cat.name AS category,
SUM(CASE
WHEN o.order_date >= '2024-01-01'
THEN oi.quantity * oi.price
ELSE 0
END) AS revenue_2024,
SUM(CASE
WHEN o.order_date >= '2023-01-01'
AND o.order_date < '2024-01-01'
THEN oi.quantity * oi.price
ELSE 0
END) AS revenue_2023,
ROUND(
(SUM(CASE
WHEN o.order_date >= '2024-01-01'
THEN oi.quantity * oi.price
ELSE 0
END) - SUM(CASE
WHEN o.order_date >= '2023-01-01'
AND o.order_date < '2024-01-01'
THEN oi.quantity * oi.price
ELSE 0
END)) * 100.0 / NULLIF(SUM(CASE
WHEN o.order_date >= '2023-01-01'
AND o.order_date < '2024-01-01'
THEN oi.quantity * oi.price
ELSE 0
END), 0),
2
) AS percent_change
FROM categories cat
INNER JOIN products p ON cat.category_id = p.category_id
INNER JOIN order_items oi ON p.product_id = oi.product_id
INNER JOIN orders o ON oi.order_id = o.order_id
WHERE o.order_date >= '2023-01-01'
GROUP BY cat.name
ORDER BY revenue_2024 DESC;
8. Real-World Analytical Reports
Customer Lifetime Value (CLV) Report
-- Comprehensive customer value analysis
SELECT
c.customer_id,
c.name,
c.email,
c.registration_date,
DATEDIFF(CURRENT_DATE, c.registration_date) AS days_as_customer,
-- Order metrics
COUNT(o.order_id) AS lifetime_orders,
MIN(o.order_date) AS first_order_date,
MAX(o.order_date) AS last_order_date,
DATEDIFF(MAX(o.order_date), MIN(o.order_date)) AS days_active,
-- Revenue metrics
SUM(o.total) AS lifetime_value,
AVG(o.total) AS avg_order_value,
MIN(o.total) AS min_order_value,
MAX(o.total) AS max_order_value,
-- Product diversity
COUNT(DISTINCT oi.product_id) AS unique_products_purchased,
COUNT(DISTINCT cat.category_id) AS unique_categories_purchased,
-- Engagement metrics
CASE
WHEN MAX(o.order_date) >= DATE_SUB(CURRENT_DATE, INTERVAL 90 DAY) THEN 'Active'
WHEN MAX(o.order_date) >= DATE_SUB(CURRENT_DATE, INTERVAL 180 DAY) THEN 'At Risk'
ELSE 'Churned'
END AS customer_status
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id
LEFT JOIN order_items oi ON o.order_id = oi.order_id
LEFT JOIN products p ON oi.product_id = p.product_id
LEFT JOIN categories cat ON p.category_id = cat.category_id
GROUP BY
c.customer_id, c.name, c.email, c.registration_date
ORDER BY lifetime_value DESC;
Product Performance Dashboard
-- Product sales and review analysis
SELECT
p.product_id,
p.name AS product_name,
cat.name AS category_name,
-- Sales metrics
COUNT(DISTINCT o.order_id) AS orders_containing_product,
SUM(oi.quantity) AS total_units_sold,
SUM(oi.quantity * oi.price) AS total_revenue,
AVG(oi.price) AS avg_selling_price,
-- Customer metrics
COUNT(DISTINCT c.customer_id) AS unique_customers,
-- Review metrics
COUNT(r.review_id) AS review_count,
AVG(r.rating) AS avg_rating,
SUM(CASE WHEN r.rating >= 4 THEN 1 ELSE 0 END) AS positive_reviews,
SUM(CASE WHEN r.rating <= 2 THEN 1 ELSE 0 END) AS negative_reviews,
-- Recency
MAX(o.order_date) AS last_sold_date,
DATEDIFF(CURRENT_DATE, MAX(o.order_date)) AS days_since_last_sale
FROM products p
INNER JOIN categories cat ON p.category_id = cat.category_id
LEFT JOIN order_items oi ON p.product_id = oi.product_id
LEFT JOIN orders o ON oi.order_id = o.order_id
LEFT JOIN customers c ON o.customer_id = c.customer_id
LEFT JOIN reviews r ON p.product_id = r.product_id
GROUP BY
p.product_id, p.name, cat.name
HAVING COUNT(DISTINCT o.order_id) > 0 -- Only products with sales
ORDER BY total_revenue DESC;
9. Performance Optimization
Optimization Strategies
- β Index all foreign keys and GROUP BY columns
- β Filter with WHERE before GROUP BY
- β Use INNER JOIN when you don't need NULLs
- β Avoid SELECT * in aggregations
- β Use covering indexes for aggregation queries
- β Consider materialized views for complex reports
Index for Aggregations
-- Indexes for common aggregation queries
CREATE INDEX idx_orders_customer_date ON orders(customer_id, order_date);
CREATE INDEX idx_orders_date_status ON orders(order_date, status);
CREATE INDEX idx_order_items_product ON order_items(product_id, order_id);
-- These compound indexes help GROUP BY queries
10. π― Practice Exercises
Exercise 1: Sales Analytics
Given this schema:
customers (customer_id, name, country, registration_date)
orders (order_id, customer_id, order_date, total, status)
order_items (item_id, order_id, product_id, quantity, price)
products (product_id, name, category_id)
categories (category_id, name)
Tasks:
- Calculate total revenue per category
- Find customers who spent over $5000 total
- Show average order value per country
Show Solution
-- Task 1: Revenue per category
SELECT
cat.name AS category,
COUNT(DISTINCT o.order_id) AS total_orders,
SUM(oi.quantity) AS total_units_sold,
SUM(oi.quantity * oi.price) AS total_revenue,
AVG(oi.price) AS avg_unit_price
FROM categories cat
INNER JOIN products p ON cat.category_id = p.category_id
INNER JOIN order_items oi ON p.product_id = oi.product_id
INNER JOIN orders o ON oi.order_id = o.order_id
WHERE o.status = 'completed'
GROUP BY cat.name
ORDER BY total_revenue DESC;
-- Task 2: High-value customers (over $5000)
SELECT
c.customer_id,
c.name,
c.email,
COUNT(o.order_id) AS order_count,
SUM(o.total) AS total_spent,
AVG(o.total) AS avg_order_value
FROM customers c
INNER JOIN orders o ON c.customer_id = o.customer_id
WHERE o.status = 'completed'
GROUP BY c.customer_id, c.name, c.email
HAVING SUM(o.total) > 5000
ORDER BY total_spent DESC;
-- Task 3: Average order value per country
SELECT
c.country,
COUNT(DISTINCT c.customer_id) AS customer_count,
COUNT(o.order_id) AS total_orders,
SUM(o.total) AS total_revenue,
AVG(o.total) AS avg_order_value,
MIN(o.total) AS min_order,
MAX(o.total) AS max_order
FROM customers c
INNER JOIN orders o ON c.customer_id = o.customer_id
WHERE o.status = 'completed'
GROUP BY c.country
ORDER BY avg_order_value DESC;
Key Techniques:
- β Multi-table joins before aggregation
- β HAVING filters aggregated results
- β Multiple aggregate functions per query
- β COUNT DISTINCT for unique counts
Exercise 2: Product Review Analysis
Given this schema:
products (product_id, name, category_id, price)
categories (category_id, name)
reviews (review_id, product_id, customer_id, rating, review_date)
customers (customer_id, name, loyalty_tier)
Tasks:
- Find products with average rating above 4.0 and at least 10 reviews
- Calculate review distribution by category (count of 1-star, 2-star, etc.)
- Find customers who have reviewed 5+ products
Show Solution
-- Task 1: Highly-rated products (4.0+, 10+ reviews)
SELECT
p.product_id,
p.name AS product_name,
cat.name AS category,
p.price,
COUNT(r.review_id) AS review_count,
AVG(r.rating) AS avg_rating,
MIN(r.rating) AS min_rating,
MAX(r.rating) AS max_rating
FROM products p
INNER JOIN categories cat ON p.category_id = cat.category_id
INNER JOIN reviews r ON p.product_id = r.product_id
GROUP BY p.product_id, p.name, cat.name, p.price
HAVING COUNT(r.review_id) >= 10
AND AVG(r.rating) >= 4.0
ORDER BY avg_rating DESC, review_count DESC;
-- Task 2: Review distribution by category
SELECT
cat.name AS category,
COUNT(r.review_id) AS total_reviews,
SUM(CASE WHEN r.rating = 5 THEN 1 ELSE 0 END) AS five_star,
SUM(CASE WHEN r.rating = 4 THEN 1 ELSE 0 END) AS four_star,
SUM(CASE WHEN r.rating = 3 THEN 1 ELSE 0 END) AS three_star,
SUM(CASE WHEN r.rating = 2 THEN 1 ELSE 0 END) AS two_star,
SUM(CASE WHEN r.rating = 1 THEN 1 ELSE 0 END) AS one_star,
AVG(r.rating) AS avg_rating,
ROUND(
SUM(CASE WHEN r.rating >= 4 THEN 1 ELSE 0 END) * 100.0 / COUNT(r.review_id),
1
) AS percent_positive
FROM categories cat
INNER JOIN products p ON cat.category_id = p.category_id
INNER JOIN reviews r ON p.product_id = r.product_id
GROUP BY cat.name
ORDER BY avg_rating DESC;
-- Task 3: Active reviewers (5+ products reviewed)
SELECT
c.customer_id,
c.name,
c.loyalty_tier,
COUNT(DISTINCT r.product_id) AS products_reviewed,
COUNT(r.review_id) AS total_reviews,
AVG(r.rating) AS avg_rating_given,
MIN(r.review_date) AS first_review,
MAX(r.review_date) AS most_recent_review
FROM customers c
INNER JOIN reviews r ON c.customer_id = r.customer_id
GROUP BY c.customer_id, c.name, c.loyalty_tier
HAVING COUNT(DISTINCT r.product_id) >= 5
ORDER BY products_reviewed DESC;
Advanced Aggregation:
- β Conditional aggregation with CASE
- β Percentage calculations
- β Multiple HAVING conditions
- β COUNT DISTINCT vs COUNT
π₯ Challenge: Executive Dashboard Query
Create a comprehensive monthly business dashboard:
customers (customer_id, name, country, registration_date, loyalty_tier)
orders (order_id, customer_id, order_date, total, status)
order_items (item_id, order_id, product_id, quantity, price)
products (product_id, name, category_id, cost)
categories (category_id, name)
reviews (review_id, product_id, rating, review_date)
Challenge: Create a monthly summary showing: orders, revenue, profit, customer metrics, product performance, review stats, and year-over-year comparison.
Show Solution
-- Executive monthly dashboard
SELECT
DATE_FORMAT(o.order_date, '%Y-%m') AS month,
-- Order metrics
COUNT(DISTINCT o.order_id) AS total_orders,
COUNT(DISTINCT CASE WHEN o.status = 'completed' THEN o.order_id END) AS completed_orders,
COUNT(DISTINCT CASE WHEN o.status = 'cancelled' THEN o.order_id END) AS cancelled_orders,
-- Revenue metrics
SUM(CASE WHEN o.status = 'completed' THEN o.total ELSE 0 END) AS total_revenue,
AVG(CASE WHEN o.status = 'completed' THEN o.total END) AS avg_order_value,
-- Profit metrics (revenue - cost)
SUM(CASE
WHEN o.status = 'completed'
THEN oi.quantity * (oi.price - p.cost)
ELSE 0
END) AS gross_profit,
ROUND(
SUM(CASE
WHEN o.status = 'completed'
THEN oi.quantity * (oi.price - p.cost)
ELSE 0
END) * 100.0 / NULLIF(SUM(CASE
WHEN o.status = 'completed'
THEN oi.quantity * oi.price
ELSE 0
END), 0),
2
) AS profit_margin_percent,
-- Customer metrics
COUNT(DISTINCT o.customer_id) AS active_customers,
COUNT(DISTINCT CASE
WHEN c.registration_date >= DATE_FORMAT(o.order_date, '%Y-%m-01')
THEN c.customer_id
END) AS new_customers,
-- Product metrics
COUNT(DISTINCT oi.product_id) AS products_sold,
SUM(oi.quantity) AS units_sold,
-- Review metrics
COUNT(DISTINCT r.review_id) AS reviews_received,
AVG(r.rating) AS avg_rating
FROM orders o
INNER JOIN customers c ON o.customer_id = c.customer_id
INNER JOIN order_items oi ON o.order_id = oi.order_id
INNER JOIN products p ON oi.product_id = p.product_id
LEFT JOIN reviews r
ON p.product_id = r.product_id
AND DATE_FORMAT(r.review_date, '%Y-%m') = DATE_FORMAT(o.order_date, '%Y-%m')
WHERE o.order_date >= DATE_SUB(CURRENT_DATE, INTERVAL 12 MONTH)
GROUP BY DATE_FORMAT(o.order_date, '%Y-%m')
ORDER BY month DESC;
Bonus: Year-over-Year Comparison
-- Year-over-year monthly comparison
SELECT
MONTH(o.order_date) AS month_number,
DATE_FORMAT(o.order_date, '%M') AS month_name,
-- Current year
SUM(CASE
WHEN YEAR(o.order_date) = YEAR(CURRENT_DATE)
AND o.status = 'completed'
THEN o.total
ELSE 0
END) AS revenue_current_year,
-- Previous year
SUM(CASE
WHEN YEAR(o.order_date) = YEAR(CURRENT_DATE) - 1
AND o.status = 'completed'
THEN o.total
ELSE 0
END) AS revenue_previous_year,
-- Growth
ROUND(
(SUM(CASE
WHEN YEAR(o.order_date) = YEAR(CURRENT_DATE)
AND o.status = 'completed'
THEN o.total
ELSE 0
END) - SUM(CASE
WHEN YEAR(o.order_date) = YEAR(CURRENT_DATE) - 1
AND o.status = 'completed'
THEN o.total
ELSE 0
END)) * 100.0 / NULLIF(SUM(CASE
WHEN YEAR(o.order_date) = YEAR(CURRENT_DATE) - 1
AND o.status = 'completed'
THEN o.total
ELSE 0
END), 0),
2
) AS yoy_growth_percent
FROM orders o
WHERE o.order_date >= DATE_SUB(CURRENT_DATE, INTERVAL 24 MONTH)
GROUP BY MONTH(o.order_date), DATE_FORMAT(o.order_date, '%M')
ORDER BY month_number;
Enterprise Techniques:
- β Complex conditional aggregation
- β Multiple metrics in single query
- β Profit calculations (revenue - cost)
- β Period-over-period comparisons
- β NULLIF prevents division by zero
- β Date formatting for reporting
- π‘ This powers real business dashboards!
π Key Takeaways
- JOIN + GROUP BY combines related data and aggregates it
- LEFT JOIN includes rows with zero counts (important for completeness)
- COUNT(column) ignores NULL, COUNT(*) counts all rows
- Use COUNT DISTINCT to avoid double-counting
- WHERE filters before aggregation, HAVING filters after
- Include all non-aggregate columns in GROUP BY
- Watch for row multiplication in multi-table aggregations
- Use CASE in aggregations for conditional counts/sums
- Index foreign keys and GROUP BY columns
- Filter early with WHERE to reduce aggregation load
- Aggregate functions: COUNT, SUM, AVG, MIN, MAX
- Use subqueries to avoid multiplication in complex joins
- HAVING supports multiple conditions with AND/OR
- Aggregations with joins power business intelligence reports
- This is the foundation of data analytics in SQL
π Congratulations!
You've completed all 30 SQL lessons! You now have the skills to:
- Query data with SELECT, WHERE, ORDER BY, LIMIT
- Design normalized databases with proper relationships
- Enforce data integrity with constraints
- Create and modify table structures (DDL)
- Manipulate data with INSERT, UPDATE, DELETE (DML)
- Master all join types (INNER, LEFT, RIGHT, OUTER, CROSS, SELF)
- Analyze data with aggregations and GROUP BY
You're now ready to build real-world database applications!