← Back to Lessons
INTERMEDIATE ⏱ 40 minutes

Introduction to Joins: Connecting Related Tables

Real-world databases don't store everything in a single table. Customer data lives in one table, orders in another, products in a third. Joins are how you connect these related tables to answer meaningful questions like "What did customer Alice order?" or "Which products have never been purchased?" This lesson introduces the fundamental concept of joinsβ€”why they exist, how they work, and how to think about relationships between tables. Mastering joins transforms you from writing simple queries to building sophisticated data analyses.


πŸ“š What You'll Learn

1. Why Databases Use Multiple Tables

Imagine storing all customer and order information in a single table. You'd repeat customer details (name, email, address) for every order they place. This wastes space, creates inconsistencies, and makes updates difficult. Instead, we normalize data: customers in one table, orders in another, connected by keys.

The Problem: Denormalized Data

-- ❌ BAD: Everything in one table (denormalized)
orders_denormalized
+----------+---------------+------------------+------------+------------+
| order_id | customer_name | customer_email   | product    | quantity   |
+----------+---------------+------------------+------------+------------+
| 1        | Alice Johnson | alice@email.com  | Laptop     | 1          |
| 2        | Alice Johnson | alice@email.com  | Mouse      | 2          |
| 3        | Alice Johnson | alice@email.com  | Keyboard   | 1          |
| 4        | Bob Smith     | bob@email.com    | Monitor    | 2          |
+----------+---------------+------------------+------------+------------+

Problems:
β€’ Alice's info repeated 3 times (data redundancy)
β€’ If Alice changes email, must update 3 rows (update anomaly)
β€’ Wastes storage space
β€’ Risk of inconsistent data (typos, partial updates)

The Solution: Normalized Data with Relationships

-- βœ… GOOD: Split into related tables (normalized)

customers
+-------------+---------------+------------------+
| customer_id | name          | email            |
+-------------+---------------+------------------+
| 1           | Alice Johnson | alice@email.com  |
| 2           | Bob Smith     | bob@email.com    |
+-------------+---------------+------------------+

orders
+----------+-------------+------------+----------+
| order_id | customer_id | product    | quantity |
+----------+-------------+------------+----------+
| 1        | 1           | Laptop     | 1        |
| 2        | 1           | Mouse      | 2        |
| 3        | 1           | Keyboard   | 1        |
| 4        | 2           | Monitor    | 2        |
+----------+-------------+------------+----------+

Benefits:
β€’ Customer info stored once (no redundancy)
β€’ Update Alice's email in one place
β€’ Saves storage space
β€’ Maintains data consistency

2. How Tables Connect: Primary and Foreign Keys

Primary Keys: Unique Identifiers

Every table should have a primary keyβ€”a column (or combination of columns) that uniquely identifies each row. Primary keys are typically numeric IDs that auto-increment.

CREATE TABLE customers (
    customer_id INT PRIMARY KEY AUTO_INCREMENT,  -- Primary key
    name VARCHAR(255),
    email VARCHAR(255)
);

Foreign Keys: References to Other Tables

A foreign key is a column in one table that references the primary key of another table. This creates the relationship between tables.

CREATE TABLE orders (
    order_id INT PRIMARY KEY AUTO_INCREMENT,
    customer_id INT,  -- Foreign key referencing customers.customer_id
    product VARCHAR(255),
    quantity INT,
    FOREIGN KEY (customer_id) REFERENCES customers(customer_id)
);

Visualizing the Relationship

customers                    orders
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”             β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ customer_id │────────────→│ customer_id β”‚  (Foreign Key)
β”‚ name        β”‚             β”‚ order_id    β”‚
β”‚ email       β”‚             β”‚ product     β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜             β”‚ quantity    β”‚
                            β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Relationship: One customer can have many orders (One-to-Many)
β€’ customer_id in orders points to customer_id in customers
β€’ This link lets us connect order data to customer data

3. What Joins Accomplish

A join combines rows from two or more tables based on a related column. It "stitches together" data that's been separated for efficiency, allowing you to query across tables as if they were one.

Without Joins: Limited Queries

-- Can only see customer IDs, not names
SELECT * FROM orders;
+----------+-------------+----------+----------+
| order_id | customer_id | product  | quantity |
+----------+-------------+----------+----------+
| 1        | 1           | Laptop   | 1        |
| 2        | 1           | Mouse    | 2        |
| 4        | 2           | Monitor  | 2        |
+----------+-------------+----------+----------+

-- Can only see customer info, no order details
SELECT * FROM customers;
+-------------+---------------+------------------+
| customer_id | name          | email            |
+-------------+---------------+------------------+
| 1           | Alice Johnson | alice@email.com  |
| 2           | Bob Smith     | bob@email.com    |
+-------------+---------------+------------------+

With Joins: Complete Picture

-- Join to see both customer info AND order details
SELECT 
    o.order_id,
    c.name AS customer_name,
    c.email,
    o.product,
    o.quantity
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id;

+----------+---------------+------------------+----------+----------+
| order_id | customer_name | email            | product  | quantity |
+----------+---------------+------------------+----------+----------+
| 1        | Alice Johnson | alice@email.com  | Laptop   | 1        |
| 2        | Alice Johnson | alice@email.com  | Mouse    | 2        |
| 4        | Bob Smith     | bob@email.com    | Monitor  | 2        |
+----------+---------------+------------------+----------+----------+

Now we can answer: "What did Alice order?" "Who bought monitors?"

4. Basic Join Structure

Anatomy of a Join Query

SELECT 
    columns
FROM table1
JOIN table2 ON table1.column = table2.column;

Breaking It Down

SELECT 
    c.name,           -- Column from customers table
    o.product         -- Column from orders table
FROM customers c      -- First table (with alias 'c')
JOIN orders o         -- Second table (with alias 'o')
  ON c.customer_id = o.customer_id;  -- Join condition

Components:
1. FROM: Starting table (customers)
2. JOIN: Table to connect (orders)
3. ON: How tables relate (matching customer_id values)
4. SELECT: Which columns to display from either table

Table Aliases

When joining tables, use aliases (short names) to make queries readable. Without aliases, you'd write customers.customer_id repeatedly. With aliases, just write c.customer_id.

-- Without aliases (verbose, hard to read)
SELECT customers.name, orders.product
FROM customers
JOIN orders ON customers.customer_id = orders.customer_id;

-- With aliases (clean, readable)
SELECT c.name, o.product
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id;

5. Matching vs. Non-Matching Rows

Understanding what happens when rows match or don't match is crucial for choosing the right join type.

Matching Rows

customers
+-------------+---------------+
| customer_id | name          |
+-------------+---------------+
| 1           | Alice Johnson |
| 2           | Bob Smith     |
| 3           | Carol Davis   |  ← No orders yet
+-------------+---------------+

orders
+----------+-------------+----------+
| order_id | customer_id | product  |
+----------+-------------+----------+
| 1        | 1           | Laptop   |  ← Matches Alice
| 2        | 1           | Mouse    |  ← Matches Alice
| 4        | 2           | Monitor  |  ← Matches Bob
+----------+-------------+----------+

Matching rows:
β€’ Alice (customer_id=1) has 2 orders
β€’ Bob (customer_id=2) has 1 order

Non-matching row:
β€’ Carol (customer_id=3) has NO orders

What Different Joins Do

Join Type What It Returns Carol's Row?
INNER JOIN Only rows that match in BOTH tables ❌ Excluded
LEFT JOIN All from left table + matches from right βœ… Included (with NULLs)
RIGHT JOIN All from right table + matches from left Depends on direction
FULL OUTER JOIN All rows from both tables βœ… Included (with NULLs)

6. Common Table Relationships

One-to-Many (Most Common)

One row in table A relates to many rows in table B. This is the most common relationship type.

Examples:
β€’ One customer β†’ Many orders
β€’ One author β†’ Many books
β€’ One department β†’ Many employees
β€’ One category β†’ Many products

customers (1)  β†’  orders (Many)
   Alice      β†’  [Laptop, Mouse, Keyboard]
   Bob        β†’  [Monitor, Keyboard]

Many-to-Many

Many rows in table A relate to many rows in table B. Requires a junction table (also called bridge table or associative table).

Examples:
β€’ Students ↔ Courses (students take many courses, courses have many students)
β€’ Products ↔ Orders (products in many orders, orders contain many products)
β€’ Authors ↔ Books (authors write many books, books can have multiple authors)

students        enrollments        courses
  Alice    β†’    [Math, History]  ←  Math has [Alice, Bob, Carol]
  Bob      β†’    [Math, Science]  ←  Science has [Bob, David]
  Carol    β†’    [Math]           ←  History has [Alice]
-- Junction table for many-to-many
CREATE TABLE students (
    student_id INT PRIMARY KEY,
    name VARCHAR(255)
);

CREATE TABLE courses (
    course_id INT PRIMARY KEY,
    course_name VARCHAR(255)
);

CREATE TABLE enrollments (  -- Junction table
    enrollment_id INT PRIMARY KEY,
    student_id INT,
    course_id INT,
    FOREIGN KEY (student_id) REFERENCES students(student_id),
    FOREIGN KEY (course_id) REFERENCES courses(course_id)
);

One-to-One (Rare)

One row in table A relates to exactly one row in table B. Less common, often used for separating sensitive data or large optional fields.

Examples:
β€’ User β†’ User_Profile (extended info)
β€’ Employee β†’ Employee_Passport (sensitive data)
β€’ Product β†’ Product_Detailed_Specs (large optional text)

users (1)  ←→  user_profiles (1)
  Alice   ←→   [Bio, Avatar, Preferences]
  Bob     ←→   [Bio, Avatar, Preferences]

7. Real-World Join Scenarios

Scenario 1: E-commerce Order History

-- Question: "Show me all orders with customer names and emails"

SELECT 
    o.order_id,
    o.order_date,
    c.name AS customer_name,
    c.email,
    o.total_amount
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id
ORDER BY o.order_date DESC;

Scenario 2: Product Catalog with Categories

-- Question: "List all products with their category names"

SELECT 
    p.product_id,
    p.name AS product_name,
    p.price,
    cat.name AS category_name
FROM products p
JOIN categories cat ON p.category_id = cat.category_id
ORDER BY cat.name, p.name;

Scenario 3: Employee Department Lookup

-- Question: "Show employees with their department names"

SELECT 
    e.employee_id,
    e.name AS employee_name,
    e.position,
    d.name AS department_name,
    d.location
FROM employees e
JOIN departments d ON e.department_id = d.department_id;

8. Identifying When You Need a Join

You Need a Join When:

  • You need data from multiple tables in one result
  • You want to see related information together
  • You need to filter based on values in another table
  • You're answering questions that span table boundaries
  • You see foreign keys pointing to other tables

Question Analysis

Question Tables Needed Join Required?
"List all customers" customers ❌ No
"What did Alice order?" customers + orders βœ… Yes
"Show products by category" products + categories βœ… Yes
"Total sales per customer" customers + orders βœ… Yes
"How many orders today?" orders ❌ No

9. 🎯 Practice Exercises

Exercise 1: Identify Relationships

Given these table structures, identify the type of relationship:

authors (author_id, name)
books (book_id, title, author_id)

products (product_id, name)
orders (order_id, customer_id)
order_items (order_item_id, order_id, product_id)

users (user_id, username)
profiles (profile_id, user_id, bio, avatar)

Questions:

  • What's the relationship between authors and books?
  • What's the relationship between products and orders?
  • What's the relationship between users and profiles?
Show Solution

Answer 1: Authors and Books

One-to-Many relationship

  • One author can write many books
  • Each book has one author (based on author_id foreign key in books)
  • Foreign key: books.author_id β†’ authors.author_id

Answer 2: Products and Orders

Many-to-Many relationship

  • One product can appear in many orders
  • One order can contain many products
  • Junction table: order_items connects them
  • Foreign keys: order_items.order_id β†’ orders.order_id
  • Foreign keys: order_items.product_id β†’ products.product_id

Answer 3: Users and Profiles

One-to-One relationship

  • One user has exactly one profile
  • One profile belongs to exactly one user
  • Foreign key: profiles.user_id β†’ users.user_id
  • Typically profiles.user_id would be UNIQUE

Exercise 2: Write Your First Join

Given these tables:

CREATE TABLE departments (
    department_id INT PRIMARY KEY,
    name VARCHAR(100)
);

CREATE TABLE employees (
    employee_id INT PRIMARY KEY,
    name VARCHAR(100),
    department_id INT,
    salary DECIMAL(10, 2)
);

Task: Write a query to show each employee's name, salary, and their department name.

Show Solution
-- Solution
SELECT 
    e.name AS employee_name,
    e.salary,
    d.name AS department_name
FROM employees e
JOIN departments d ON e.department_id = d.department_id
ORDER BY d.name, e.name;

-- Breaking it down:
-- 1. FROM employees e: Start with employees table, alias 'e'
-- 2. JOIN departments d: Connect to departments table, alias 'd'
-- 3. ON e.department_id = d.department_id: Match on foreign key
-- 4. SELECT: Choose columns from both tables
-- 5. ORDER BY: Sort by department, then employee name

Key Points:

  • βœ… Aliases (e, d) make the query readable
  • βœ… ON clause specifies how tables relate
  • βœ… We can select from either table (e.name, d.name)
  • βœ… Result shows employee and department together

Exercise 3: Identify Join Needs

For each question, determine which tables you'd need and whether a join is required:

  • "How many products do we have?"
  • "Which customers have placed orders in 2024?"
  • "What's the total revenue per product category?"
  • "List all employees hired after 2020"
  • "Show orders with customer email addresses"
Show Solution

Question 1: "How many products do we have?"

  • Tables needed: products
  • Join required: ❌ No - single table query
  • Query: SELECT COUNT(*) FROM products;

Question 2: "Which customers have placed orders in 2024?"

  • Tables needed: customers, orders
  • Join required: βœ… Yes - need customer names from one table, order dates from another
  • Relationship: customers.customer_id = orders.customer_id

Question 3: "What's the total revenue per product category?"

  • Tables needed: categories, products, orders (or order_items)
  • Join required: βœ… Yes - need category names, product info, and sales data
  • Multiple joins needed!

Question 4: "List all employees hired after 2020"

  • Tables needed: employees
  • Join required: ❌ No - single table with WHERE clause
  • Query: SELECT * FROM employees WHERE hire_date > '2020-12-31';

Question 5: "Show orders with customer email addresses"

  • Tables needed: orders, customers
  • Join required: βœ… Yes - emails in customers, order data in orders
  • Relationship: orders.customer_id = customers.customer_id

πŸ“ Key Takeaways