Introduction to Relational Databases
You've learned how to query data with SQLβselecting, filtering, grouping, and analyzing. But where does that data come from? How is it organized? Why do we have multiple tables instead of one giant spreadsheet? Welcome to relational databasesβthe foundation that makes all your SQL queries possible. Understanding how data is structured and related is crucial for becoming a true database professional!
π What You'll Learn
- What a relational database is and why it's called "relational"
- The fundamental building blocks: tables, rows, and columns
- What primary keys are and why every table needs one
- How foreign keys create relationships between tables
- The difference between one-to-many, many-to-many, and one-to-one relationships
- Why we split data into multiple tables instead of one big table
- The problems with storing everything in a single table (spoiler: lots!)
- How relationships eliminate data redundancy
- Real-world examples of relational database design
- Why relational databases are the standard for most applications
1. What is a Relational Database?
A relational database is a way of organizing data into tables that are related to each other through shared values. It's called "relational" because tables can have relationships with other tables.
π‘ Real-World Analogy: Think of a library. You don't put all information on one giant card. Instead, you have:
β’ A Books table with book titles, authors, ISBNs
β’ An Authors table with author names, birth dates, countries
β’ A Members table with member names, contact info
β’ A Checkouts table connecting members to books they borrowed
These tables are relatedβyou can connect a book to its author, a checkout to a member. That's a relational database!
Why "Relational"?
The name comes from mathematical relations (set theory), but practically it means: tables are connected through relationships. A customer "has many" orders. An order "belongs to" a customer. These connections make relational databases powerful!
2. The Building Blocks: Tables, Rows, and Columns
Tables (Relations)
A table stores data about a specific entity (thing). Each table represents one type of object.
| Table Name | What It Stores | Example Columns |
|---|---|---|
| students | Student information | student_id, name, email, major |
| courses | Course details | course_id, title, credits, department |
| enrollments | Who's taking what | enrollment_id, student_id, course_id, grade |
Rows (Records/Tuples)
A row is a single entry in a table. Each row represents one instance of the entity.
Example: In the students table, each row is one student:
| student_id | name | major | |
|---|---|---|---|
| 1 | Alice Johnson | alice@university.edu | Computer Science |
| 2 | Bob Smith | bob@university.edu | Biology |
| 3 | Carol Davis | carol@university.edu | Mathematics |
Each colored row represents one student with all their information.
Columns (Fields/Attributes)
A column is a specific piece of information stored for each row. Columns define what kind of data the table stores.
Example columns in students table:
β’ student_id β Unique identifier (number)
β’ name β Student's full name (text)
β’ email β Contact email (text)
β’ major β Field of study (text)
3. Primary Keys - The Unique Identifier
A primary key is a column (or combination of columns) that uniquely identifies each row in a table. No two rows can have the same primary key value.
β‘ Golden Rule: Every table MUST have a primary key! It's the unique identifier that distinguishes one row from another. Without it, you can't reliably reference specific records.
Why Primary Keys Matter
| Without Primary Key | With Primary Key |
|---|---|
|
Problem: Two students named "John Smith" How do you specify which one? |
Solution: student_id 42 and student_id 73 Clear, unique reference! |
Example: students Table with Primary Key
| π student_id (PRIMARY KEY) | name | |
|---|---|---|
| 100234 | Alice Johnson | alice@university.edu |
| 100567 | Bob Smith | bob@university.edu |
| 100891 | Carol Davis | carol@university.edu |
Primary Key Rules
- Unique: No two rows can have the same primary key value
- Not NULL: Primary key cannot be empty/NULL
- Unchanging: Primary key values should never change
- Simple: Usually a single column (like student_id), though composite keys exist
π‘ Pro Tip: Most tables use an auto-incrementing integer as the primary key (1, 2, 3, 4...). This is simple, efficient, and guaranteed unique. You'll often see columns named id, student_id, order_id, etc.
4. Foreign Keys - Creating Relationships
A foreign key is a column in one table that references the primary key of another table. This creates a relationship between the two tables.
Example: Students and Enrollments
Students Table
| π student_id | name | major |
|---|---|---|
| 100234 | Alice Johnson | Computer Science |
| 100567 | Bob Smith | Biology |
Enrollments Table
| π enrollment_id | π student_id (FK) | course_code | semester |
|---|---|---|---|
| 1001 | 100234 | CS101 | Fall 2024 |
| 1002 | 100234 | CS201 | Fall 2024 |
| 1003 | 100567 | BIO101 | Fall 2024 |
The student_id column in the enrollments table is a foreign key that references the student_id primary key in the students table. This creates a relationship: "Each enrollment belongs to a student."
π Reading the Relationship:
Enrollment 1001 has student_id = 100234 β This enrollment belongs to Alice Johnson
Enrollment 1002 has student_id = 100234 β This enrollment also belongs to Alice Johnson
Enrollment 1003 has student_id = 100567 β This enrollment belongs to Bob Smith
5. Types of Relationships
One-to-Many (Most Common)
One record in Table A relates to many records in Table B.
Examples:
β’ One customer has many orders
β’ One student has many enrollments
β’ One author writes many books
β’ One department has many employees
1 Student
Many Enrollments
Many-to-Many
Many records in Table A relate to many records in Table B. This requires a junction table (also called a join table or bridge table).
Example: Students and Courses
β’ One student can take many courses
β’ One course can have many students
β’ Solution: Create an enrollments table that connects students to courses
Students
Enrollments
(Junction Table)
Courses
One-to-One (Rare)
One record in Table A relates to exactly one record in Table B.
Examples:
β’ One person has one passport
β’ One employee has one desk assignment
β’ One country has one capital city
6. Why Not Store Everything in One Big Table?
Great question! Why go through the trouble of multiple tables? Let's look at what happens if we don't.
β Bad Design: Everything in One Table
| student_id | student_name | student_email | course_code | course_title | credits |
|---|---|---|---|---|---|
| 100234 | Alice Johnson | alice@university.edu | CS101 | Intro to Programming | 3 |
| 100234 | Alice Johnson | alice@university.edu | CS201 | Data Structures | 4 |
| 100234 | Alice Johnson | alice@university.edu | MATH150 | Calculus I | 4 |
Problems with This Design:
- Data Redundancy: Alice's name and email are repeated 3 times! (highlighted in yellow)
- Update Anomalies: If Alice changes her email, you must update it in 3 places. Miss one? Data inconsistency!
- Insert Anomalies: Can't add a new course unless a student enrolls in it
- Delete Anomalies: If Alice drops all courses, you lose her student record entirely
- Wasted Space: Same information stored multiple times wastes disk space
- Query Complexity: Harder to get "all students" without duplicates
β Good Design: Separate Tables with Relationships
Students Table
| student_id | name | |
|---|---|---|
| 100234 | Alice Johnson | alice@university.edu |
β Alice's info stored ONCE!
Courses Table
| course_id | course_code | title | credits |
|---|---|---|---|
| 5001 | CS101 | Intro to Programming | 3 |
| 5002 | CS201 | Data Structures | 4 |
| 5003 | MATH150 | Calculus I | 4 |
Enrollments Table (Junction)
| enrollment_id | student_id (FK) | course_id (FK) |
|---|---|---|
| 1001 | 100234 | 5001 |
| 1002 | 100234 | 5002 |
| 1003 | 100234 | 5003 |
β Only IDs stored (minimal redundancy)
Benefits of This Design:
- β No Redundancy: Each piece of information stored once
- β Easy Updates: Change Alice's email in ONE place
- β Data Integrity: Can add courses without students, students without courses
- β Space Efficient: Store IDs (numbers) instead of repeating text
- β Flexible Queries: Easy to get all students, all courses, or relationships
7. Real-World Example: E-Commerce Database
Let's see how a real e-commerce system might be structured:
Tables and Their Relationships
customers (customer_id, name, email, address)
β One customer can place many orders
orders (order_id, customer_id FK, order_date, total_amount)
β One order contains many products (through order_items)
products (product_id, name, description, price, stock_quantity)
β One product can appear in many orders (through order_items)
order_items (item_id, order_id FK, product_id FK, quantity, price_at_time)
β Junction table connecting orders and products
customers (1) ββ (many) orders (many) ββ (many) products
(Connected through order_items junction table)
π Key Takeaways
- Relational databases organize data into related tables
- Tables consist of rows (records) and columns (fields)
- Every table needs a primary key (unique identifier)
- Foreign keys create relationships between tables
- Three relationship types: one-to-many (most common), many-to-many (junction table), one-to-one (rare)
- Splitting data into multiple tables eliminates redundancy
- Good design prevents update, insert, and delete anomalies
- Relationships allow flexible queries while maintaining data integrity
- Most real-world applications use relational databases (MySQL, PostgreSQL, Oracle, SQL Server)
- Understanding relationships is essential for writing effective SQL queries with JOINs
π What's Next?
Excellent work! You now understand the foundation of relational databasesβhow tables are structured and connected. But we only scratched the surface of what can go wrong with poor database design. Next, you'll learn about data quality and structure problemsβthe specific issues that arise when data isn't properly organized, and why normalization is critical for preventing them.
Up next: Understanding Data Quality & Structure