Basic String Functions
So far you've worked with text data as-is: selecting it, filtering it, searching patterns. But what if you need to transform that text? Convert names to uppercase for consistency? Extract area codes from phone numbers? Combine first and last names? SQL provides powerful string functions that let you manipulate and transform text data on the fly!
π What You'll Learn
- How SQL manipulates text using built-in functions
- Converting text to UPPER() and LOWER() case
- Measuring text length with LENGTH() or LEN()
- Extracting portions of text with SUBSTRING()
- Trimming whitespace with TRIM(), LTRIM(), RTRIM()
- Joining strings together with CONCAT() or ||
- Replacing text with REPLACE()
- How string functions work with WHERE and LIKE
- Database differences (MySQL vs PostgreSQL vs SQL Server)
- Common mistakes beginners make
- Real-world practical examples
- Hands-on practice exercises
1. What Are String Functions?
String functions are built-in tools that transform text data. They take text as input and return modified text as output.
π‘ Real-World Example: Imagine you're building a user registration system. Users type their names in all different ways: "john", "JOHN", "JoHn". Before storing or comparing names, you might convert them all to lowercase using LOWER() for consistency. That's exactly what string functions doβthey clean, standardize, and manipulate text!
Basic Syntax
SELECT FUNCTION_NAME(column_name)
FROM table_name;
The function wraps around the column name and transforms its value.
2. UPPER() and LOWER() - Changing Case
These functions convert text to all uppercase or all lowercase letters.
UPPER() - Convert to Uppercase
SELECT
name,
UPPER(name) AS name_uppercase
FROM students;
Result:
| name | name_uppercase |
|---|---|
| Alice Johnson | ALICE JOHNSON |
| bob smith | BOB SMITH |
| Carol DAVIS | CAROL DAVIS |
LOWER() - Convert to Lowercase
SELECT
email,
LOWER(email) AS email_lowercase
FROM students;
Result:
| email_lowercase | |
|---|---|
| Alice@UNIVERSITY.EDU | alice@university.edu |
| BOB@college.EDU | bob@college.edu |
Using with WHERE for Case-Insensitive Searches
-- Find "alice" regardless of how it's capitalized
SELECT name, email
FROM students
WHERE LOWER(email) LIKE '%alice%';
This matches: ALICE@..., alice@..., Alice@..., aLiCe@...
π‘ Pro Tip: UPPER() and LOWER() are essential for case-insensitive comparisons. Instead of checking for "Alice", "alice", "ALICE" separately, just convert to lowercase and compare once!
3. LENGTH() - Measuring Text Length
LENGTH() returns the number of characters in a string.
β οΈ Database Difference: In SQL Server, use LEN() instead of LENGTH(). MySQL and PostgreSQL use LENGTH().
Basic Usage
SELECT
name,
LENGTH(name) AS name_length
FROM students;
Result:
| name | name_length |
|---|---|
| Alice Johnson | 13 |
| Bob Smith | 9 |
| Carol Davis | 11 |
Using with WHERE
-- Find students with short names (less than 10 characters)
SELECT name
FROM students
WHERE LENGTH(name) < 10;
Practical Example: Validating Input
-- Find invalid phone numbers (not exactly 12 characters: XXX-XXX-XXXX)
SELECT name, phone
FROM students
WHERE LENGTH(phone) != 12;
4. SUBSTRING() - Extracting Portions of Text
SUBSTRING() extracts a portion of a string. Perfect for extracting area codes, initials, or any fixed-position data.
Syntax Variations
MySQL / PostgreSQL
SUBSTRING(string, start, length)
SQL Server
SUBSTRING(string, start, length)
Parameters:
- string: The text to extract from
- start: Starting position (1 = first character)
- length: How many characters to extract
Example: Extracting Initials
-- Get first initial (first character)
SELECT
name,
SUBSTRING(name, 1, 1) AS first_initial
FROM students;
Result:
| name | first_initial |
|---|---|
| Alice Johnson | A |
| Bob Smith | B |
| Carol Davis | C |
Example: Extracting Area Code
-- Extract area code from phone (555-123-4567)
SELECT
phone,
SUBSTRING(phone, 1, 3) AS area_code
FROM students;
Result:
| phone | area_code |
|---|---|
| 555-123-4567 | 555 |
| 800-555-1234 | 800 |
Example: Extracting Middle Characters
-- Extract characters 5-9 from email
SELECT
email,
SUBSTRING(email, 5, 5) AS middle_part
FROM students
WHERE email = 'alice@university.edu';
Starting at position 5 ('e'), extract 5 characters: 'e@uni'
5. TRIM(), LTRIM(), RTRIM() - Removing Whitespace
These functions remove extra spaces from strings. Critical for cleaning user input!
- TRIM(): Removes spaces from both ends
- LTRIM(): Removes spaces from left (beginning)
- RTRIM(): Removes spaces from right (end)
Example: Cleaning Input
SELECT
name,
TRIM(name) AS name_cleaned,
LENGTH(name) AS original_length,
LENGTH(TRIM(name)) AS cleaned_length
FROM students;
Visualizing the Difference:
| name | name_cleaned | original_length | cleaned_length |
|---|---|---|---|
| " Alice " | "Alice" | 9 | 5 |
| "Bob " | "Bob" | 6 | 3 |
π‘ Real-World Use: Users often accidentally add spaces when typing. "Alice " and "Alice" should be treated the same! TRIM() ensures consistent comparisons and storage.
6. CONCAT() - Joining Strings Together
CONCAT() combines multiple strings into one. Essential for creating full names, formatted addresses, or custom messages.
Syntax Variations
MySQL / SQL Server
CONCAT(string1, string2, ...)
PostgreSQL / Oracle
string1 || string2 || ...
Example: Creating Full Names
-- MySQL / SQL Server
SELECT
first_name,
last_name,
CONCAT(first_name, ' ', last_name) AS full_name
FROM students;
-- PostgreSQL / Oracle
SELECT
first_name,
last_name,
first_name || ' ' || last_name AS full_name
FROM students;
Result:
| first_name | last_name | full_name |
|---|---|---|
| Alice | Johnson | Alice Johnson |
| Bob | Smith | Bob Smith |
Example: Creating Formatted Messages
SELECT
name,
gpa,
CONCAT(name, ' has a GPA of ', gpa) AS message
FROM students;
Result:
| message |
|---|
| Alice Johnson has a GPA of 3.8 |
| Bob Smith has a GPA of 3.6 |
7. REPLACE() - Substituting Text
REPLACE() finds text within a string and replaces it with different text.
Syntax
REPLACE(string, old_text, new_text)
Example: Standardizing Phone Formats
-- Remove dashes from phone numbers
SELECT
phone,
REPLACE(phone, '-', '') AS phone_no_dashes
FROM students;
Result:
| phone | phone_no_dashes |
|---|---|
| 555-123-4567 | 5551234567 |
| 800-555-1234 | 8005551234 |
Example: Fixing Typos in Data
-- Fix common misspelling
SELECT
major,
REPLACE(major, 'Computor', 'Computer') AS major_corrected
FROM students;
8. Combining Functions for Complex Transformations
You can nest functions inside each other for powerful transformations:
Example: Clean and Format Names
-- Trim spaces, then convert to proper case (first letter upper)
SELECT
name,
UPPER(SUBSTRING(TRIM(name), 1, 1)) AS first_letter,
LOWER(SUBSTRING(TRIM(name), 2)) AS rest
FROM students;
Example: Extract and Format Area Code
-- Extract area code and format with parentheses
SELECT
phone,
CONCAT('(', SUBSTRING(phone, 1, 3), ')') AS area_code_formatted
FROM students;
Result:
| phone | area_code_formatted |
|---|---|
| 555-123-4567 | (555) |
9. Using String Functions with WHERE and LIKE
String functions are incredibly useful in WHERE clauses for flexible filtering:
Example: Case-Insensitive Search
-- Find "smith" regardless of capitalization
SELECT name
FROM students
WHERE LOWER(name) LIKE '%smith%';
Example: Find Students by Name Length
-- Find students with very long names
SELECT name, LENGTH(name) AS name_length
FROM students
WHERE LENGTH(name) > 15
ORDER BY LENGTH(name) DESC;
Example: Find by Area Code
-- Find students in 555 area code using SUBSTRING
SELECT name, phone
FROM students
WHERE SUBSTRING(phone, 1, 3) = '555';
10. Common Mistakes When Using String Functions
Mistake #1: Wrong Function Name for Database
-- β WRONG in SQL Server
SELECT LENGTH(name) FROM students;
-- β
CORRECT in SQL Server
SELECT LEN(name) FROM students;
Mistake #2: Forgetting String Positions Start at 1
-- β WRONG: Position 0 doesn't exist in SQL!
SELECT SUBSTRING(name, 0, 1) FROM students;
-- β
CORRECT: Positions start at 1
SELECT SUBSTRING(name, 1, 1) FROM students;
Mistake #3: Not Handling NULL
-- β οΈ If name is NULL, result is NULL
SELECT UPPER(name) FROM students;
-- β
BETTER: Handle NULL explicitly
SELECT COALESCE(UPPER(name), 'No name') FROM students;
Mistake #4: Wrong CONCAT Syntax
-- β WRONG in MySQL (|| doesn't work)
SELECT first_name || ' ' || last_name FROM students;
-- β
CORRECT in MySQL
SELECT CONCAT(first_name, ' ', last_name) FROM students;
Mistake #5: Performance Issues with Functions in WHERE
-- β SLOW: Function prevents index usage
WHERE UPPER(email) = 'ALICE@EXAMPLE.COM'
-- β
FASTER: If possible, store uppercase version
-- or use case-insensitive collation
β οΈ Performance Note: Using functions on columns in WHERE clauses prevents the database from using indexes. For production systems with large datasets, consider storing pre-processed versions or using function-based indexes.
11. π― Hands-On Practice Exercises
Exercise 1: Convert to Uppercase
Task: Display all student names in UPPERCASE.
Show Answer
SELECT
name,
UPPER(name) AS name_uppercase
FROM students;
Exercise 2: Count Characters
Task: Find students whose names are longer than 12 characters.
Show Answer
SELECT name, LENGTH(name) AS name_length
FROM students
WHERE LENGTH(name) > 12;
Exercise 3: Extract Initials
Task: Extract the first letter (initial) of each student's name.
Show Answer
SELECT
name,
SUBSTRING(name, 1, 1) AS initial
FROM students;
Exercise 4: Create Full Names
Task: Combine first_name and last_name columns into a single full_name column.
Show Answer
-- MySQL / SQL Server
SELECT
first_name,
last_name,
CONCAT(first_name, ' ', last_name) AS full_name
FROM students;
Exercise 5: Case-Insensitive Search
Task: Find all students whose email contains "GMAIL" (case-insensitive).
Show Answer
SELECT name, email
FROM students
WHERE LOWER(email) LIKE '%gmail%';
Exercise 6: Remove Formatting
Task: Display phone numbers without dashes.
Show Answer
SELECT
phone,
REPLACE(phone, '-', '') AS phone_digits_only
FROM students;
Exercise 7: Extract Domain from Email
Task: Extract everything after the @ symbol from email addresses. (Hint: This is tricky! You'll need to find the @ position first.)
Show Answer
-- This requires functions we haven't covered yet (LOCATE/POSITION)
-- Simplified version for fixed-format emails:
SELECT
email,
SUBSTRING(email, LOCATE('@', email) + 1) AS domain
FROM students;
Note: LOCATE() finds the position of '@', then SUBSTRING extracts everything after it. This is an advanced combination!
Challenge Exercise: Clean and Format
Task: Take names with extra spaces, trim them, convert to lowercase, then extract just the first 3 characters.
Show Answer
SELECT
name,
SUBSTRING(LOWER(TRIM(name)), 1, 3) AS name_code
FROM students;
Breaking it down: TRIM removes spaces, LOWER converts to lowercase, SUBSTRING extracts first 3 chars. Functions are applied inside-out!
π Key Takeaways
- UPPER() and LOWER() change text case (essential for case-insensitive searches)
- LENGTH() or LEN() measures string length in characters
- SUBSTRING() extracts portions of text (positions start at 1!)
- TRIM(), LTRIM(), RTRIM() remove whitespace
- CONCAT() or || joins strings together
- REPLACE() substitutes text within strings
- Functions can be nested for complex transformations
- String functions work with WHERE and LIKE for flexible filtering
- Syntax varies between databasesβcheck your database documentation!
- Using functions on columns in WHERE can impact performance
Database Syntax Differences
- Length: MySQL/PostgreSQL use LENGTH(), SQL Server uses LEN()
- Concatenation: MySQL/SQL Server use CONCAT(), PostgreSQL/Oracle use ||
- Substring: All use SUBSTRING(), but syntax may vary slightly
π What's Next?
Fantastic work learning string manipulation! You can now transform and clean text data like a pro. Next, you'll dive into aggregate functions (COUNT, SUM, AVG, MIN, MAX), which let you analyze and summarize your data. Want to calculate totals, averages, or find highest/lowest values? That's coming up!
Up next: Aggregate Functions - Analyzing Data with COUNT, SUM, AVG