Title: 8.3-SQL-Case-Statement-JOINs

Date: 2025-10-26 12:01

Status: Child

Tag: data types Data Analysis SQL Query

SQL Advanced Concepts: Aggregate Functions, CASE Statements, and JOINs

Overview

  • Advanced SQL concepts covered:
    • Aggregate functions with CASE statements
    • JOIN operations
    • UNION statements
  • Concepts are foundational for:
    • Working with relational databases
    • Performing complex data analysis

CASE Statements

Understanding CASE Statements

CASE statements in SQL function similarly to IF statements in Excel. They allow you to implement conditional logic within your queries.

Basic Syntax:

CASE 
    WHEN condition THEN result
    ELSE alternative_result
END

Example 1: Store-Wise Inventory Count

Business Question: How many inventory items are in each store?

SELECT 
    film_id,
    COUNT(CASE WHEN store_id = 1 THEN inventory_id END) AS store_1,
    COUNT(CASE WHEN store_id = 2 THEN inventory_id END) AS store_2,
    COUNT(inventory_id) AS total
FROM inventory
GROUP BY film_id
ORDER BY total DESC;

Explanation:

  • Uses CASE to conditionally count inventory by store
  • GROUP BY film_id aggregates data for each film
  • ORDER BY total DESC sorts results from highest to lowest total inventory

Example 2: Active and Inactive Customers by Store

Business Question: How many active and inactive customers does each store have?

SELECT 
    store_id,
    COUNT(CASE WHEN active = 1 THEN active ELSE NULL END) AS active_customers,
    COUNT(CASE WHEN active = 0 THEN active ELSE NULL END) AS inactive_customers,
    COUNT(*) AS total_customers
FROM customer
GROUP BY store_id;

Key Points:

  • The ELSE NULL clause ensures only matching conditions are counted
  • NULL values are not counted by the COUNT function
  • Total customers can be verified by summing active and inactive counts

JOIN Statements

Understanding JOINs

JOIN statements combine data from multiple tables based on related columns. This is similar to using lookup functions in Excel or relationship modeling in Power Query.

Common Types of JOINs:

  • INNER JOIN: Returns only matching records from both tables
  • LEFT JOIN: Returns all records from left table and matching records from right table (for vlookup how we use IFERROR function to show ”)
  • RIGHT JOIN: Returns all records from right table and matching records from left table
  • FULL OUTER JOIN: Returns all records when there is a match in either table

Table Naming Convention

When using JOINs, always specify the table name before the column:

table_name.column_name

This prevents ambiguity when multiple tables contain columns with the same name.

INNER JOIN Examples

Example 1: Inventory and Rental Relationship

Business Question: Which inventory items have been rented?

SELECT 
    inventory.inventory_id,
    rental.rental_id
FROM inventory
INNER JOIN rental ON inventory.inventory_id = rental.inventory_id;

Result: Returns 4,080 records showing inventory items that have been rented.

Example 2: Film Details with Inventory

Business Question: Show film titles and descriptions with their inventory details.

SELECT 
    inventory.inventory_id,
    inventory.store_id,
    film.title,
    film.description
FROM inventory
INNER JOIN film ON inventory.film_id = film.film_id;

Combining JOINs with Aggregate Functions

Example: Actor Film Count

Business Question: How many films has each actor appeared in?

SELECT 
    actor.actor_id,
    actor.first_name,
    actor.last_name,
    COUNT(film_actor.film_id) AS number_of_films
FROM actor
LEFT JOIN film_actor ON actor.actor_id = film_actor.actor_id
GROUP BY actor.actor_id, actor.first_name, actor.last_name
ORDER BY number_of_films DESC;

Using HAVING with JOINs

Business Question: Which actors have appeared in more than 30 films?

SELECT 
    actor.actor_id,
    actor.first_name,
    actor.last_name,
    COUNT(film_actor.film_id) AS number_of_films
FROM actor
LEFT JOIN film_actor ON actor.actor_id = film_actor.actor_id
GROUP BY actor.actor_id, actor.first_name, actor.last_name
HAVING COUNT(film_actor.film_id) >= 30
ORDER BY number_of_films DESC;

Key Point: Use HAVING for conditional filtering on aggregated data (after GROUP BY), while WHERE filters before aggregation.

LEFT JOIN vs INNER JOIN

Key Differences

INNER JOIN:

  • Returns only records that have matches in both tables
  • Excludes unmatched records

LEFT JOIN:

  • Returns all records from the left table
  • Includes matching records from the right table
  • Shows NULL for right table columns when no match exists

Example Comparison

-- INNER JOIN: Returns 4,080 records (only rented inventory)
SELECT 
    inventory.inventory_id,
    rental.rental_id
FROM inventory
INNER JOIN rental ON inventory.inventory_id = rental.inventory_id;
 
-- LEFT JOIN: Returns 4,551 records (all inventory, including not rented)
SELECT 
    inventory.inventory_id,
    rental.rental_id
FROM inventory
LEFT JOIN rental ON inventory.inventory_id = rental.inventory_id;

The difference of 471 records represents inventory items that have never been rented.

Visualizing JOINs

INNER JOIN: Returns only the intersection (common data) between tables

LEFT JOIN: Returns all data from left table plus matching data from right table (with NULL for non-matches)

RIGHT JOIN: Returns all data from right table plus matching data from left table (rarely used in practice)

Multiple Table JOINs

Three-Table JOIN Example

Business Question: What category does each film belong to?

Table Structure:

  • film table: Contains film_id, title, description
  • film_category table: Bridge table with film_id and category_id
  • category table: Contains category_id and category name
SELECT 
    film.title,
    film.description,
    category.name AS category_name
FROM film
INNER JOIN film_category ON film.film_id = film_category.film_id
INNER JOIN category ON film_category.category_id = category.category_id;

Key Concept: The film_category table acts as a bridge, connecting films to their categories.

UNION Statements

Understanding UNION

UNION combines results from two or more SELECT statements into a single result set. This is similar to the “Append Queries” function in Power Query.

Requirements:

  • All SELECT statements must have the same number of columns
  • Columns must have compatible data types
  • Column names can be different but should represent similar data

Example: Combining Advisors and Investors

SELECT 
    'Advisor' AS type,
    first_name,
    last_name
FROM advisor
 
UNION
 
SELECT 
    'Investor' AS type,
    first_name,
    last_name
FROM investor;

Result: A single table showing all advisors and investors with a type indicator.

Complex Query Example: Customer Lifetime Value

Business Question

Identify the most valuable customers by calculating:

  1. Customer names
  2. Total lifetime rentals
  3. Total payment amount

Required Tables

  • customer: Contains customer names
  • rental: Contains rental history
  • payment: Contains payment amounts

Solution

SELECT 
    customer.first_name,
    customer.last_name,
    COUNT(rental.rental_id) AS lifetime_rentals,
    SUM(payment.amount) AS total_payment
FROM customer
LEFT JOIN rental ON customer.customer_id = rental.customer_id
LEFT JOIN payment ON rental.rental_id = payment.rental_id
GROUP BY customer.customer_id, customer.first_name, customer.last_name
ORDER BY total_payment DESC;

Query Breakdown:

  1. Start with customer table to get all customers
  2. LEFT JOIN rental to count lifetime rentals
  3. LEFT JOIN payment to sum total payments
  4. GROUP BY customer details for aggregation
  5. ORDER BY total payment to identify most valuable customers

Key Takeaways

CASE Statements

  • Use for conditional logic within queries
  • Combine with aggregate functions for powerful analysis
  • Always specify NULL in ELSE clause when using with COUNT

JOIN Operations

  • INNER JOIN: Use when you only want matching records
  • LEFT JOIN: Use when you need all records from one table
  • RIGHT JOIN: Rarely used; rearrange tables and use LEFT JOIN instead
  • Always specify table names with columns to avoid ambiguity

Aggregate Functions with JOINs

  • Use GROUP BY when applying aggregate functions
  • Use HAVING for filtering aggregated results
  • Combine multiple layers (JOINs, aggregates, conditionals) for complex analysis

UNION

  • Use to combine results from multiple tables
  • Ensure column compatibility across all SELECT statements
  • Add type indicators to distinguish source tables

Practice Tips

  1. Always test queries incrementally (build complexity gradually)
  2. Use meaningful aliases for better readability
  3. Comment your queries to document logic
  4. Verify results against expected outcomes
  5. Practice with real-world business questions

References