Overview
SQL is no longer just a reporting language.
In modern applications—payments, filings, customer analytics, fraud detection, and compliance—SQL directly powers business decisions.
Bad SQL can:
- Slow down production databases
- Inflate AWS RDS / Redshift bills
- Break dashboards
- Produce wrong financial numbers
Good SQL can:
- Make systems faster
- Reduce cloud cost
- Improve developer productivity
- Build trust in business data
This tutorial teaches how SQL should be written in real systems, not just how to make queries run.
What You Will Learn
By the end of this guide, you will know how to:
- Write clean, readable, professional SQL
- Fetch only the data you actually need
- Break complex logic into CTEs
- Avoid slow subqueries
- Write index-friendly filters
- Prevent duplicate data and wrong metrics
- Use
COUNTcorrectly in business reports
This is designed for:
Our Real-World Data Model
We will use a realistic business schema:
customers(customer_id, name, email, state)
orders(order_id, customer_id, order_date, status, total_amount)
payments(payment_id, order_id, payment_date, amount, payment_status)
Think of this like:
Business Requirement
Show total revenue and number of paid orders per state for the last 30 days.
This is exactly the kind of query used in finance, growth, and compliance dashboards.
Write Clean & Formatted SQL
❌ Bad (hard to read, risky to change)
select c.state,sum(p.amount),count(o.order_id)
from customers c,orders o,payments p
where c.customer_id=o.customer_id and o.order_id=p.order_id
and date(p.payment_date)>=date(now()-interval 30 day)
group by c.state;
✅ Good (production-grade)
SELECT
c.state,
SUM(p.amount) AS total_revenue,
COUNT(o.order_id) AS paid_orders
FROM customers c
JOIN orders o
ON c.customer_id = o.customer_id
JOIN payments p
ON o.order_id = p.order_id
WHERE p.payment_date >= CURRENT_DATE - INTERVAL 30 DAY
AND p.payment_status = 'SUCCESS'
GROUP BY c.state;
Clean SQL saves hours during:
Fetch Only What You Need
❌ Bad
SELECT *
FROM orders
WHERE order_date >= CURRENT_DATE - INTERVAL 30 DAY;
Large tables may have 50+ columns → slow & expensive.
✅ Good
SELECT
order_id,
customer_id,
total_amount,
status
FROM orders
WHERE order_date >= CURRENT_DATE - INTERVAL 30 DAY;
Less data = faster queries = lower cloud cost.
Split Logic Using CTEs
When queries grow, CTEs keep them readable.
Business rule:
WITH paid_orders AS (
SELECT
o.order_id,
o.customer_id
FROM orders o
JOIN payments p
ON o.order_id = p.order_id
WHERE p.payment_status = 'SUCCESS'
AND p.payment_date >= CURRENT_DATE - INTERVAL 30 DAY
AND o.status = 'COMPLETED'
),
state_revenue AS (
SELECT
c.state,
p.amount
FROM paid_orders po
JOIN customers c
ON po.customer_id = c.customer_id
JOIN payments p
ON po.order_id = p.order_id
)
SELECT
state,
SUM(amount) AS total_revenue
FROM state_revenue
GROUP BY state;
CTEs let teams:
Prefer JOINs Over Subqueries
❌ Bad (runs repeatedly per row)
SELECT
c.state,
(SELECT SUM(p.amount)
FROM payments p
WHERE p.order_id IN (
SELECT o.order_id
FROM orders o
WHERE o.customer_id = c.customer_id
)
) AS revenue
FROM customers c;
✅ Good
SELECT
c.state,
SUM(p.amount) AS revenue
FROM customers c
JOIN orders o
ON c.customer_id = o.customer_id
JOIN payments p
ON o.order_id = p.order_id
GROUP BY c.state;
Fewer scans = faster dashboards.
Write Index-Friendly Filters
Assume payment_date is indexed.❌ Bad
WHERE DATE(payment_date) = '2026-01-01'
This disables the index.
✅ Good
WHERE payment_date >= '2026-01-01'
AND payment_date < '2026-01-02'
This uses the index → huge performance gain.
Don’t Use DISTINCT to Hide Bad Joins
❌ Bad
SELECT DISTINCT c.customer_id
FROM customers c
JOIN orders o
ON c.customer_id = o.customer_id
JOIN payments p
ON o.order_id = p.order_id;
This hides data duplication.
✅ Correct
SELECT c.customer_id
FROM customers c
WHERE EXISTS (
SELECT 1
FROM orders o
JOIN payments p
ON o.order_id = p.order_id
WHERE o.customer_id = c.customer_id
AND p.payment_status = 'SUCCESS'
);
This returns only real paying customers.
COUNT(*) vs COUNT(column)
SELECT
COUNT(*) AS total_orders,
COUNT(payment_id) AS paid_orders
FROM orders o
LEFT JOIN payments p
ON o.order_id = p.order_id;
If 100 orders exist but only 80 are paid:
COUNT(*) = 100COUNT(payment_id) = 80
Using the wrong one leads to wrong revenue reports.
Why This Matters
In production systems:
- Finance uses these numbers
- Marketing runs campaigns
- Compliance files reports
- Leadership makes decisions
One wrong query can mean:
- Wrong revenue
- Wrong growth numbers
- Wrong business strategy
Good SQL = trustworthy data.
