SQL Window Functions: Running Totals, Rankings, Lag/Lead
How do I use SQL window functions (running totals, rankings, lag/lead)?
TL;DR
- Bottom line: Window functions compute values across related rows without collapsing them -- use
function() OVER (PARTITION BY ... ORDER BY ...)for rankings, running totals, and row comparisons. - Key tool/command:
ROW_NUMBER() OVER (PARTITION BY group_col ORDER BY sort_col) - Watch out for: Default frame with ORDER BY is
RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, which silently gives cumulative (not full-partition) results for SUM/AVG. - Works with: PostgreSQL 8.4+, MySQL 8.0+, SQL Server 2005+, Oracle 8i+, SQLite 3.25+, MariaDB 10.2+.
Constraints
- Window functions can only appear in SELECT and ORDER BY clauses -- never in WHERE, GROUP BY, or HAVING. To filter on a window function result, wrap it in a CTE or subquery.
- MySQL requires 8.0+ for window function support; MariaDB requires 10.2+. Earlier versions must use self-joins or variables.
- LAST_VALUE with the default frame returns the current row's value, not the partition's last row. Always specify
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING. - Window functions execute after WHERE, GROUP BY, and HAVING in SQL's logical processing order. They see the filtered/grouped result set, not raw table data.
- Adding ORDER BY inside OVER() silently changes the default frame from the entire partition to
RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. This causes SUM/AVG to return cumulative, not total, results.
Quick Reference
| Scenario | Function | Syntax | Frame Needed | Trade-off |
|---|---|---|---|---|
| Unique row numbering | ROW_NUMBER() | ROW_NUMBER() OVER (ORDER BY col) | No | Arbitrary tiebreaker for duplicates |
| Ranking with gaps | RANK() | RANK() OVER (ORDER BY col DESC) | No | Tied rows get same rank, next rank skipped |
| Ranking without gaps | DENSE_RANK() | DENSE_RANK() OVER (ORDER BY col DESC) | No | Tied rows get same rank, no gap |
| Running total | SUM() OVER | SUM(amt) OVER (ORDER BY dt ROWS UNBOUNDED PRECEDING) | Yes | Must specify ROWS to avoid RANGE default peer behavior |
| Running average | AVG() OVER | AVG(amt) OVER (ORDER BY dt ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) | Yes | Sliding window -- adjust N for smoothing |
| Previous row value | LAG() | LAG(col, 1, default) OVER (ORDER BY dt) | No | Returns NULL if no prior row; set default |
| Next row value | LEAD() | LEAD(col, 1, default) OVER (ORDER BY dt) | No | Returns NULL if no following row; set default |
| First value in partition | FIRST_VALUE() | FIRST_VALUE(col) OVER (PARTITION BY grp ORDER BY dt) | Implicit | Default frame is fine for first value |
| Last value in partition | LAST_VALUE() | LAST_VALUE(col) OVER (... ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) | Yes, always | Default frame WRONG -- must override |
| Nth value in partition | NTH_VALUE() | NTH_VALUE(col, 2) OVER (ORDER BY dt ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) | Yes | Not available in SQL Server; use OFFSET/FETCH |
| Split into N buckets | NTILE() | NTILE(4) OVER (ORDER BY score DESC) | No | Uneven buckets when row count not divisible by N |
| Percentile rank | PERCENT_RANK() | PERCENT_RANK() OVER (ORDER BY score) | No | Range 0.0-1.0; first row always 0 |
| Cumulative distribution | CUME_DIST() | CUME_DIST() OVER (ORDER BY score) | No | Range >0.0-1.0; fraction of rows <= current |
| Deduplicate rows | ROW_NUMBER() filter | WHERE rn = 1 (in CTE) | No | Pick one row per group; deterministic ORDER BY needed |
Decision Tree
START
|-- Need to assign a position/number to each row?
| |-- YES: Need unique numbers (no ties)?
| | |-- YES --> ROW_NUMBER()
| | +-- NO: Need gaps after ties?
| | |-- YES --> RANK()
| | +-- NO --> DENSE_RANK()
| +-- NO |
|-- Need to access a different row's value?
| |-- YES: Previous row?
| | |-- YES --> LAG(col, offset, default)
| | +-- NO: Next row?
| | |-- YES --> LEAD(col, offset, default)
| | +-- NO: First/last in partition?
| | |-- FIRST --> FIRST_VALUE(col)
| | +-- LAST --> LAST_VALUE(col) + explicit ROWS frame
| +-- NO |
|-- Need a running/cumulative calculation?
| |-- YES: Running total?
| | |-- YES --> SUM(col) OVER (ORDER BY ... ROWS UNBOUNDED PRECEDING)
| | +-- NO: Sliding window (e.g., 3-row moving avg)?
| | |-- YES --> AVG(col) OVER (ORDER BY ... ROWS BETWEEN N PRECEDING AND CURRENT ROW)
| | +-- NO --> COUNT/MIN/MAX OVER (ORDER BY ... ROWS ...)
| +-- NO |
|-- Need distribution/percentile analysis?
| |-- YES: Split into equal buckets?
| | |-- YES --> NTILE(n)
| | +-- NO --> PERCENT_RANK() or CUME_DIST()
| +-- NO |
+-- DEFAULT --> Check if GROUP BY with standard aggregates suffices
Step-by-Step Guide
1. Understand the OVER() clause anatomy
Every window function follows the pattern function() OVER (window_specification). The window specification has three optional parts: PARTITION BY (groups rows), ORDER BY (sorts within partitions), and frame clause (limits which rows the function sees). [src1]
-- Anatomy of a window function call
SELECT
employee_id,
department,
salary,
function_name(args) OVER (
PARTITION BY department -- (1) group rows
ORDER BY salary DESC -- (2) sort within groups
ROWS BETWEEN UNBOUNDED PRECEDING -- (3) frame: which rows to include
AND CURRENT ROW
) AS result
FROM employees;
Verify: Run EXPLAIN ANALYZE on your query -- look for "WindowAgg" nodes in the plan.
2. Choose the right ranking function
ROW_NUMBER gives unique sequential integers. RANK gives the same number to ties but skips. DENSE_RANK gives the same number to ties without skipping. [src2]
SELECT
name, score,
ROW_NUMBER() OVER (ORDER BY score DESC) AS row_num, -- 1,2,3,4
RANK() OVER (ORDER BY score DESC) AS rank_val, -- 1,2,2,4 (gap after tie)
DENSE_RANK() OVER (ORDER BY score DESC) AS dense_val -- 1,2,2,3 (no gap)
FROM students;
Verify: Insert test data with ties and confirm: SELECT * FROM (above query) WHERE rank_val != dense_val; returns rows with ties.
3. Build running totals with explicit frame
Always specify the frame clause for cumulative aggregates. Without it, the default RANGE frame treats peers (same ORDER BY value) as one unit, which can give unexpected jumps. [src4]
SELECT
order_date,
amount,
SUM(amount) OVER (
ORDER BY order_date
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS running_total
FROM orders;
Verify: The last row's running_total should equal SELECT SUM(amount) FROM orders;
4. Use LAG/LEAD for row-to-row comparisons
LAG looks backward, LEAD looks forward. Both accept an offset (default 1) and a default value for when no row exists. [src5]
SELECT
month,
revenue,
LAG(revenue, 1, 0) OVER (ORDER BY month) AS prev_month,
LEAD(revenue, 1, 0) OVER (ORDER BY month) AS next_month,
revenue - LAG(revenue, 1) OVER (ORDER BY month) AS month_over_month_change
FROM monthly_revenue;
Verify: First row's prev_month should be your default (0). Last row's next_month should be your default (0).
5. Define named windows for reuse
When multiple window functions share the same PARTITION BY and ORDER BY, define a named window to avoid repetition and ensure consistency. [src2]
SELECT
department,
employee,
salary,
RANK() OVER w AS dept_rank,
SUM(salary) OVER w AS dept_running_total,
LAG(salary) OVER w AS prev_salary
FROM employees
WINDOW w AS (PARTITION BY department ORDER BY salary DESC);
Verify: Check that all three columns use the same partitioning: EXPLAIN should show a single sort operation.
6. Create a POC index for performance
For production queries, create an index that matches (P)ARTITION BY columns, then (O)RDER BY columns, then (c)overing columns as INCLUDE. This eliminates sorts and enables index-only scans. [src7]
-- POC index for: OVER (PARTITION BY department ORDER BY salary DESC)
CREATE INDEX idx_emp_window
ON employees (department, salary DESC)
INCLUDE (employee_id, name);
Verify: Run EXPLAIN ANALYZE and confirm no "Sort" node appears before "WindowAgg".
Code Examples
SQL: Running Total with Partitioned Reset
-- Input: transactions table (account_id, txn_date, amount)
-- Output: running balance per account, ordered by date
SELECT
account_id,
txn_date,
amount,
SUM(amount) OVER (
PARTITION BY account_id
ORDER BY txn_date, txn_id -- txn_id breaks date ties deterministically
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS running_balance
FROM transactions
ORDER BY account_id, txn_date;
SQL: Top-N per Group (Deduplication Pattern)
-- Input: products table (category, product_name, rating, review_count)
-- Output: top 3 products per category by rating
WITH ranked AS (
SELECT
category,
product_name,
rating,
review_count,
ROW_NUMBER() OVER (
PARTITION BY category
ORDER BY rating DESC, review_count DESC
) AS rn
FROM products
)
SELECT category, product_name, rating, review_count
FROM ranked
WHERE rn <= 3
ORDER BY category, rn;
SQL: Gap Analysis with LAG/LEAD
-- Input: sensor_readings table (sensor_id, reading_time, value)
-- Output: detect gaps > 5 minutes between consecutive readings
WITH gaps AS (
SELECT
sensor_id,
reading_time,
LAG(reading_time) OVER (
PARTITION BY sensor_id
ORDER BY reading_time
) AS prev_time,
reading_time - LAG(reading_time) OVER (
PARTITION BY sensor_id
ORDER BY reading_time
) AS gap_interval
FROM sensor_readings
)
SELECT sensor_id, prev_time, reading_time, gap_interval
FROM gaps
WHERE gap_interval > INTERVAL '5 minutes'
ORDER BY gap_interval DESC;
SQL: Moving Average and Cumulative Percentage
-- Input: daily_sales table (sale_date, region, amount)
-- Output: 7-day moving average and cumulative % of total per region
SELECT
sale_date,
region,
amount,
ROUND(AVG(amount) OVER (
PARTITION BY region
ORDER BY sale_date
ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
), 2) AS moving_avg_7d,
ROUND(
SUM(amount) OVER (
PARTITION BY region
ORDER BY sale_date
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) * 100.0 /
SUM(amount) OVER (PARTITION BY region),
2) AS cumulative_pct
FROM daily_sales
ORDER BY region, sale_date;
Anti-Patterns
Wrong: Using window function result in WHERE clause
-- BAD -- window functions cannot be used in WHERE
SELECT employee_id, department, salary,
ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS rn
FROM employees
WHERE rn <= 3; -- ERROR: column "rn" does not exist
Correct: Wrap in CTE or subquery to filter
-- GOOD -- filter window function results via CTE
WITH ranked AS (
SELECT employee_id, department, salary,
ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS rn
FROM employees
)
SELECT employee_id, department, salary
FROM ranked
WHERE rn <= 3;
Wrong: Relying on default frame with LAST_VALUE
-- BAD -- default frame with ORDER BY stops at current row
SELECT department, employee, salary,
LAST_VALUE(employee) OVER (
PARTITION BY department ORDER BY salary
) AS highest_paid -- returns CURRENT row's employee, not the last!
FROM employees;
Correct: Specify full frame for LAST_VALUE
-- GOOD -- explicit frame includes all rows in partition
SELECT department, employee, salary,
LAST_VALUE(employee) OVER (
PARTITION BY department ORDER BY salary
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
) AS highest_paid
FROM employees;
Wrong: Non-deterministic ROW_NUMBER for deduplication
-- BAD -- ties produce arbitrary, non-reproducible numbering
SELECT *, ROW_NUMBER() OVER (PARTITION BY email ORDER BY created_at) AS rn
FROM users;
-- Two users with same email AND same created_at: which gets rn=1 is random
Correct: Add tiebreaker column to ORDER BY
-- GOOD -- primary key as tiebreaker ensures deterministic results
SELECT *, ROW_NUMBER() OVER (PARTITION BY email ORDER BY created_at, user_id) AS rn
FROM users;
Wrong: Mixing ROWS and RANGE without understanding
-- BAD -- RANGE treats peers as one unit; three rows with same date
-- get the sum of ALL three, not a progressive running total
SELECT order_date, amount,
SUM(amount) OVER (ORDER BY order_date) AS running_total
-- implicit: RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
FROM orders;
-- If 3 rows share order_date='2026-01-15', all 3 show the same running_total
Correct: Use ROWS for progressive cumulative sums
-- GOOD -- ROWS treats each physical row independently
SELECT order_date, amount,
SUM(amount) OVER (
ORDER BY order_date
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS running_total
FROM orders;
-- Each row increments independently even with duplicate dates
Wrong: Unnecessary self-join instead of window function
-- BAD -- self-join to get previous row is slow and fragile
SELECT a.id, a.value,
b.value AS prev_value
FROM measurements a
LEFT JOIN measurements b
ON b.id = (SELECT MAX(id) FROM measurements WHERE id < a.id);
Correct: Use LAG for previous row access
-- GOOD -- LAG is purpose-built for this, faster, and clearer
SELECT id, value,
LAG(value) OVER (ORDER BY id) AS prev_value
FROM measurements;
Common Pitfalls
- Default frame surprise: When ORDER BY is present inside OVER(), the default frame becomes
RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. This silently turns SUM() into a cumulative sum instead of a partition total. Fix: always specify an explicit frame clause for aggregate window functions, or omit ORDER BY if you want the full partition. [src4] - LAST_VALUE returns current row: With the default frame, LAST_VALUE() sees only up to the current row, so it returns the current row's value. Fix:
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING. [src1] - NULLs in PARTITION BY: NULLs are grouped together in a single partition (SQL standard behavior). If this is wrong for your use case, filter NULLs out before the window function or use COALESCE. Fix:
PARTITION BY COALESCE(group_col, 'unknown'). [src3] - ORDER BY in OVER vs final ORDER BY: The ORDER BY inside OVER() only determines the logical ordering for the window calculation -- it does NOT guarantee the output row order. Fix: always add a final
ORDER BYclause to the outer query. [src2] - Performance: missing POC index: Without a matching index, every window function triggers a full sort. On large tables this can cause spills to disk. Fix: create a composite index matching (PARTITION BY cols, ORDER BY cols) with INCLUDE for covered columns. [src7]
- MySQL 5.7 compatibility: Window functions do not exist in MySQL 5.7 or earlier. Queries using OVER() will fail with a syntax error. Fix: upgrade to MySQL 8.0+ or rewrite using variables (
@row_number := @row_number + 1). [src3] - NTILE bucket imbalance: NTILE(n) distributes rows as evenly as possible, but when row count is not divisible by n, earlier buckets get one extra row. This is by design but can surprise users expecting exact quartiles. Fix: use PERCENT_RANK() or CUME_DIST() for exact percentile calculations. [src1]
- Multiple different windows cause multiple sorts: Each distinct OVER() specification may require a separate sort pass. Fix: reuse named windows (
WINDOW w AS (...)) and align PARTITION BY/ORDER BY across functions when possible. [src7]
Version History & Compatibility
| Database | First Version | Status | Notes |
|---|---|---|---|
| PostgreSQL | 8.4 (2009) | Full support since 8.4+ | Named windows, GROUPS frame (v11+), FILTER (v9.4+) |
| MySQL | 8.0 (2018) | Full support since 8.0+ | No GROUPS frame clause; no FILTER clause |
| MariaDB | 10.2 (2017) | Full support since 10.2+ | Follows MySQL syntax closely |
| SQL Server | 2005 | Full support since 2012+ | 2005-2008 limited to ROW_NUMBER/RANK/DENSE_RANK; full frames in 2012+ |
| Oracle | 8i (1999) | Full support since 8i+ | Pioneer of analytic functions; broadest syntax |
| SQLite | 3.25 (2018) | Full support since 3.25+ | All standard window functions supported |
| BigQuery | GA | Full support | Standard SQL syntax; QUALIFY clause for filtering |
| Snowflake | GA | Full support | QUALIFY clause; named windows; GROUPS frame |
When to Use / When Not to Use
| Use When | Don't Use When | Use Instead |
|---|---|---|
| Need per-row calculations alongside group-level aggregates | Simple aggregation per group is enough | GROUP BY with aggregate functions |
| Top-N per group / deduplication | Top-N of entire table (no partitioning needed) | ORDER BY + LIMIT |
| Running totals, moving averages, cumulative sums | Single total or average across all rows | Plain SUM()/AVG() aggregate |
| Comparing current row to previous/next row (gap detection) | Comparing to a fixed reference value | WHERE clause with constant/subquery |
| Ranking within categories (leaderboards) | Global rank with simple sort | ORDER BY + ROW_NUMBER in app code |
| Percentile/distribution analysis | Exact median (window PERCENTILE_CONT is non-standard) | Database-specific PERCENTILE_CONT() aggregate |
| Need both detail rows and aggregates in one query | Only need aggregate rows | GROUP BY is cleaner and faster |
Important Caveats
- Window functions are computed after WHERE, JOIN, GROUP BY, and HAVING. If you need to filter on a window function result, you must use a CTE or subquery -- there is no QUALIFY clause in standard SQL (BigQuery and Snowflake support it as an extension).
- The RANGE frame mode operates on values (not physical row positions). It only works with numeric and date/timestamp ORDER BY columns. ROWS operates on physical row positions and works universally.
- NTH_VALUE is not supported in SQL Server. Use a combination of ROW_NUMBER + CTE as a workaround.
- Performance of window functions degrades with partition size. If partitions contain millions of rows, consider pre-aggregating or limiting the frame with ROWS BETWEEN N PRECEDING AND CURRENT ROW.
- SQLite supports window functions but lacks some advanced features like GROUPS frame mode and FILTER clause.