SQL window functions tutorial

- Bottom line: Window functions compute values across related rows without collapsing them -- use function() OVER (PARTITION BY ... ORDER BY ...) for rankings, running totals, and row comparisons.

SQL OVER PARTITION BY ORDER BY

- Bottom line: Window functions compute values across related rows without collapsing them -- use function() OVER (PARTITION BY ... ORDER BY ...) for rankings, running totals, and row comparisons.

SQL ROW_NUMBER RANK DENSE_RANK examples

- Bottom line: Window functions compute values across related rows without collapsing them -- use function() OVER (PARTITION BY ... ORDER BY ...) for rankings, running totals, and row comparisons.

SQL LAG LEAD running total

- Bottom line: Window functions compute values across related rows without collapsing them -- use function() OVER (PARTITION BY ... ORDER BY ...) for rankings, running totals, and row comparisons.

SQL analytic functions guide

- Bottom line: Window functions compute values across related rows without collapsing them -- use function() OVER (PARTITION BY ... ORDER BY ...) for rankings, running totals, and row comparisons.

SQL Window Functions: Running Totals, Rankings, Lag/Lead

How do I use SQL window functions (running totals, rankings, lag/lead)?

TL;DR

Bottom line: Window functions compute values across related rows without collapsing them -- use function() OVER (PARTITION BY ... ORDER BY ...) for rankings, running totals, and row comparisons.
Key tool/command: ROW_NUMBER() OVER (PARTITION BY group_col ORDER BY sort_col)
Watch out for: Default frame with ORDER BY is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, which silently gives cumulative (not full-partition) results for SUM/AVG.
Works with: PostgreSQL 8.4+, MySQL 8.0+, SQL Server 2005+, Oracle 8i+, SQLite 3.25+, MariaDB 10.2+.

Constraints

Window functions can only appear in SELECT and ORDER BY clauses -- never in WHERE, GROUP BY, or HAVING. To filter on a window function result, wrap it in a CTE or subquery.
MySQL requires 8.0+ for window function support; MariaDB requires 10.2+. Earlier versions must use self-joins or variables.
LAST_VALUE with the default frame returns the current row's value, not the partition's last row. Always specify ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING.
Window functions execute after WHERE, GROUP BY, and HAVING in SQL's logical processing order. They see the filtered/grouped result set, not raw table data.
Adding ORDER BY inside OVER() silently changes the default frame from the entire partition to RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. This causes SUM/AVG to return cumulative, not total, results.

Quick Reference

Scenario	Function	Syntax	Frame Needed	Trade-off
Unique row numbering	`ROW_NUMBER()`	`ROW_NUMBER() OVER (ORDER BY col)`	No	Arbitrary tiebreaker for duplicates
Ranking with gaps	`RANK()`	`RANK() OVER (ORDER BY col DESC)`	No	Tied rows get same rank, next rank skipped
Ranking without gaps	`DENSE_RANK()`	`DENSE_RANK() OVER (ORDER BY col DESC)`	No	Tied rows get same rank, no gap
Running total	`SUM() OVER`	`SUM(amt) OVER (ORDER BY dt ROWS UNBOUNDED PRECEDING)`	Yes	Must specify ROWS to avoid RANGE default peer behavior
Running average	`AVG() OVER`	`AVG(amt) OVER (ORDER BY dt ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)`	Yes	Sliding window -- adjust N for smoothing
Previous row value	`LAG()`	`LAG(col, 1, default) OVER (ORDER BY dt)`	No	Returns NULL if no prior row; set default
Next row value	`LEAD()`	`LEAD(col, 1, default) OVER (ORDER BY dt)`	No	Returns NULL if no following row; set default
First value in partition	`FIRST_VALUE()`	`FIRST_VALUE(col) OVER (PARTITION BY grp ORDER BY dt)`	Implicit	Default frame is fine for first value
Last value in partition	`LAST_VALUE()`	`LAST_VALUE(col) OVER (... ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)`	Yes, always	Default frame WRONG -- must override
Nth value in partition	`NTH_VALUE()`	`NTH_VALUE(col, 2) OVER (ORDER BY dt ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)`	Yes	Not available in SQL Server; use OFFSET/FETCH
Split into N buckets	`NTILE()`	`NTILE(4) OVER (ORDER BY score DESC)`	No	Uneven buckets when row count not divisible by N
Percentile rank	`PERCENT_RANK()`	`PERCENT_RANK() OVER (ORDER BY score)`	No	Range 0.0-1.0; first row always 0
Cumulative distribution	`CUME_DIST()`	`CUME_DIST() OVER (ORDER BY score)`	No	Range >0.0-1.0; fraction of rows <= current
Deduplicate rows	`ROW_NUMBER()` filter	`WHERE rn = 1` (in CTE)	No	Pick one row per group; deterministic ORDER BY needed

Decision Tree

START
|-- Need to assign a position/number to each row?
|   |-- YES: Need unique numbers (no ties)?
|   |   |-- YES --> ROW_NUMBER()
|   |   +-- NO: Need gaps after ties?
|   |       |-- YES --> RANK()
|   |       +-- NO --> DENSE_RANK()
|   +-- NO |
|-- Need to access a different row's value?
|   |-- YES: Previous row?
|   |   |-- YES --> LAG(col, offset, default)
|   |   +-- NO: Next row?
|   |       |-- YES --> LEAD(col, offset, default)
|   |       +-- NO: First/last in partition?
|   |           |-- FIRST --> FIRST_VALUE(col)
|   |           +-- LAST --> LAST_VALUE(col) + explicit ROWS frame
|   +-- NO |
|-- Need a running/cumulative calculation?
|   |-- YES: Running total?
|   |   |-- YES --> SUM(col) OVER (ORDER BY ... ROWS UNBOUNDED PRECEDING)
|   |   +-- NO: Sliding window (e.g., 3-row moving avg)?
|   |       |-- YES --> AVG(col) OVER (ORDER BY ... ROWS BETWEEN N PRECEDING AND CURRENT ROW)
|   |       +-- NO --> COUNT/MIN/MAX OVER (ORDER BY ... ROWS ...)
|   +-- NO |
|-- Need distribution/percentile analysis?
|   |-- YES: Split into equal buckets?
|   |   |-- YES --> NTILE(n)
|   |   +-- NO --> PERCENT_RANK() or CUME_DIST()
|   +-- NO |
+-- DEFAULT --> Check if GROUP BY with standard aggregates suffices

Step-by-Step Guide

1. Understand the OVER() clause anatomy

Every window function follows the pattern function() OVER (window_specification). The window specification has three optional parts: PARTITION BY (groups rows), ORDER BY (sorts within partitions), and frame clause (limits which rows the function sees). [src1]

-- Anatomy of a window function call
SELECT
    employee_id,
    department,
    salary,
    function_name(args) OVER (
        PARTITION BY department          -- (1) group rows
        ORDER BY salary DESC             -- (2) sort within groups
        ROWS BETWEEN UNBOUNDED PRECEDING -- (3) frame: which rows to include
             AND CURRENT ROW
    ) AS result
FROM employees;

Verify: Run EXPLAIN ANALYZE on your query -- look for "WindowAgg" nodes in the plan.

2. Choose the right ranking function

ROW_NUMBER gives unique sequential integers. RANK gives the same number to ties but skips. DENSE_RANK gives the same number to ties without skipping. [src2]

SELECT
    name, score,
    ROW_NUMBER() OVER (ORDER BY score DESC) AS row_num,   -- 1,2,3,4
    RANK()       OVER (ORDER BY score DESC) AS rank_val,   -- 1,2,2,4  (gap after tie)
    DENSE_RANK() OVER (ORDER BY score DESC) AS dense_val   -- 1,2,2,3  (no gap)
FROM students;

Verify: Insert test data with ties and confirm: SELECT * FROM (above query) WHERE rank_val != dense_val; returns rows with ties.

3. Build running totals with explicit frame

Always specify the frame clause for cumulative aggregates. Without it, the default RANGE frame treats peers (same ORDER BY value) as one unit, which can give unexpected jumps. [src4]

SELECT
    order_date,
    amount,
    SUM(amount) OVER (
        ORDER BY order_date
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS running_total
FROM orders;

Verify: The last row's running_total should equal SELECT SUM(amount) FROM orders;

4. Use LAG/LEAD for row-to-row comparisons

LAG looks backward, LEAD looks forward. Both accept an offset (default 1) and a default value for when no row exists. [src5]

SELECT
    month,
    revenue,
    LAG(revenue, 1, 0)  OVER (ORDER BY month) AS prev_month,
    LEAD(revenue, 1, 0) OVER (ORDER BY month) AS next_month,
    revenue - LAG(revenue, 1) OVER (ORDER BY month) AS month_over_month_change
FROM monthly_revenue;

Verify: First row's prev_month should be your default (0). Last row's next_month should be your default (0).

5. Define named windows for reuse

When multiple window functions share the same PARTITION BY and ORDER BY, define a named window to avoid repetition and ensure consistency. [src2]

SELECT
    department,
    employee,
    salary,
    RANK()       OVER w AS dept_rank,
    SUM(salary)  OVER w AS dept_running_total,
    LAG(salary)  OVER w AS prev_salary
FROM employees
WINDOW w AS (PARTITION BY department ORDER BY salary DESC);

Verify: Check that all three columns use the same partitioning: EXPLAIN should show a single sort operation.

6. Create a POC index for performance

For production queries, create an index that matches (P)ARTITION BY columns, then (O)RDER BY columns, then (c)overing columns as INCLUDE. This eliminates sorts and enables index-only scans. [src7]

-- POC index for: OVER (PARTITION BY department ORDER BY salary DESC)
CREATE INDEX idx_emp_window
    ON employees (department, salary DESC)
    INCLUDE (employee_id, name);

Verify: Run EXPLAIN ANALYZE and confirm no "Sort" node appears before "WindowAgg".

Code Examples

SQL: Running Total with Partitioned Reset

-- Input:  transactions table (account_id, txn_date, amount)
-- Output: running balance per account, ordered by date

SELECT
    account_id,
    txn_date,
    amount,
    SUM(amount) OVER (
        PARTITION BY account_id
        ORDER BY txn_date, txn_id  -- txn_id breaks date ties deterministically
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS running_balance
FROM transactions
ORDER BY account_id, txn_date;

SQL: Top-N per Group (Deduplication Pattern)

-- Input:  products table (category, product_name, rating, review_count)
-- Output: top 3 products per category by rating

WITH ranked AS (
    SELECT
        category,
        product_name,
        rating,
        review_count,
        ROW_NUMBER() OVER (
            PARTITION BY category
            ORDER BY rating DESC, review_count DESC
        ) AS rn
    FROM products
)
SELECT category, product_name, rating, review_count
FROM ranked
WHERE rn <= 3
ORDER BY category, rn;

SQL: Gap Analysis with LAG/LEAD

-- Input:  sensor_readings table (sensor_id, reading_time, value)
-- Output: detect gaps > 5 minutes between consecutive readings

WITH gaps AS (
    SELECT
        sensor_id,
        reading_time,
        LAG(reading_time) OVER (
            PARTITION BY sensor_id
            ORDER BY reading_time
        ) AS prev_time,
        reading_time - LAG(reading_time) OVER (
            PARTITION BY sensor_id
            ORDER BY reading_time
        ) AS gap_interval
    FROM sensor_readings
)
SELECT sensor_id, prev_time, reading_time, gap_interval
FROM gaps
WHERE gap_interval > INTERVAL '5 minutes'
ORDER BY gap_interval DESC;

SQL: Moving Average and Cumulative Percentage

-- Input:  daily_sales table (sale_date, region, amount)
-- Output: 7-day moving average and cumulative % of total per region

SELECT
    sale_date,
    region,
    amount,
    ROUND(AVG(amount) OVER (
        PARTITION BY region
        ORDER BY sale_date
        ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
    ), 2) AS moving_avg_7d,
    ROUND(
        SUM(amount) OVER (
            PARTITION BY region
            ORDER BY sale_date
            ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
        ) * 100.0 /
        SUM(amount) OVER (PARTITION BY region),
    2) AS cumulative_pct
FROM daily_sales
ORDER BY region, sale_date;

Anti-Patterns

Wrong: Using window function result in WHERE clause

-- BAD -- window functions cannot be used in WHERE
SELECT employee_id, department, salary,
       ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS rn
FROM employees
WHERE rn <= 3;  -- ERROR: column "rn" does not exist

Correct: Wrap in CTE or subquery to filter

-- GOOD -- filter window function results via CTE
WITH ranked AS (
    SELECT employee_id, department, salary,
           ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS rn
    FROM employees
)
SELECT employee_id, department, salary
FROM ranked
WHERE rn <= 3;

Wrong: Relying on default frame with LAST_VALUE

-- BAD -- default frame with ORDER BY stops at current row
SELECT department, employee, salary,
       LAST_VALUE(employee) OVER (
           PARTITION BY department ORDER BY salary
       ) AS highest_paid  -- returns CURRENT row's employee, not the last!
FROM employees;

Correct: Specify full frame for LAST_VALUE

-- GOOD -- explicit frame includes all rows in partition
SELECT department, employee, salary,
       LAST_VALUE(employee) OVER (
           PARTITION BY department ORDER BY salary
           ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
       ) AS highest_paid
FROM employees;

Wrong: Non-deterministic ROW_NUMBER for deduplication

-- BAD -- ties produce arbitrary, non-reproducible numbering
SELECT *, ROW_NUMBER() OVER (PARTITION BY email ORDER BY created_at) AS rn
FROM users;
-- Two users with same email AND same created_at: which gets rn=1 is random

Correct: Add tiebreaker column to ORDER BY

-- GOOD -- primary key as tiebreaker ensures deterministic results
SELECT *, ROW_NUMBER() OVER (PARTITION BY email ORDER BY created_at, user_id) AS rn
FROM users;

Wrong: Mixing ROWS and RANGE without understanding

-- BAD -- RANGE treats peers as one unit; three rows with same date
-- get the sum of ALL three, not a progressive running total
SELECT order_date, amount,
       SUM(amount) OVER (ORDER BY order_date) AS running_total
       -- implicit: RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
FROM orders;
-- If 3 rows share order_date='2026-01-15', all 3 show the same running_total

Correct: Use ROWS for progressive cumulative sums

-- GOOD -- ROWS treats each physical row independently
SELECT order_date, amount,
       SUM(amount) OVER (
           ORDER BY order_date
           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
       ) AS running_total
FROM orders;
-- Each row increments independently even with duplicate dates

Wrong: Unnecessary self-join instead of window function

-- BAD -- self-join to get previous row is slow and fragile
SELECT a.id, a.value,
       b.value AS prev_value
FROM measurements a
LEFT JOIN measurements b
    ON b.id = (SELECT MAX(id) FROM measurements WHERE id < a.id);

Correct: Use LAG for previous row access

-- GOOD -- LAG is purpose-built for this, faster, and clearer
SELECT id, value,
       LAG(value) OVER (ORDER BY id) AS prev_value
FROM measurements;

Common Pitfalls

Default frame surprise: When ORDER BY is present inside OVER(), the default frame becomes RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. This silently turns SUM() into a cumulative sum instead of a partition total. Fix: always specify an explicit frame clause for aggregate window functions, or omit ORDER BY if you want the full partition. [src4]
LAST_VALUE returns current row: With the default frame, LAST_VALUE() sees only up to the current row, so it returns the current row's value. Fix: ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING. [src1]
NULLs in PARTITION BY: NULLs are grouped together in a single partition (SQL standard behavior). If this is wrong for your use case, filter NULLs out before the window function or use COALESCE. Fix: PARTITION BY COALESCE(group_col, 'unknown'). [src3]
ORDER BY in OVER vs final ORDER BY: The ORDER BY inside OVER() only determines the logical ordering for the window calculation -- it does NOT guarantee the output row order. Fix: always add a final ORDER BY clause to the outer query. [src2]
Performance: missing POC index: Without a matching index, every window function triggers a full sort. On large tables this can cause spills to disk. Fix: create a composite index matching (PARTITION BY cols, ORDER BY cols) with INCLUDE for covered columns. [src7]
MySQL 5.7 compatibility: Window functions do not exist in MySQL 5.7 or earlier. Queries using OVER() will fail with a syntax error. Fix: upgrade to MySQL 8.0+ or rewrite using variables (@row_number := @row_number + 1). [src3]
NTILE bucket imbalance: NTILE(n) distributes rows as evenly as possible, but when row count is not divisible by n, earlier buckets get one extra row. This is by design but can surprise users expecting exact quartiles. Fix: use PERCENT_RANK() or CUME_DIST() for exact percentile calculations. [src1]
Multiple different windows cause multiple sorts: Each distinct OVER() specification may require a separate sort pass. Fix: reuse named windows (WINDOW w AS (...)) and align PARTITION BY/ORDER BY across functions when possible. [src7]

Version History & Compatibility

Database	First Version	Status	Notes
PostgreSQL	8.4 (2009)	Full support since 8.4+	Named windows, GROUPS frame (v11+), FILTER (v9.4+)
MySQL	8.0 (2018)	Full support since 8.0+	No GROUPS frame clause; no FILTER clause
MariaDB	10.2 (2017)	Full support since 10.2+	Follows MySQL syntax closely
SQL Server	2005	Full support since 2012+	2005-2008 limited to ROW_NUMBER/RANK/DENSE_RANK; full frames in 2012+
Oracle	8i (1999)	Full support since 8i+	Pioneer of analytic functions; broadest syntax
SQLite	3.25 (2018)	Full support since 3.25+	All standard window functions supported
BigQuery	GA	Full support	Standard SQL syntax; QUALIFY clause for filtering
Snowflake	GA	Full support	QUALIFY clause; named windows; GROUPS frame

When to Use / When Not to Use

Use When	Don't Use When	Use Instead
Need per-row calculations alongside group-level aggregates	Simple aggregation per group is enough	GROUP BY with aggregate functions
Top-N per group / deduplication	Top-N of entire table (no partitioning needed)	ORDER BY + LIMIT
Running totals, moving averages, cumulative sums	Single total or average across all rows	Plain SUM()/AVG() aggregate
Comparing current row to previous/next row (gap detection)	Comparing to a fixed reference value	WHERE clause with constant/subquery
Ranking within categories (leaderboards)	Global rank with simple sort	ORDER BY + ROW_NUMBER in app code
Percentile/distribution analysis	Exact median (window PERCENTILE_CONT is non-standard)	Database-specific PERCENTILE_CONT() aggregate
Need both detail rows and aggregates in one query	Only need aggregate rows	GROUP BY is cleaner and faster

Important Caveats

Window functions are computed after WHERE, JOIN, GROUP BY, and HAVING. If you need to filter on a window function result, you must use a CTE or subquery -- there is no QUALIFY clause in standard SQL (BigQuery and Snowflake support it as an extension).
The RANGE frame mode operates on values (not physical row positions). It only works with numeric and date/timestamp ORDER BY columns. ROWS operates on physical row positions and works universally.
NTH_VALUE is not supported in SQL Server. Use a combination of ROW_NUMBER + CTE as a workaround.
Performance of window functions degrades with partition size. If partitions contain millions of rows, consider pre-aggregating or limiting the frame with ROWS BETWEEN N PRECEDING AND CURRENT ROW.
SQLite supports window functions but lacks some advanced features like GROUPS frame mode and FILTER clause.