topological ordering algorithm

- Bottom line: Topological sort produces a linear ordering of vertices in a Directed Acyclic Graph (DAG) such that for every edge (u, v), vertex u appears before v -- essential for dependency resolution, build systems, and task scheduling.

Kahn's algorithm BFS topological sort

- Bottom line: Topological sort produces a linear ordering of vertices in a Directed Acyclic Graph (DAG) such that for every edge (u, v), vertex u appears before v -- essential for dependency resolution, build systems, and task scheduling.

DFS topological sort implementation

- Bottom line: Topological sort produces a linear ordering of vertices in a Directed Acyclic Graph (DAG) such that for every edge (u, v), vertex u appears before v -- essential for dependency resolution, build systems, and task scheduling.

dependency graph ordering

- Bottom line: Topological sort produces a linear ordering of vertices in a Directed Acyclic Graph (DAG) such that for every edge (u, v), vertex u appears before v -- essential for dependency resolution, build systems, and task scheduling.

Topological Sort: Linear Ordering of DAG Vertices

How do I implement topological sort and when to use it?

TL;DR

Bottom line: Topological sort produces a linear ordering of vertices in a Directed Acyclic Graph (DAG) such that for every edge (u, v), vertex u appears before v -- essential for dependency resolution, build systems, and task scheduling.
Key tool/command: Kahn's algorithm (BFS with in-degree tracking) or DFS with post-order reversal.
Watch out for: Applying topological sort to a graph with cycles -- always detect cycles first or use Kahn's algorithm which detects them automatically (sorted count != vertex count).
Works with: Any directed acyclic graph; language-agnostic algorithm (O(V+E) time, O(V) space).

Constraints

Input graph MUST be a Directed Acyclic Graph (DAG); topological sort is undefined for graphs with cycles
A DAG may have multiple valid topological orderings; algorithms return one valid ordering unless explicitly enumerating all
Always run cycle detection before or during topological sort; silently ignoring cycles produces incorrect orderings
All-orderings enumeration is O(n! * V) worst case; never enumerate on graphs with more than ~20 vertices without pruning

Quick Reference

Algorithm	Time	Space	Cycle Detection	All Orderings	Approach
Kahn's (BFS)	O(V+E)	O(V)	Yes (count != V)	No (one ordering)	In-degree queue; remove zero-indegree nodes iteratively
DFS-based	O(V+E)	O(V)	Yes (back-edge / color)	No (one ordering)	Post-order traversal, reverse the result
DFS + backtracking	O(V! * V)	O(V)	Yes	Yes (all orderings)	Enumerate all valid orderings via backtracking
Parallel Kahn's	O(V+E) work, O(D) depth	O(V)	Yes	No	Process all zero-indegree nodes in parallel per level
Coffman-Graham	O(V+E)	O(V)	Yes	No	Kahn's variant with lexicographic tie-breaking for scheduling

Decision Tree

START
├── Is the graph guaranteed to be a DAG (no cycles)?
│   ├── NO → Use Kahn's algorithm (detects cycles when sorted_count != V)
│   └── YES ↓
├── Do you need to identify parallelizable groups (levels)?
│   ├── YES → Use Kahn's algorithm with level tracking
│   └── NO ↓
├── Do you need ALL valid topological orderings?
│   ├── YES → Use DFS + backtracking (only for small graphs, V <= 20)
│   └── NO ↓
├── Is the implementation language naturally recursive?
│   ├── YES → Use DFS-based topological sort (simpler code)
│   └── NO → Use Kahn's algorithm (iterative, no stack overflow risk)
└── DEFAULT → Kahn's algorithm (most versatile, built-in cycle detection)

Step-by-Step Guide

1. Build the adjacency list and compute in-degrees

Create an adjacency list representation and count incoming edges for each vertex. [src3]

from collections import deque, defaultdict

def build_graph(num_vertices, edges):
    adj = defaultdict(list)
    in_degree = [0] * num_vertices
    for u, v in edges:
        adj[u].append(v)
        in_degree[v] += 1
    return adj, in_degree

Verify: assert sum(in_degree) == len(edges) -- in-degree sum equals total edge count.

2. Initialize queue with zero-indegree vertices

All vertices with in-degree 0 have no dependencies and can appear first. [src3]

queue = deque(v for v in range(num_vertices) if in_degree[v] == 0)

Verify: If queue is empty but graph has vertices, the graph contains a cycle.

3. Process vertices in BFS order (Kahn's core loop)

Dequeue a vertex, add it to the result, and decrement in-degrees of neighbors. Enqueue neighbors that reach in-degree 0. [src1]

result = []
while queue:
    node = queue.popleft()
    result.append(node)
    for neighbor in adj[node]:
        in_degree[neighbor] -= 1
        if in_degree[neighbor] == 0:
            queue.append(neighbor)

Verify: len(result) == num_vertices -- if not, cycle exists.

4. DFS-based alternative

Visit each unvisited vertex, recurse into neighbors, push vertex after all neighbors processed, then reverse. Uses 3-color marking for cycle detection. [src4]

WHITE, GRAY, BLACK = 0, 1, 2
color = [WHITE] * num_vertices
order = []

def dfs(node):
    color[node] = GRAY
    for neighbor in adj[node]:
        if color[neighbor] == GRAY:
            raise ValueError("Cycle detected")
        if color[neighbor] == WHITE:
            dfs(neighbor)
    color[node] = BLACK
    order.append(node)

for v in range(num_vertices):
    if color[v] == WHITE:
        dfs(v)
order.reverse()

Verify: No ValueError raised and len(order) == num_vertices.

Code Examples

Python: Course Schedule (LeetCode 210 pattern)

# Input:  numCourses=4, prerequisites=[[1,0],[2,0],[3,1],[3,2]]
# Output: [0, 1, 2, 3] or [0, 2, 1, 3] (any valid ordering)

from collections import deque, defaultdict

def find_course_order(num_courses, prerequisites):
    adj = defaultdict(list)
    in_degree = [0] * num_courses
    for course, prereq in prerequisites:
        adj[prereq].append(course)
        in_degree[course] += 1

    queue = deque(c for c in range(num_courses) if in_degree[c] == 0)
    order = []

    while queue:
        course = queue.popleft()
        order.append(course)
        for next_course in adj[course]:
            in_degree[next_course] -= 1
            if in_degree[next_course] == 0:
                queue.append(next_course)

    return order if len(order) == num_courses else []  # empty = cycle

JavaScript: Build Order / Task Scheduling

// Input:  tasks = ["a","b","c","d"], deps = [["a","b"],["a","c"],["b","d"]]
// Output: ["a", "b", "c", "d"] or ["a", "c", "b", "d"]

function buildOrder(tasks, deps) {
  const adj = new Map();
  const inDeg = new Map();
  for (const t of tasks) { adj.set(t, []); inDeg.set(t, 0); }

  for (const [pre, dep] of deps) {
    adj.get(pre).push(dep);
    inDeg.set(dep, inDeg.get(dep) + 1);
  }

  const queue = tasks.filter(t => inDeg.get(t) === 0);
  const order = [];
  let i = 0;

  while (i < queue.length) {
    const node = queue[i++];
    order.push(node);
    for (const neighbor of adj.get(node)) {
      inDeg.set(neighbor, inDeg.get(neighbor) - 1);
      if (inDeg.get(neighbor) === 0) queue.push(neighbor);
    }
  }

  if (order.length !== tasks.length) throw new Error("Cycle detected");
  return order;
}

Java: Dependency Resolution with Parallel Level Detection

// Input:  numNodes=6, edges={{5,2},{5,0},{4,0},{4,1},{2,3},{3,1}}
// Output: levels [[4,5],[0,2],[3],[1]] -- nodes at each level can run in parallel

import java.util.*;

public class ParallelTopSort {
    public static List<List<Integer>> topSortLevels(int n, int[][] edges) {
        List<List<Integer>> adj = new ArrayList<>();
        int[] inDeg = new int[n];
        for (int i = 0; i < n; i++) adj.add(new ArrayList<>());

        for (int[] e : edges) {
            adj.get(e[0]).add(e[1]);
            inDeg[e[1]]++;
        }

        Queue<Integer> queue = new LinkedList<>();
        for (int i = 0; i < n; i++)
            if (inDeg[i] == 0) queue.add(i);

        List<List<Integer>> levels = new ArrayList<>();
        int processed = 0;

        while (!queue.isEmpty()) {
            List<Integer> level = new ArrayList<>();
            int size = queue.size();
            for (int i = 0; i < size; i++) {
                int node = queue.poll();
                level.add(node);
                processed++;
                for (int nb : adj.get(node)) {
                    if (--inDeg[nb] == 0) queue.add(nb);
                }
            }
            levels.add(level);
        }

        if (processed != n) throw new RuntimeException("Cycle detected");
        return levels;
    }
}

Anti-Patterns

Wrong: No cycle detection in Kahn's algorithm

# BAD -- silently returns partial ordering on cyclic graphs
def naive_topo_sort(n, edges):
    adj, in_degree = build_graph(n, edges)
    queue = deque(v for v in range(n) if in_degree[v] == 0)
    result = []
    while queue:
        node = queue.popleft()
        result.append(node)
        for nb in adj[node]:
            in_degree[nb] -= 1
            if in_degree[nb] == 0:
                queue.append(nb)
    return result  # May return partial list without warning!

Correct: Verify sorted count equals vertex count

# GOOD -- detects cycles by checking result length
def safe_topo_sort(n, edges):
    adj, in_degree = build_graph(n, edges)
    queue = deque(v for v in range(n) if in_degree[v] == 0)
    result = []
    while queue:
        node = queue.popleft()
        result.append(node)
        for nb in adj[node]:
            in_degree[nb] -= 1
            if in_degree[nb] == 0:
                queue.append(nb)
    if len(result) != n:
        raise ValueError("Graph contains a cycle")
    return result

Wrong: Incorrect in-degree computation

# BAD -- counting both directions inflates in-degrees
def broken_indegree(n, edges):
    in_degree = [0] * n
    for u, v in edges:
        in_degree[v] += 1
        in_degree[u] += 1  # Wrong! u is source, not destination
    return in_degree

Correct: Only increment destination vertex in-degree

# GOOD -- only the destination vertex gains an incoming edge
def correct_indegree(n, edges):
    in_degree = [0] * n
    for u, v in edges:
        in_degree[v] += 1  # Edge u -> v means v has one more incoming edge
    return in_degree

Wrong: DFS without 3-color marking (misses cycles)

# BAD -- simple visited set cannot distinguish back-edges from cross-edges
def broken_dfs_topo(n, adj):
    visited = set()
    order = []
    def dfs(node):
        visited.add(node)
        for nb in adj[node]:
            if nb not in visited:
                dfs(nb)
        order.append(node)
    for v in range(n):
        if v not in visited:
            dfs(v)
    order.reverse()
    return order  # No cycle detection at all!

Correct: Use WHITE/GRAY/BLACK coloring for back-edge detection

# GOOD -- GRAY node in DFS stack means back-edge (cycle)
def correct_dfs_topo(n, adj):
    WHITE, GRAY, BLACK = 0, 1, 2
    color = [WHITE] * n
    order = []
    def dfs(node):
        color[node] = GRAY
        for nb in adj[node]:
            if color[nb] == GRAY:
                raise ValueError("Cycle: back edge found")
            if color[nb] == WHITE:
                dfs(nb)
        color[node] = BLACK
        order.append(node)
    for v in range(n):
        if color[v] == WHITE:
            dfs(v)
    order.reverse()
    return order

Common Pitfalls

Not handling disconnected components: Both Kahn's and DFS must iterate over ALL vertices, not just start from vertex 0. Fix: iterate for v in range(V) in DFS; Kahn's handles this naturally. [src2]
Confusing edge direction in prerequisites: In course schedule problems, [course, prereq] means prereq -> course. Reversing direction produces wrong orderings. Fix: read edge semantics carefully. [src5]
Stack overflow on large graphs with DFS: Recursive DFS hits Python's default 1000-recursion limit. Fix: use iterative DFS with explicit stack, or use Kahn's algorithm. [src4]
Assuming unique topological ordering: A DAG may have many valid orderings. Fix: use heapq or priority queue for deterministic (lexicographically smallest) output. [src1]
Mutating the input graph: Some implementations modify in-degree arrays in place. Fix: work on copies; never remove edges from the original adjacency list. [src3]
Applying to undirected graphs: Every undirected edge creates a "cycle" between its endpoints. Fix: verify graph is directed before attempting topological sort. [src1]

When to Use / When Not to Use

Use When	Don't Use When	Use Instead
Resolving build/compilation dependencies (Makefile, Gradle, Bazel)	Graph has cycles	Strongly connected components (Tarjan/Kosaraju)
Course prerequisite scheduling	Need shortest path in weighted graph	Dijkstra or Bellman-Ford
Task scheduling with dependency constraints	Need to find connected components	Union-Find / BFS/DFS connected components
Package manager install ordering (npm, pip, apt)	Graph is undirected	BFS/DFS traversal
Spreadsheet formula evaluation order	Need maximum flow or matching	Ford-Fulkerson or Hopcroft-Karp
Determining parallelizable task levels	Need all-pairs shortest paths	Floyd-Warshall
Data pipeline / ETL stage ordering	Ordering doesn't need to respect edge direction	Simple BFS/DFS traversal

Important Caveats

Topological sort is a fundamental algorithm dating to Kahn (1962); the algorithm itself is evergreen and will not change, but language-specific implementations may vary in edge cases (e.g., Python recursion limits, Java stream vs. imperative style)
For extremely large graphs (millions of vertices), consider parallel variants of Kahn's algorithm that process entire levels concurrently
The DFS-based approach and Kahn's algorithm produce different (but both valid) orderings for the same graph; do not assume they will match
In competitive programming, topological sort is a prerequisite for DAG dynamic programming (e.g., longest path in DAG, number of paths)