Sorting Algorithms Compared: Complete Reference
What are the sorting algorithms compared (time, space, stability, when to use)?
TL;DR
- Bottom line: Use your language's stdlib sort (TimSort/IntroSort) for 99% of cases; only reach for specialized algorithms when you have domain-specific constraints like integer-only keys or external storage.
- Key tool/command:
sorted()(Python),Arrays.sort()(Java),Array.prototype.sort()(JS),sort.Slice()(Go) - Watch out for: Worst-case O(n²) with naive quicksort on sorted input; always use randomized pivot or IntroSort variant.
- Works with: All programming languages; algorithm properties are language-agnostic.
Constraints
- Comparison-based sorts provably cannot beat O(n log n) average-case time complexity (information-theoretic lower bound). [src1]
- Linear-time sorts (radix, counting, bucket) only apply when input satisfies specific conditions: bounded integer range, fixed-width keys, or uniform distribution. [src6]
- In-place sorting (O(1) extra space) and stability cannot always be achieved simultaneously -- choose one based on your requirements. [src1]
- Unstable sorts (quicksort, heapsort, introsort) may reorder equal elements -- never use them when multi-key sort order must be preserved. [src2]
- Naive quicksort degrades to O(n²) on already-sorted or reverse-sorted input; production implementations must use randomized pivots, median-of-three, or IntroSort. [src1]
Quick Reference
| Algorithm | Best | Average | Worst | Space | Stable | Adaptive | When to Use |
|---|---|---|---|---|---|---|---|
| QuickSort | O(n log n) | O(n log n) | O(n²) | O(log n) | No | No | General-purpose; fastest in practice for random data |
| MergeSort | O(n log n) | O(n log n) | O(n log n) | O(n) | Yes | No | Stability required; linked lists; external sorting |
| HeapSort | O(n log n) | O(n log n) | O(n log n) | O(1) | No | No | Guaranteed O(n log n) with O(1) space; embedded systems |
| TimSort | O(n) | O(n log n) | O(n log n) | O(n) | Yes | Yes | Python/Java stdlib; best for real-world partially-sorted data |
| IntroSort | O(n log n) | O(n log n) | O(n log n) | O(log n) | No | No | C++ STL / .NET stdlib; quicksort with heapsort fallback |
| RadixSort | O(nk) | O(nk) | O(nk) | O(n + k) | Yes | No | Fixed-width integers/strings; k = number of digits |
| CountingSort | O(n + k) | O(n + k) | O(n + k) | O(k) | Yes | No | Small integer range [0, k); subroutine for radix sort |
| BucketSort | O(n + k) | O(n + k) | O(n²) | O(n + k) | Yes* | No | Uniformly distributed floating-point values |
| InsertionSort | O(n) | O(n²) | O(n²) | O(1) | Yes | Yes | Small arrays (n < 20-50); nearly sorted data; online sorting |
| ShellSort | O(n log n) | O(n4/3) | O(n3/2) | O(1) | No | Yes | Embedded systems; better than insertion sort for medium n |
*BucketSort stability depends on the inner sort used. TimSort's O(n) best case occurs when input has existing sorted runs. QuickSort's O(log n) space is for the recursion stack. [src1] [src2] [src7]
Decision Tree
START
|-- Is data already partially sorted or has natural runs?
| |-- YES --> TimSort (stdlib sorted()/list.sort() in Python, Arrays.sort() for objects in Java)
| +-- NO v
|-- Are keys small integers with bounded range [0, k)?
| |-- YES --> k < n? --> CountingSort
| | k >= n but fixed-width? --> RadixSort
| +-- NO v
|-- Are values uniformly distributed floats in known range?
| |-- YES --> BucketSort
| +-- NO v
|-- Is stability required (preserving order of equal elements)?
| |-- YES --> MergeSort (or TimSort via stdlib)
| +-- NO v
|-- Is memory constrained (must be in-place)?
| |-- YES --> Is worst-case guarantee needed?
| | |-- YES --> HeapSort (O(n log n) guaranteed, O(1) space)
| | +-- NO --> IntroSort (quicksort + heapsort fallback)
| +-- NO v
|-- Is dataset very small (n < 50)?
| |-- YES --> InsertionSort
| +-- NO v
+-- DEFAULT --> Use stdlib sort (TimSort or IntroSort depending on language)
Step-by-Step Guide
1. Implement QuickSort with randomized pivot
QuickSort is the most commonly studied sorting algorithm and the basis for many stdlib implementations. The critical detail is pivot selection -- always randomize to avoid O(n²) worst case. [src1]
import random
def quicksort(arr, low=0, high=None):
if high is None:
high = len(arr) - 1
if low < high:
pivot_idx = partition(arr, low, high)
quicksort(arr, low, pivot_idx - 1)
quicksort(arr, pivot_idx + 1, high)
def partition(arr, low, high):
# Randomized pivot avoids O(n^2) on sorted input
pivot_idx = random.randint(low, high)
arr[pivot_idx], arr[high] = arr[high], arr[pivot_idx]
pivot = arr[high]
i = low - 1
for j in range(low, high):
if arr[j] <= pivot:
i += 1
arr[i], arr[j] = arr[j], arr[i]
arr[i + 1], arr[high] = arr[high], arr[i + 1]
return i + 1
Verify: arr = [3,6,8,10,1,2,1]; quicksort(arr); print(arr) --> expected: [1, 1, 2, 3, 6, 8, 10]
2. Implement MergeSort for guaranteed O(n log n) with stability
MergeSort divides the array into halves, sorts each recursively, and merges. It guarantees O(n log n) in all cases and is stable, but uses O(n) extra space. [src1]
def merge_sort(arr):
if len(arr) <= 1:
return arr
mid = len(arr) // 2
left = merge_sort(arr[:mid])
right = merge_sort(arr[mid:])
return merge(left, right)
def merge(left, right):
result = []
i = j = 0
while i < len(left) and j < len(right):
if left[i] <= right[j]: # <= preserves stability
result.append(left[i])
i += 1
else:
result.append(right[j])
j += 1
result.extend(left[i:])
result.extend(right[j:])
return result
Verify: print(merge_sort([38, 27, 43, 3, 9, 82, 10])) --> expected: [3, 9, 10, 27, 38, 43, 82]
3. Implement CountingSort for bounded integer ranges
CountingSort achieves O(n + k) time by counting occurrences rather than comparing. Requires knowing the range of input values. [src6]
def counting_sort(arr, max_val=None):
if not arr:
return arr
if max_val is None:
max_val = max(arr)
count = [0] * (max_val + 1)
for num in arr:
count[num] += 1
output = []
for val in range(max_val + 1):
output.extend([val] * count[val])
return output
Verify: print(counting_sort([4, 2, 2, 8, 3, 3, 1])) --> expected: [1, 2, 2, 3, 3, 4, 8]
4. Implement RadixSort for fixed-width integer keys
RadixSort processes digits from least significant to most significant, using a stable sort (counting sort) as a subroutine. [src6]
def radix_sort(arr):
if not arr:
return arr
max_val = max(arr)
exp = 1
while max_val // exp > 0:
arr = counting_sort_by_digit(arr, exp)
exp *= 10
return arr
def counting_sort_by_digit(arr, exp):
n = len(arr)
output = [0] * n
count = [0] * 10
for num in arr:
idx = (num // exp) % 10
count[idx] += 1
for i in range(1, 10):
count[i] += count[i - 1]
for i in range(n - 1, -1, -1):
idx = (arr[i] // exp) % 10
output[count[idx] - 1] = arr[i]
count[idx] -= 1
return output
Verify: print(radix_sort([170, 45, 75, 90, 802, 24, 2, 66])) --> expected: [2, 24, 45, 66, 75, 90, 170, 802]
Code Examples
Python: stdlib sorting with custom comparators
# Input: list of objects to sort
# Output: sorted list (TimSort: stable, adaptive, O(n log n))
from functools import cmp_to_key
# Basic sort (TimSort under the hood)
nums = [5, 2, 8, 1, 9]
sorted_nums = sorted(nums) # Returns new list
nums.sort() # Sorts in-place
# Sort by key function
students = [("Alice", 85), ("Bob", 92), ("Charlie", 85)]
by_grade = sorted(students, key=lambda s: s[1], reverse=True)
# [('Bob', 92), ('Alice', 85), ('Charlie', 85)]
# Stable: Alice stays before Charlie (both 85)
# Multi-key sort: primary descending grade, secondary ascending name
by_multi = sorted(students, key=lambda s: (-s[1], s[0]))
JavaScript: Array.prototype.sort with gotchas
// Input: array to sort
// Output: sorted array (TimSort in V8/SpiderMonkey since ~2019)
// GOTCHA: Default sort is lexicographic, not numeric!
[10, 9, 8].sort(); // [10, 8, 9] -- WRONG
[10, 9, 8].sort((a, b) => a - b); // [8, 9, 10] -- correct
// Sort objects by property
const items = [
{ name: "Banana", price: 1.50 },
{ name: "Apple", price: 0.99 },
{ name: "Cherry", price: 2.00 }
];
items.sort((a, b) => a.price - b.price);
// Stable sort (guaranteed since ES2019 / V8 7.0+)
// For immutable sort: arr.toSorted() (ES2023+)
Java: Arrays.sort and Collections.sort
// Input: array or List to sort
// Output: sorted in-place
// Arrays.sort(primitives) = Dual-Pivot QuickSort (unstable)
// Arrays.sort(objects) = TimSort (stable)
import java.util.Arrays;
import java.util.Comparator;
int[] primitives = {5, 2, 8, 1};
Arrays.sort(primitives); // Dual-Pivot QuickSort
String[] names = {"Charlie", "Alice", "Bob"};
Arrays.sort(names); // TimSort (stable)
// Custom comparator
Integer[] nums = {5, 2, 8, 1};
Arrays.sort(nums, Comparator.reverseOrder());
Go: sort.Slice and sort.SliceStable
// Input: slice to sort
// Output: sorted in-place
// sort.Slice = IntroSort variant (unstable)
// sort.SliceStable = MergeSort variant (stable)
package main
import (
"fmt"
"sort"
)
func main() {
nums := []int{5, 2, 8, 1, 9}
sort.Ints(nums) // [1, 2, 5, 8, 9]
// Stable sort
type Student struct {
Name string
Grade int
}
s := []Student{{"Alice", 85}, {"Bob", 92}, {"Charlie", 85}}
sort.SliceStable(s, func(i, j int) bool {
return s[i].Grade > s[j].Grade
})
fmt.Println(s) // [{Bob 92} {Alice 85} {Charlie 85}]
}
Anti-Patterns
Wrong: Using bubble sort in production code
# BAD -- O(n^2) average and worst case; never use for real data
def bubble_sort(arr):
n = len(arr)
for i in range(n):
for j in range(0, n - i - 1):
if arr[j] > arr[j + 1]:
arr[j], arr[j + 1] = arr[j + 1], arr[j]
return arr
Correct: Use stdlib sort
# GOOD -- TimSort: O(n log n), stable, adaptive, optimized in C
arr = [5, 3, 8, 1, 2]
arr.sort() # In-place, fast, production-ready
Wrong: JavaScript numeric sort without comparator
// BAD -- default sort is lexicographic
const prices = [100, 25, 3, 1000];
prices.sort();
// Result: [100, 1000, 25, 3] -- WRONG
Correct: Always provide numeric comparator
// GOOD -- explicit numeric comparison
const prices = [100, 25, 3, 1000];
prices.sort((a, b) => a - b);
// Result: [3, 25, 100, 1000]
Wrong: Naive quicksort pivot selection
# BAD -- always picking last element as pivot
# Degrades to O(n^2) on sorted or reverse-sorted input
def bad_partition(arr, low, high):
pivot = arr[high] # Always last element
# sorted input => every partition is maximally unbalanced
Correct: Randomized or median-of-three pivot
# GOOD -- randomized pivot prevents adversarial worst case
import random
def good_partition(arr, low, high):
pivot_idx = random.randint(low, high)
arr[pivot_idx], arr[high] = arr[high], arr[pivot_idx]
pivot = arr[high]
# balanced partitions in expectation
Wrong: Ignoring stability when sorting by multiple keys
# BAD -- using unstable sort then wondering why secondary order is lost
import heapq
data = [("Alice", 85), ("Bob", 85), ("Charlie", 90)]
# heapsort is not stable; Alice/Bob order not preserved within grade 85
Correct: Use stable sort or sort by composite key
# GOOD -- Python's sorted() is stable (TimSort)
data = [("Alice", 85), ("Bob", 85), ("Charlie", 90)]
result = sorted(data, key=lambda x: (-x[1], x[0]))
# [('Charlie', 90), ('Alice', 85), ('Bob', 85)]
Common Pitfalls
- Assuming O(n log n) is always optimal: For small n (< 50), InsertionSort's low overhead often beats QuickSort/MergeSort due to cache locality. That's why TimSort and IntroSort use InsertionSort for small subarrays. Fix: let stdlib handle the threshold (typically 16-64 elements). [src4]
- Using quicksort on nearly-sorted data without safeguards: Naive quicksort with first/last pivot degrades to O(n²). Fix: use randomized pivot or IntroSort which falls back to HeapSort. [src1]
- Choosing radix sort without checking key width: RadixSort is O(nk) where k = digits. For 64-bit integers, k=20 (base 10). If n is small, nk > n log n. Fix: only use radix sort when n >> k, typically n > 10,000 with bounded keys. [src6]
- Forgetting JavaScript sort mutates the array:
Array.prototype.sort()sorts in-place and returns the same reference. Fix: use[...arr].sort()orarr.toSorted()(ES2023+) for a new array. [src1] - Ignoring external sorting for data exceeding RAM: MergeSort is the only standard algorithm that works efficiently with sequential access (disk/tape). Fix: use external merge sort (k-way merge with disk buffers). [src3]
- Assuming all stdlib sorts are the same: Python uses TimSort (stable), Java uses Dual-Pivot QuickSort for primitives (unstable) but TimSort for objects (stable), C++ uses IntroSort (unstable). Fix: check your language's docs for stability guarantees. [src4]
Version History & Compatibility
| Language/Runtime | Default Sort | Stable | Notes |
|---|---|---|---|
| Python 2.3+ (2002) | TimSort | Yes | Created by Tim Peters; adaptive merge sort + insertion sort |
| Java 7+ (2011) | TimSort (objects) / Dual-Pivot QS (primitives) | Objects only | Arrays.sort(int[]) unstable; Arrays.sort(Object[]) stable |
| V8 / Chrome 70+ (2018) | TimSort | Yes | Replaced QuickSort; Array.prototype.sort() now guaranteed stable |
| C++ STL (GCC/Clang) | IntroSort | No | std::sort unstable; std::stable_sort uses MergeSort |
| .NET / C# | IntroSort | No | Array.Sort unstable; OrderBy (LINQ) is stable |
| Go 1.19+ (2022) | pdqsort | No | sort.Slice unstable; sort.SliceStable uses merge sort |
| Swift 5+ (2019) | TimSort | Yes | Replaced IntroSort with TimSort |
| Rust | TimSort variant | Yes | sort() stable; sort_unstable() uses pdqsort |
When to Use / When Not to Use
| Use When | Don't Use When | Use Instead |
|---|---|---|
| General-purpose sorting of any comparable data | Keys are small bounded integers | CountingSort or RadixSort |
| Need stable sort with good average performance | Memory is extremely constrained (embedded) | HeapSort (O(1) space, unstable) |
| Data has natural runs or is partially sorted | Data is truly random and you need minimal overhead | IntroSort (avoids merge overhead) |
| Sorting linked lists or external data on disk | Random access is available and cheap | QuickSort-based in-place sort |
| Fixed-width integer keys with n > 10,000 | Keys are complex objects or variable-length strings | Comparison-based sort with custom comparator |
| Need worst-case O(n log n) guarantee (real-time) | Average case is sufficient | HeapSort or MergeSort |
Important Caveats
- Algorithm benchmarks vary dramatically by hardware, cache size, and data distribution. Theoretical complexity does not capture constant factors -- always benchmark on your actual data if performance is critical. [src7]
- TimSort exploits existing runs in data (common in real-world datasets) and achieves O(n) on fully sorted input. Random benchmark data understates TimSort's practical advantage. [src4]
- Parallel sorting (parallel merge sort, parallel quicksort) can provide significant speedups on multi-core hardware but introduces complexity and overhead for small datasets.
- The choice between stable and unstable sort has correctness implications, not just performance. If you sort records by multiple fields sequentially, an unstable sort on the second field may destroy the ordering from the first sort.
- For string sorting, locale-aware comparison (
localeComparein JS,Collatorin Java) can be significantly slower than byte-wise comparison. Profile before optimizing.