Concurrency & Parallelism Patterns by Language

Type: Software Reference | Confidence: 0.92 | Sources: 7 | Verified: 2026-02-24 | Freshness: 2026-02-24

TL;DR

Constraints

Quick Reference

| Language | Concurrency Model | Parallelism Model | GIL / Limits | Best For |
|---|---|---|---|---|
| Python | asyncio (coroutines), threading | multiprocessing, concurrent.futures | GIL blocks CPU parallelism in threads | I/O: asyncio; CPU: multiprocessing |
| JavaScript (Node.js) | Event loop, Promises, async/await | worker_threads, child_process, cluster | Single-threaded event loop | I/O: async/await; CPU: worker_threads |
| Go | Goroutines + channels (CSP model) | Goroutines across OS threads (GOMAXPROCS) | None -- true parallelism by default | Both I/O and CPU-bound work |
| Java | Virtual threads (Project Loom), CompletableFuture | ForkJoinPool, parallel streams, platform threads | None -- true parallelism | I/O: virtual threads; CPU: ForkJoinPool |
| Rust | async/await + tokio/smol runtimes | std::thread, Rayon (data parallelism) | None -- ownership prevents data races at compile time | I/O: tokio; CPU: Rayon or std::thread |
| C# | async/await, Task, ValueTask | Parallel.ForEach, Task.Run, PLINQ | None -- true parallelism | I/O: async/await; CPU: Parallel/TPL |

Concurrency Primitives Comparison

| Primitive | Python | Node.js | Go | Java | Rust | C# |
|---|---|---|---|---|---|---|
| Coroutine/task | async def | async function | go func() | Thread.startVirtualThread() | tokio::spawn() | Task.Run() |
| Channel/queue | asyncio.Queue | N/A (use streams) | chan | BlockingQueue | tokio::sync::mpsc | Channel<T> |
| Mutex/lock | threading.Lock | N/A (single-threaded) | sync.Mutex | synchronized / ReentrantLock | std::sync::Mutex<T> | lock / SemaphoreSlim |
| Atomic | N/A | Atomics (SharedArrayBuffer) | sync/atomic | AtomicInteger etc. | std::sync::atomic | Interlocked |
| Thread pool | ThreadPoolExecutor | Worker pool (manual) | Runtime-managed | newVirtualThreadPerTaskExecutor() | tokio runtime | ThreadPool / TPL |
| Parallel loop | multiprocessing.Pool.map() | Promise.all() (I/O) | errgroup.Group | parallelStream() | rayon::par_iter() | Parallel.ForEach() |
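The channel/queue row above maps to asyncio.Queue in Python. A minimal producer/consumer sketch (the produce/consume names and sentinel convention are illustrative, not a standard API):

```python
import asyncio

async def produce(queue: asyncio.Queue, n: int) -> None:
    # Put n items on the queue, then a sentinel (None) to signal completion.
    for i in range(n):
        await queue.put(i)
    await queue.put(None)

async def consume(queue: asyncio.Queue) -> list[int]:
    # Drain the queue until the sentinel arrives, doubling each item.
    results = []
    while (item := await queue.get()) is not None:
        results.append(item * 2)
    return results

async def main() -> list[int]:
    queue: asyncio.Queue = asyncio.Queue(maxsize=4)  # bounded, like a buffered channel
    producer = asyncio.create_task(produce(queue, 5))
    consumer = asyncio.create_task(consume(queue))
    await producer
    return await consumer

print(asyncio.run(main()))  # → [0, 2, 4, 6, 8]
```

The bounded queue gives backpressure for free: the producer blocks on put() once the buffer is full, the same role a buffered chan plays in Go.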

Decision Tree

START
|-- Is the workload I/O-bound (network, disk, database)?
|   |-- YES
|   |   |-- Python? --> Use asyncio with async/await
|   |   |-- Node.js? --> Use Promises / async/await (built-in event loop)
|   |   |-- Go? --> Use goroutines + channels
|   |   |-- Java? --> Use virtual threads (Java 21+)
|   |   |-- Rust? --> Use tokio async runtime
|   |   +-- C#? --> Use async/await with Task
|   +-- NO (CPU-bound) --> continue below
|-- Is the workload CPU-bound (computation, data processing)?
|   |-- YES
|   |   |-- Python? --> Use multiprocessing or ProcessPoolExecutor
|   |   |-- Node.js? --> Use worker_threads with a worker pool
|   |   |-- Go? --> Use goroutines (parallel by default with GOMAXPROCS > 1)
|   |   |-- Java? --> Use ForkJoinPool or parallel streams
|   |   |-- Rust? --> Use Rayon for data parallelism or std::thread
|   |   +-- C#? --> Use Parallel.ForEach or Task.Run
|   +-- NO --> continue below
+-- Mixed workload? --> Separate I/O and CPU layers; use async for I/O,
    offload CPU to worker pool/process
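The mixed-workload branch can be sketched in Python: async/await handles the I/O stage while the CPU stage is shipped to a process pool via run_in_executor. The helper names (io_then_cpu, cpu_heavy) and the sleep standing in for real I/O are illustrative:

```python
import asyncio
from concurrent.futures import ProcessPoolExecutor

def cpu_heavy(n: int) -> int:
    # CPU-bound stage: runs in a separate process, outside the GIL.
    return sum(i * i for i in range(n))

async def io_then_cpu(pool: ProcessPoolExecutor, n: int) -> int:
    await asyncio.sleep(0.01)  # stand-in for an I/O call (HTTP, DB, ...)
    loop = asyncio.get_running_loop()
    # Offload the CPU stage so the event loop stays responsive.
    return await loop.run_in_executor(pool, cpu_heavy, n)

async def main() -> list[int]:
    with ProcessPoolExecutor() as pool:
        return await asyncio.gather(*(io_then_cpu(pool, n) for n in (10, 100)))

if __name__ == "__main__":
    print(asyncio.run(main()))  # → [285, 328350]
```

The same layering applies in the other languages: keep the event loop (or goroutine scheduler) for waiting, and hand computation to whatever true-parallelism primitive the language offers.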

Step-by-Step Guide

1. Identify workload type

Determine whether your bottleneck is I/O-bound (waiting for network/disk) or CPU-bound (processing data). Profile first. [src1]

# Python: profile to see where time is spent
python -m cProfile -s cumtime your_script.py

# Node.js: use built-in profiler
node --prof your_script.js
node --prof-process isolate-0x*.log > profile.txt

Verify: Look at output -- if most time is in socket.recv, http.get, or file I/O calls, your workload is I/O-bound.

2. Choose the right concurrency primitive

Match your language and workload type to the Quick Reference table above. [src2]

Verify: Run a benchmark with 100 concurrent tasks -- you should see near-linear scaling for I/O-bound work with async.
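One way to run that check in Python, with a sleep standing in for real I/O (timings are approximate and machine-dependent):

```python
import asyncio
import time

async def fake_io() -> None:
    await asyncio.sleep(0.1)  # stand-in for a network/disk call

async def run_concurrent(n: int) -> float:
    # Launch n "I/O" calls concurrently and return the wall-clock time.
    start = time.perf_counter()
    await asyncio.gather(*(fake_io() for _ in range(n)))
    return time.perf_counter() - start

# 100 concurrent 0.1 s calls should finish in roughly 0.1 s, not ~10 s;
# if the total approaches n * 0.1 s, the tasks are running sequentially.
elapsed = asyncio.run(run_concurrent(100))
print(f"{elapsed:.2f}s")
```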

3. Implement structured concurrency

Group related concurrent tasks and ensure all complete (or fail) together. [src4]

# Python: structured concurrency with asyncio.TaskGroup (3.11+)
# (runs inside an async def, with asyncio imported)
async with asyncio.TaskGroup() as tg:
    task1 = tg.create_task(fetch_url(url1))
    task2 = tg.create_task(fetch_url(url2))
# Both tasks are guaranteed complete or cancelled here

Verify: If any task raises an exception, the entire group is cancelled.

4. Add error handling and cancellation

Every concurrent task must handle errors and support cancellation. Never fire-and-forget. [src7]

// Go: use errgroup for structured error handling
// import "golang.org/x/sync/errgroup"
g, ctx := errgroup.WithContext(context.Background())
g.Go(func() error { return fetchURL(ctx, url1) })
g.Go(func() error { return fetchURL(ctx, url2) })
if err := g.Wait(); err != nil {
    log.Fatal(err) // first error cancels all goroutines via ctx
}

Verify: Introduce a deliberate error in one task -- confirm all others are cancelled.

Code Examples

Python: Async I/O with asyncio

# Input:  List of URLs to fetch concurrently
# Output: List of response bodies

import asyncio
import aiohttp  # aiohttp>=3.9

async def fetch_all(urls: list[str]) -> list[str]:
    async with aiohttp.ClientSession() as session:
        async def fetch(url: str) -> str:
            async with session.get(url) as resp:
                return await resp.text()
        return await asyncio.gather(*[fetch(u) for u in urls])

results = asyncio.run(fetch_all(["https://example.com"] * 10))

Python: CPU Parallelism with multiprocessing

# Input:  List of numbers to compute (CPU-bound)
# Output: List of results computed in parallel

from concurrent.futures import ProcessPoolExecutor
import math

def heavy_computation(n: int) -> float:
    return sum(math.sqrt(i) for i in range(n))

if __name__ == "__main__":  # required: worker processes re-import this module
    with ProcessPoolExecutor() as executor:
        results = list(executor.map(heavy_computation, [10**7] * 8))

Node.js: Worker Threads for CPU-bound work

// Input:  CPU-intensive computation
// Output: Result from worker thread

const { Worker, isMainThread, parentPort } = require("worker_threads");

if (isMainThread) {
  const worker = new Worker(__filename);
  worker.on("message", (result) => console.log("Result:", result));
  worker.postMessage({ n: 1e8 });
} else {
  parentPort.on("message", ({ n }) => {
    let sum = 0;
    for (let i = 0; i < n; i++) sum += Math.sqrt(i);
    parentPort.postMessage(sum);
  });
}

Go: Goroutines with Channels

// Input:  List of URLs to fetch concurrently
// Output: Collected results via channel

package main

import (
    "fmt"
    "net/http"
    "sync"
)

func fetchURL(url string, ch chan<- string, wg *sync.WaitGroup) {
    defer wg.Done()
    resp, err := http.Get(url)
    if err != nil {
        ch <- fmt.Sprintf("error: %v", err)
        return
    }
    defer resp.Body.Close()
    ch <- fmt.Sprintf("%s: %d", url, resp.StatusCode)
}

func main() {
    urls := []string{"https://go.dev", "https://pkg.go.dev"}
    ch := make(chan string, len(urls))
    var wg sync.WaitGroup
    for _, url := range urls {
        wg.Add(1)
        go fetchURL(url, ch, &wg)
    }
    go func() { wg.Wait(); close(ch) }()
    for result := range ch { fmt.Println(result) }
}

Java: Virtual Threads (Java 21+)

// Input:  List of tasks to run concurrently
// Output: Collected results via structured concurrency

// Virtual threads -- do NOT pool them, create fresh per task
try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
    List<Future<String>> futures = List.of(
        executor.submit(() -> HttpClient.newHttpClient()
            .send(HttpRequest.newBuilder(URI.create("https://example.com")).build(),
                  HttpResponse.BodyHandlers.ofString()).body())
    );
    for (var f : futures) {
        var body = f.get();
        System.out.println(body.substring(0, Math.min(100, body.length())));
    }
}

Rust: Tokio Async Runtime

// Input:  List of URLs to fetch concurrently
// Output: Collected response statuses

// Cargo.toml: tokio = { version = "1", features = ["full"] }
//             reqwest = { version = "0.12", features = ["json"] }

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let urls = vec!["https://httpbin.org/get"; 5];
    let mut handles = vec![];
    for url in urls {
        handles.push(tokio::spawn(async move {
            let resp = reqwest::get(url).await?;
            Ok::<_, reqwest::Error>(resp.status())
        }));
    }
    for handle in handles {
        println!("Status: {}", handle.await??);
    }
    Ok(())
}

Anti-Patterns

Wrong: Shared mutable state without synchronization

# BAD -- race condition with shared counter across threads
counter = 0
def increment():
    global counter
    for _ in range(1_000_000):
        counter += 1  # not atomic -- lost updates

Correct: Use a lock or atomic operation

# GOOD -- use a Lock for thread-safe mutation
import threading
counter = 0
lock = threading.Lock()
def increment():
    global counter
    for _ in range(1_000_000):
        with lock:
            counter += 1

Wrong: Blocking the event loop

// BAD -- blocks the entire Node.js event loop
app.get("/compute", (req, res) => {
  let sum = 0;
  for (let i = 0; i < 1e9; i++) sum += Math.sqrt(i);
  res.json({ sum });
});

Correct: Offload CPU work to a worker thread

// GOOD -- offload to worker_threads
const { Worker } = require("worker_threads");
app.get("/compute", (req, res) => {
  const worker = new Worker("./compute-worker.js");
  worker.on("message", (sum) => res.json({ sum }));
  worker.on("error", (err) => res.status(500).json({ error: err.message }));
});

Wrong: Pooling Java virtual threads

// BAD -- defeats the purpose of virtual threads
ExecutorService pool = Executors.newFixedThreadPool(100);
pool.submit(() -> blockingIO()); // wastes platform threads

Correct: Use virtual-thread-per-task executor

// GOOD -- virtual threads are cheap; create one per task
try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
    executor.submit(() -> blockingIO()); // millions of these are fine
}

Wrong: Python threads for CPU-bound work

# BAD -- GIL prevents true parallel execution of CPU-bound threads
from concurrent.futures import ThreadPoolExecutor
with ThreadPoolExecutor(max_workers=8) as executor:
    results = list(executor.map(heavy_cpu_work, data))
    # often no faster than single-threaded, and can be slower, due to GIL contention

Correct: Use ProcessPoolExecutor for CPU-bound work

# GOOD -- separate processes bypass the GIL
from concurrent.futures import ProcessPoolExecutor
with ProcessPoolExecutor(max_workers=8) as executor:
    results = list(executor.map(heavy_cpu_work, data))
    # true parallelism across CPU cores

Common Pitfalls

Version History & Compatibility

| Language/Runtime | Version | Concurrency Milestone | Notes |
|---|---|---|---|
| Python | 3.4 (2014) | asyncio module added | Basic event loop |
| Python | 3.5 (2015) | async/await syntax | Native coroutine syntax |
| Python | 3.11 (2022) | asyncio.TaskGroup | Structured concurrency |
| Python | 3.13 (2024) | Free-threaded build (experimental) | Optional GIL removal |
| Node.js | 10.5 (2018) | worker_threads module | Experimental |
| Node.js | 12 (2019) | worker_threads stable | Production-ready |
| Go | 1.0 (2012) | Goroutines + channels | Core feature since inception |
| Java | 21 (2023) | Virtual threads GA (Project Loom) | Replaces thread pools for I/O |
| Rust | 1.39 (2019) | async/await stabilized | Requires external runtime |
| Rust | tokio 1.0 (2020) | Tokio 1.0 stable | De facto async runtime |
| C# | .NET 4.5 (2012) | async/await, Task | TPL-based |
| C# | .NET 6 (2021) | Parallel.ForEachAsync | Async parallel loops |

When to Use / When Not to Use

| Use When | Don't Use When | Use Instead |
|---|---|---|
| Many I/O operations need to run concurrently | Simple sequential script with one I/O call | Synchronous code |
| CPU-bound work can be split into independent chunks | Task requires shared mutable state across workers | Single-threaded with optimized algorithm |
| Handling thousands of concurrent connections | Low request volume (<100 concurrent) | Simple thread-per-request or synchronous handler |
| Background processing while main thread stays responsive | Computation is inherently sequential | Pipeline/streaming pattern |
| Need to saturate multi-core CPU for batch processing | Overhead of spawning exceeds computation time | Vectorized operations (NumPy, SIMD) |
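The table's last row is worth a sketch: when per-item work is tiny, vectorization usually beats parallelism, because worker spawn and serialization overhead dominates. This comparison assumes NumPy is installed; the function names are illustrative:

```python
import numpy as np

# Loop version: per-element Python overhead dominates, so splitting it
# across worker processes mostly parallelizes the overhead.
def loop_sqrt_sum(n: int) -> float:
    return sum(i ** 0.5 for i in range(n))

# Vectorized version: one C-level pass over a contiguous array,
# no workers, no pickling, and SIMD-friendly.
def vector_sqrt_sum(n: int) -> float:
    return float(np.sqrt(np.arange(n)).sum())
```

Benchmark the vectorized version first; reach for a process pool only when a single vectorized pass still isn't fast enough.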

Important Caveats

Related Units