CWE-502 deserialization of untrusted data

- Bottom line: Never deserialize untrusted data using native serialization formats -- use data-only formats like JSON instead, and apply allowlist class filtering as defense-in-depth when native deserialization is unavoidable.

Insecure Deserialization Prevention Guide

How do I prevent insecure deserialization vulnerabilities?

TL;DR

Bottom line: Never deserialize untrusted data using native serialization formats -- use data-only formats like JSON instead, and apply allowlist class filtering as defense-in-depth when native deserialization is unavoidable.
Key tool/command: Replace ObjectInputStream (Java), pickle.loads() (Python), unserialize() (PHP), BinaryFormatter (.NET), Marshal.load (Ruby) with JSON-based alternatives.
Watch out for: Validating data after deserialization is too late -- malicious code executes during the deserialization process itself, not after.
Works with: All languages and frameworks. CWE-502 / OWASP A08:2021 (Software and Data Integrity Failures).

Constraints

NEVER deserialize data from untrusted sources using native serialization formats (Java ObjectInputStream, Python pickle, PHP unserialize, .NET BinaryFormatter, Ruby Marshal)
Class filtering is defense-in-depth only -- denylists are bypassed whenever a new gadget chain is discovered, and even an allowlist can admit a gadget if it is too broad
NEVER use eval(), new Function(), or node-serialize on user-supplied input in any language
Signing/HMAC on serialized data prevents tampering but does NOT prevent exploitation if the signing key is compromised
Input validation AFTER deserialization is too late -- the attack executes during deserialization, not after
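
The last constraint can be seen with a harmless sketch. The class name Payload is hypothetical, and int stands in for the dangerous callable (such as os.system) that a real attack would name. The point is that pickle invokes whatever callable the byte stream specifies inside loads(), so the caller never gets a chance to validate anything first.

```python
import pickle


class Payload:
    """Stand-in for an attacker-controlled object (hypothetical name)."""

    def __reduce__(self):
        # pickle stores "call int('42')" instead of a Payload instance.
        # A real attack would name os.system or a similar callable here.
        return (int, ("42",))


blob = pickle.dumps(Payload())

# The callable runs inside loads(); the caller only sees its return value.
result = pickle.loads(blob)
print(type(result).__name__, result)  # int 42
```

The returned object is not a Payload at all, and the class name does not even appear in the byte stream. Any type check or field validation applied to result runs after the chosen callable has already executed.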

Quick Reference

Deserialization Threat/Fix Checklist by Language

#	Language	Vulnerable API	Risk	Safe Alternative	Notes
1	Java	`ObjectInputStream.readObject()`	Critical -- RCE via gadget chains (ysoserial)	JSON (Jackson/Gson), XML (JAXB), Protobuf	Override resolveClass() for allowlisting if unavoidable
2	Java	`XMLDecoder` with user input	Critical -- arbitrary method invocation	JAXB, StAX, DOM parsers	Never use XMLDecoder with untrusted data
3	Java	`XStream` < 1.4.17	Critical -- RCE via type manipulation	XStream >= 1.4.17 with allowlist, or Jackson	Set XStream.allowTypes() explicitly
4	Python	`pickle.loads()` / `pickle.load()`	Critical -- arbitrary code via __reduce__	`json.loads()`, msgpack, protobuf	No safe way to use pickle with untrusted data
5	Python	`yaml.load()` (PyYAML)	Critical -- code execution via !!python/object	`yaml.safe_load()`	PyYAML >= 6.0 requires an explicit Loader argument; pass SafeLoader or call safe_load()
6	PHP	`unserialize()`	Critical -- object injection via magic methods	`json_decode()` / `json_encode()`	POP chains exploit __wakeup, __destruct
7	.NET	`BinaryFormatter.Deserialize()`	Critical -- RCE, cannot be secured	`System.Text.Json`, `DataContractSerializer`	Removed entirely in .NET 9
8	.NET	`TypeNameHandling != None` (JSON.Net)	High -- type confusion RCE	Set `TypeNameHandling.None`	Default is None; never change for untrusted data
9	Ruby	`Marshal.load()`	Critical -- arbitrary object instantiation	`JSON.parse()`, `Oj.safe_load`	No safe configuration exists
10	Node.js	`node-serialize` / `serialize-to-js`	Critical -- RCE via IIFE injection	`JSON.parse()` / `JSON.stringify()`	Avoid any library that serializes functions
11	Any	`eval()` on serialized strings	Critical -- direct code execution	Language-specific safe parsers	Never eval user input in any context

Decision Tree

START: Does your application deserialize data from untrusted sources?
├── NO → No insecure deserialization risk (verify trust boundaries)
└── YES
    ├── Can you switch to a data-only format (JSON, XML, Protobuf)?
    │   ├── YES → Replace native serialization with JSON/Protobuf (BEST option)
    │   └── NO
    │       ├── Java? → Override resolveClass() with allowlist + use ObjectInputFilter (Java 9+)
    │       ├── Python? → Replace pickle with json.loads(); use yaml.safe_load() for YAML
    │       ├── .NET? → Migrate to System.Text.Json; BinaryFormatter removed in .NET 9
    │       ├── PHP? → Replace unserialize() with json_decode()
    │       ├── Ruby? → Replace Marshal.load with JSON.parse; use Oj.safe_load
    │       └── Node.js? → Use JSON.parse/JSON.stringify only; remove node-serialize
    └── Add defense-in-depth: HMAC signing, integrity checks, monitoring

Step-by-Step Guide

1. Audit your codebase for deserialization sinks

Identify every location where untrusted data is deserialized. Use static analysis or grep for known dangerous patterns. [src2]

# Java: find ObjectInputStream usage
grep -rn 'ObjectInputStream\|readObject\|XMLDecoder\|XStream' --include="*.java" .

# Python: find pickle and unsafe YAML
grep -rn 'pickle\.load\|pickle\.loads\|yaml\.load\|jsonpickle' --include="*.py" .

# PHP: find unserialize calls
grep -rn 'unserialize(' --include="*.php" .

# .NET: find BinaryFormatter and unsafe TypeNameHandling
grep -rn 'BinaryFormatter\|TypeNameHandling' --include="*.cs" .

# Node.js: find node-serialize and eval-based deserialization
grep -rn "node-serialize\|serialize-to-js\|eval(" --include="*.js" .

Verify: Count all findings -- each one is a potential RCE vulnerability.
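
For Python code, grep also matches comments and strings. As a complement, here is a minimal static-analysis sketch built on the standard ast module. The DANGEROUS_CALLS set and the find_sinks name are assumptions made for this illustration, not part of any tool. The sketch only recognizes direct module.function calls, so aliased imports (import pickle as p, from pickle import loads) are missed, and yaml.load is flagged even when it is given a safe Loader.

```python
import ast

# Hypothetical sink list for this sketch; extend it for your codebase.
DANGEROUS_CALLS = {
    ("pickle", "load"),
    ("pickle", "loads"),
    ("yaml", "load"),
    ("yaml", "unsafe_load"),
    ("jsonpickle", "decode"),
}


def find_sinks(source: str, filename: str = "<string>") -> list:
    """Return sorted (line, call) pairs for calls listed in DANGEROUS_CALLS."""
    findings = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Match only the simple "module.function(...)" shape.
        if (
            isinstance(func, ast.Attribute)
            and isinstance(func.value, ast.Name)
            and (func.value.id, func.attr) in DANGEROUS_CALLS
        ):
            findings.append((node.lineno, f"{func.value.id}.{func.attr}"))
    return sorted(findings)


sample = '''
import pickle, json
# pickle.loads(x) in a comment is ignored
data = pickle.loads(blob)
safe = json.loads(text)
'''
print(find_sinks(sample))  # [(4, 'pickle.loads')]
```

Treat the result as a starting list for manual review, not as proof that a file is clean.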

2. Replace native serialization with JSON

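Switch from native object serialization to data-only formats. A plain JSON parser produces only strings, numbers, booleans, null, lists, and maps, so parsing never instantiates arbitrary classes or runs code. This holds only while polymorphic typing features (Jackson default typing, JSON.Net TypeNameHandling, jsonpickle) stay disabled. [src1]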

# Python: BEFORE (vulnerable)
import pickle
data = pickle.loads(user_input)  # RCE vulnerability

# Python: AFTER (safe)
import json
data = json.loads(user_input)  # Data only, no code execution

Verify: Run audit commands again -- dangerous patterns should be eliminated.

3. Apply allowlist filtering when native deserialization is unavoidable

Restrict which classes can be deserialized using built-in filtering mechanisms. [src1]

// Java 9+: ObjectInputFilter (built-in)
ObjectInputFilter filter = ObjectInputFilter.Config.createFilter(
    "com.myapp.dto.*;!*"  // Allow only com.myapp.dto, deny all else
);
ObjectInputStream ois = new ObjectInputStream(inputStream);
ois.setObjectInputFilter(filter);

Verify: Attempt to deserialize a non-allowlisted class -- should throw InvalidClassException.

4. Sign serialized data with HMAC

When you must serialize data that will be stored or transmitted, sign it to detect tampering. [src3]

import hmac, hashlib, json
SECRET_KEY = b'your-secret-key-from-env'

def serialize_signed(data: dict) -> str:
    payload = json.dumps(data, sort_keys=True)
    sig = hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}|{sig}"

def deserialize_verified(signed_data: str) -> dict:
    payload, sig = signed_data.rsplit('|', 1)
    expected = hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        raise ValueError("Tampered data detected")
    return json.loads(payload)

Verify: Modify any byte in the signed payload -- should raise ValueError.

5. Monitor and log deserialization events

Add observability to detect exploitation attempts in production. [src2]

// Assumes an SLF4J-style `logger` and a Set<String> ALLOWED_CLASSES defined elsewhere
ObjectInputFilter loggingFilter = filterInfo -> {
    Class<?> clazz = filterInfo.serialClass();
    if (clazz != null) {
        logger.info("Deserialization attempt: {}", clazz.getName());
        if (!ALLOWED_CLASSES.contains(clazz.getName())) {
            logger.warn("BLOCKED: {}", clazz.getName());
            return ObjectInputFilter.Status.REJECTED;
        }
    }
    return ObjectInputFilter.Status.ALLOWED;
};

Verify: Check logs for BLOCKED entries after deploying the filter.
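
A rough Python counterpart, for legacy code that still unpickles data while a migration is in progress, is an audit hook. CPython 3.8+ raises the pickle.find_class audit event with the arguments (module, name) each time an unpickler resolves a global. In this sketch the logger name, the ALLOWED set, and the seen list are assumptions made for illustration. The hook only observes and logs; it is not a substitute for removing pickle from untrusted paths.

```python
import logging
import pickle
import sys
from collections import OrderedDict

logger = logging.getLogger("deserialization")
ALLOWED = {"collections.OrderedDict"}  # hypothetical allowlist
seen = []  # kept only so the sketch can show what was observed


def audit_hook(event, args):
    # Ignore every audit event except global lookups made by unpicklers.
    if event != "pickle.find_class":
        return
    module, name = args
    qualified = f"{module}.{name}"
    seen.append(qualified)
    if qualified in ALLOWED:
        logger.info("Deserialization attempt: %s", qualified)
    else:
        logger.warning("UNEXPECTED: %s", qualified)


# Audit hooks stay installed for the life of the interpreter.
sys.addaudithook(audit_hook)

# Round-trip a benign object to trigger one lookup.
pickle.loads(pickle.dumps(OrderedDict(a=1)))
print(seen)
```

Because the hook is process-wide, it also covers unpickling done inside third-party libraries, which grep-based audits of your own code would not reveal.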

Code Examples

Python: Safe Data Exchange with JSON

# Input:  Untrusted user data (API request body, file upload)
# Output: Validated Python dict

import json

def deserialize_user_data(raw: str) -> dict:
    try:
        data = json.loads(raw)  # Safe -- data only
    except json.JSONDecodeError:
        raise ValueError("Invalid JSON input")
    if not isinstance(data, dict):
        raise ValueError("Expected JSON object")
    return data

# UNSAFE -- NEVER use with untrusted data:
# pickle.loads(raw)       -- arbitrary code execution
# yaml.load(raw)          -- code exec via !!python/object
# jsonpickle.decode(raw)  -- code exec via py/object
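
Parsing JSON safely still leaves the shape of the data unchecked. The sketch below binds the parsed dict to a fixed dataclass and rejects anything unexpected. UserPrefs, its two fields, and the 1 to 100 range are hypothetical and exist only for illustration; a real schema would come from your application.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class UserPrefs:  # hypothetical schema for illustration
    theme: str
    page_size: int


def parse_user_prefs(raw: str) -> UserPrefs:
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError("Invalid JSON input") from exc
    if not isinstance(data, dict):
        raise ValueError("Expected JSON object")
    # Reject unknown or missing fields instead of silently ignoring them.
    if set(data) != {"theme", "page_size"}:
        raise ValueError("Unexpected or missing fields")
    theme, page_size = data["theme"], data["page_size"]
    if not isinstance(theme, str):
        raise ValueError("theme must be a string")
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(page_size, bool) or not isinstance(page_size, int):
        raise ValueError("page_size must be an integer")
    if not 1 <= page_size <= 100:
        raise ValueError("page_size out of range")
    return UserPrefs(theme=theme, page_size=page_size)


prefs = parse_user_prefs('{"theme": "dark", "page_size": 20}')
print(prefs)  # UserPrefs(theme='dark', page_size=20)
```

The caller chooses the target type in code, so nothing in the input can select which class gets constructed.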

Java: Jackson with Strict Type Binding

// Input:  Untrusted JSON string from HTTP request
// Output: Validated DTO object

import com.fasterxml.jackson.databind.ObjectMapper;  // 2.15+
import com.fasterxml.jackson.databind.DeserializationFeature;

public class SafeDeserializer {
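    // Binds only to the concrete DTO class passed by the caller.
    // Unknown JSON fields are ignored; remove the disable() call to reject them.
    private static final ObjectMapper mapper = new ObjectMapper()
        .disable(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES);
        // CRITICAL: never enable polymorphic default typing
        // mapper.enableDefaultTyping() (deprecated since 2.10) <-- NEVER DO THIS
        // mapper.activateDefaultTyping(...) with a permissive validator is just as unsafe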

    public static <T> T deserialize(String json, Class<T> type) {
        try {
            return mapper.readValue(json, type);
        } catch (Exception e) {
            throw new IllegalArgumentException("Invalid input", e);
        }
    }
}

Node.js: Safe JSON Parsing with Prototype Pollution Protection

// Input:  Untrusted string from HTTP request body
// Output: Validated JavaScript object

function safeDeserialize(raw) {
  try {
    // The reviver runs for every key at every depth. Returning undefined
    // drops the property, so polluting keys never reach the result,
    // including keys nested inside child objects.
    return JSON.parse(raw, (key, value) =>
      key === '__proto__' || key === 'constructor' || key === 'prototype'
        ? undefined
        : value
    );
  } catch (e) {
    throw new Error('Invalid JSON input');
  }
}

// UNSAFE -- NEVER use with untrusted data:
// require('node-serialize').unserialize(raw)  -- IIFE RCE
// eval('(' + raw + ')')                       -- direct RCE

Anti-Patterns

Wrong: Using pickle to deserialize user-uploaded files

# BAD -- pickle executes arbitrary code during deserialization
import pickle

def load_user_config(uploaded_file):
    return pickle.load(uploaded_file)
# Attacker crafts payload with __reduce__ running os.system("rm -rf /")

Correct: Use JSON for user data exchange

# GOOD -- JSON is data-only, no code execution possible
import json

def load_user_config(uploaded_file):
    return json.load(uploaded_file)

Wrong: Java ObjectInputStream without class filtering

// BAD -- any class on the classpath can be instantiated
ObjectInputStream ois = new ObjectInputStream(request.getInputStream());
Command cmd = (Command) ois.readObject();
// ysoserial gadget chains execute before the cast is checked

Correct: Java ObjectInputFilter with strict allowlist

// GOOD -- only explicitly allowed classes can deserialize
ObjectInputStream ois = new ObjectInputStream(request.getInputStream());
ois.setObjectInputFilter(ObjectInputFilter.Config.createFilter(
    "com.myapp.dto.Command;!*"
));
Command cmd = (Command) ois.readObject();

Wrong: PHP unserialize on cookie data

// BAD -- user controls cookie content, enabling object injection
$preferences = unserialize($_COOKIE['prefs']);
// Attacker sets cookie to exploit __wakeup() or __destruct()

Correct: PHP json_decode for cookie data

// GOOD -- JSON cannot instantiate PHP objects
$preferences = json_decode($_COOKIE['prefs'] ?? '', true);
if (!is_array($preferences)) {
    // Missing cookie, invalid JSON, or a non-object value: fall back to defaults
    $preferences = [];
}

Wrong: .NET BinaryFormatter for session state

// BAD -- BinaryFormatter is inherently unsafe, removed in .NET 9
BinaryFormatter formatter = new BinaryFormatter();
object session = formatter.Deserialize(stream);
// Microsoft: "equivalent to launching an executable"

Correct: .NET System.Text.Json for data exchange

// GOOD -- System.Text.Json deserializes to known types only
using System.Text.Json;
var session = JsonSerializer.Deserialize<SessionData>(stream);

Wrong: Ruby Marshal.load on user input

# BAD -- Marshal can instantiate any Ruby class
data = Marshal.load(Base64.decode64(params[:data]))
# Universal RCE gadget chain exists for Ruby 2.x+

Correct: Ruby JSON.parse for user data

# GOOD -- JSON produces only primitive types
require 'json'
data = JSON.parse(params[:data])

Common Pitfalls

Trusting internal services blindly: Deserialization between microservices is still dangerous if any service is compromised. Fix: Use JSON/Protobuf even for internal communication; apply zero-trust principles. [src3]
Denylist-based class filtering: Blocking known gadget classes (e.g., InvokerTransformer) fails when new chains are discovered. Fix: Use strict allowlisting -- permit only the exact classes you need. [src4]
Signing native serialized data: HMAC prevents tampering, but if the signed format is native (pickle, Java serialization), a leaked or compromised key still means full RCE. Fix: Sign JSON data, not native serialized objects. [src1]
YAML safe_load misconception: yaml.safe_load() is safe, but yaml.load(data, Loader=yaml.Loader) and yaml.unsafe_load() are not. Fix: Always use yaml.safe_load(), or yaml.load(data, Loader=yaml.CSafeLoader) when the libyaml bindings are available. [src1]
JSON.Net TypeNameHandling: Setting TypeNameHandling to All, Auto, Objects, or Arrays enables type confusion attacks. Fix: Keep TypeNameHandling.None (the default). [src5]
Django PickleSerializer for sessions: Deprecated in Django 4.1; allows RCE if session data is attacker-controlled. Fix: Use JSONSerializer (default since Django 1.6). [src1]
Node.js eval-based parsing: Using eval('(' + data + ')') to parse "JSON-like" strings enables arbitrary code execution. Fix: Always use JSON.parse(). [src6]
Deserializing ML model files (pickle): Loading .pkl model files from untrusted sources is a major supply chain risk. Fix: Use ONNX, SafeTensors, or PMML formats instead. [src2]

Diagnostic Commands

# Scan Java project for deserialization sinks
grep -rn 'ObjectInputStream\|readObject\|XMLDecoder\|XStream' --include="*.java" .

# Scan Python project for pickle and unsafe YAML
grep -rn 'pickle\.\|yaml\.load\|yaml\.unsafe_load\|jsonpickle' --include="*.py" .

# Scan PHP project for unserialize
grep -rn 'unserialize(' --include="*.php" .

# Scan .NET project for BinaryFormatter
grep -rn 'BinaryFormatter\|TypeNameHandling' --include="*.cs" .

# Scan Node.js project for dangerous patterns
grep -rn "node-serialize\|serialize-to-js\|eval(" --include="*.js" .

# Check Java dependencies for known gadget libraries
mvn dependency:tree | grep -E 'commons-collections|spring-beans|groovy'

# Run npm audit for deserialization vulnerabilities
npm audit 2>/dev/null | grep -i "deserializ\|prototype pollution"

# Test with ysoserial (Java -- penetration testing only)
java -jar ysoserial.jar CommonsCollections1 'id' | base64

Version History & Compatibility

Standard/Tool	Version	Status	Key Feature
OWASP Top 10	2021	Current	A08: Software and Data Integrity Failures
OWASP Top 10	2017	Previous	A8: Insecure Deserialization (dedicated category)
CWE-502	4.19	Current	Deserialization of Untrusted Data
.NET BinaryFormatter	.NET 9	Removed	Throws PlatformNotSupportedException
.NET BinaryFormatter	.NET 7-8	Obsolete	Warning on use; opt-in still available
Java ObjectInputFilter	Java 9+	Current	Built-in class allowlist/denylist
Java strong encapsulation (JEP 403)	Java 17+	Current	JDK internals closed to reflection by default, which breaks some reflection-based gadget chains
PyYAML	6.0+	Current	yaml.load() requires an explicit Loader argument
XStream	1.4.17+	Current	Built-in allowlist via allowTypes()
Jackson	2.10+	Current	Default typing is opt-in; activateDefaultTyping() takes a PolymorphicTypeValidator

When to Use / When Not to Use

Use When	Don't Use When	Use Instead
Application accepts serialized objects from users (cookies, forms, APIs)	All data exchange uses JSON/XML/Protobuf exclusively	Standard input validation
Legacy system uses Java ObjectInputStream for RPC	Building a new service with modern frameworks	gRPC + Protobuf, REST + JSON
Session state stored in native serialization format	Sessions use signed JWTs or encrypted cookies	JWT/session library with JSON serializer
ML models are loaded from untrusted sources (pickle files)	Models from your own pipeline with integrity verification	SafeTensors, ONNX format

Important Caveats

Java ObjectInputFilter (Java 9+) is defense-in-depth, not a complete solution -- novel gadget chains can bypass filters if the allowlist is too broad
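Python pickle has NO safe mode -- overriding Unpickler.find_class() (the RestrictedUnpickler example in the Python docs) narrows the attack surface but can be bypassed when the allowlist is too broad, so it should not be relied upon for security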
.NET BinaryFormatter was removed in .NET 9 (Nov 2024) -- applications targeting .NET 9+ will get PlatformNotSupportedException at runtime
HMAC signing protects integrity but not confidentiality -- sensitive data in serialized payloads should also be encrypted
Even JSON is not immune to all attacks -- prototype pollution (JavaScript) and billion laughs (XML) require separate defenses
AI/ML model files in pickle format (.pkl, .pt) are a growing supply chain attack vector -- CVE-2024-37052 demonstrated RCE via malicious pickled model files