Insecure Deserialization Prevention Guide
How do I prevent insecure deserialization vulnerabilities?
TL;DR
- Bottom line: Never deserialize untrusted data using native serialization formats -- use data-only formats like JSON instead, and apply allowlist class filtering as defense-in-depth when native deserialization is unavoidable.
- Key tool/command: Replace ObjectInputStream (Java), pickle.loads() (Python), unserialize() (PHP), BinaryFormatter (.NET), and Marshal.load (Ruby) with JSON-based alternatives.
- Watch out for: Validating data after deserialization is too late -- malicious code executes during the deserialization process itself, not after.
- Works with: All languages and frameworks. CWE-502 / OWASP A08:2021 (Software and Data Integrity Failures).
Constraints
- NEVER deserialize data from untrusted sources using native serialization formats (Java ObjectInputStream, Python pickle, PHP unserialize, .NET BinaryFormatter, Ruby Marshal)
- Allowlist-based class filtering is defense-in-depth only -- it narrows the attack surface but does not make native deserialization safe; denylists are weaker still, because newly discovered gadget chains bypass them
- NEVER use eval(), new Function(), or node-serialize on user-supplied input in any language
- Signing/HMAC on serialized data prevents tampering but does NOT prevent exploitation if the signing key is compromised
- Input validation AFTER deserialization is too late -- the attack executes during deserialization, not after
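The last constraint can be seen in a harmless experiment. The sketch below uses a made-up class named Innocent whose __reduce__ method tells the unpickler to call int("42"). The call happens inside pickle.loads(), so the caller receives the result of that call rather than an Innocent instance, and any validation written after the load runs too late. The benign int stands in for whatever callable an attacker would choose.

```python
import pickle


class Innocent:
    """Hypothetical class used only for this demonstration."""

    def __reduce__(self):
        # Tells pickle: "to rebuild me, call int('42')".
        # An attacker would name a dangerous callable here instead.
        return (int, ("42",))


blob = pickle.dumps(Innocent())

# The callable named in the payload runs *during* loads().
result = pickle.loads(blob)

# Any check placed here is already too late: the call has happened,
# and the object we got back is not even an Innocent instance.
print(type(result).__name__, result)
```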
Quick Reference
Deserialization Threat/Fix Checklist by Language
| # | Language | Vulnerable API | Risk | Safe Alternative | Notes |
|---|---|---|---|---|---|
| 1 | Java | ObjectInputStream.readObject() | Critical -- RCE via gadget chains (ysoserial) | JSON (Jackson/Gson), XML (JAXB), Protobuf | Override resolveClass() for allowlisting if unavoidable |
| 2 | Java | XMLDecoder with user input | Critical -- arbitrary method invocation | JAXB, StAX, DOM parsers | Never use XMLDecoder with untrusted data |
| 3 | Java | XStream < 1.4.17 | Critical -- RCE via type manipulation | XStream >= 1.4.17 with allowlist, or Jackson | Set XStream.allowTypes() explicitly |
| 4 | Python | pickle.loads() / pickle.load() | Critical -- arbitrary code via __reduce__ | json.loads(), msgpack, protobuf | No safe way to use pickle with untrusted data |
| 5 | Python | yaml.load() (PyYAML) | Critical -- code execution via !!python/object | yaml.safe_load() | PyYAML >= 6.0 requires an explicit Loader argument; yaml.Loader and yaml.UnsafeLoader are still unsafe |
| 6 | PHP | unserialize() | Critical -- object injection via magic methods | json_decode() / json_encode() | POP chains exploit __wakeup, __destruct |
| 7 | .NET | BinaryFormatter.Deserialize() | Critical -- RCE, cannot be secured | System.Text.Json, DataContractSerializer | Removed entirely in .NET 9 |
| 8 | .NET | TypeNameHandling != None (JSON.Net) | High -- type confusion RCE | Set TypeNameHandling.None | Default is None; never change for untrusted data |
| 9 | Ruby | Marshal.load() | Critical -- arbitrary object instantiation | JSON.parse(), Oj.safe_load | No safe configuration exists |
| 10 | Node.js | node-serialize / serialize-to-js | Critical -- RCE via IIFE injection | JSON.parse() / JSON.stringify() | Avoid any library that serializes functions |
| 11 | Any | eval() on serialized strings | Critical -- direct code execution | Language-specific safe parsers | Never eval user input in any context |
Decision Tree
START: Does your application deserialize data from untrusted sources?
├── NO → No insecure deserialization risk (verify trust boundaries)
└── YES
    ├── Can you switch to a data-only format (JSON, XML, Protobuf)?
    │   ├── YES → Replace native serialization with JSON/Protobuf (BEST option)
    │   └── NO
    │       ├── Java? → Override resolveClass() with allowlist + use ObjectInputFilter (Java 9+)
    │       ├── Python? → Replace pickle with json.loads(); use yaml.safe_load() for YAML
    │       ├── .NET? → Migrate to System.Text.Json; BinaryFormatter removed in .NET 9
    │       ├── PHP? → Replace unserialize() with json_decode()
    │       ├── Ruby? → Replace Marshal.load with JSON.parse; use Oj.safe_load
    │       └── Node.js? → Use JSON.parse/JSON.stringify only; remove node-serialize
    └── Add defense-in-depth: HMAC signing, integrity checks, monitoring
Step-by-Step Guide
1. Audit your codebase for deserialization sinks
Identify every location where untrusted data is deserialized. Use static analysis or grep for known dangerous patterns. [src2]
# Java: find ObjectInputStream usage
grep -rn 'ObjectInputStream\|readObject\|XMLDecoder\|XStream' --include="*.java" .
# Python: find pickle and unsafe YAML
grep -rn 'pickle\.load\|pickle\.loads\|yaml\.load\|jsonpickle' --include="*.py" .
# PHP: find unserialize calls
grep -rn 'unserialize(' --include="*.php" .
# .NET: find BinaryFormatter and unsafe TypeNameHandling
grep -rn 'BinaryFormatter\|TypeNameHandling' --include="*.cs" .
# Node.js: find node-serialize and eval-based deserialization
grep -rn "node-serialize\|serialize-to-js\|eval(" --include="*.js" .
Verify: Count all findings -- each one is a potential RCE vulnerability.
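grep also matches comments and string literals, which inflates the count. For Python sources, a syntax-aware pass can reduce that noise. The following is a minimal sketch using the standard ast module; the DANGEROUS_CALLS set and the function name find_deserialization_sinks are illustrative choices, not part of any tool, and the matcher only recognizes the plain module.func(...) call form.

```python
import ast

# Illustrative sink list for this sketch; extend it for your codebase.
DANGEROUS_CALLS = {
    ("pickle", "load"),
    ("pickle", "loads"),
    ("yaml", "load"),
    ("yaml", "unsafe_load"),
    ("jsonpickle", "decode"),
}


def find_deserialization_sinks(source: str, filename: str = "<string>"):
    """Return sorted (line, 'module.func') pairs for calls in DANGEROUS_CALLS."""
    findings = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Only the simple `module.func(...)` form is matched. Aliased imports
        # such as `from pickle import loads` are NOT detected by this sketch.
        if (
            isinstance(func, ast.Attribute)
            and isinstance(func.value, ast.Name)
            and (func.value.id, func.attr) in DANGEROUS_CALLS
        ):
            findings.append((node.lineno, f"{func.value.id}.{func.attr}"))
    return sorted(findings)


sample = (
    "import pickle, json\n"
    "obj = pickle.loads(blob)\n"
    "# pickle.loads(blob) mentioned in a comment is ignored\n"
    "cfg = json.loads(text)\n"
)
print(find_deserialization_sinks(sample))
```

Because aliased imports slip past it, treat this as a complement to the grep commands rather than a replacement.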
2. Replace native serialization with JSON
Switch from native object serialization to data-only formats. JSON cannot represent executable code, eliminating the entire attack class. [src1]
# Python: BEFORE (vulnerable)
import pickle
data = pickle.loads(user_input) # RCE vulnerability
# Python: AFTER (safe)
import json
data = json.loads(user_input) # Data only, no code execution
Verify: Run audit commands again -- dangerous patterns should be eliminated.
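Code that pickled whole objects usually needs a small, explicit mapping once it moves to JSON. One way this could look is sketched below, assuming a hypothetical UserPrefs record with two fields; the class, field names, and helper names are invented for illustration. The parsed JSON is checked field by field and the object is built by ordinary constructor call, so the input never chooses which type gets instantiated.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class UserPrefs:
    """Hypothetical DTO standing in for a previously pickled object."""
    theme: str
    page_size: int


def prefs_to_json(prefs: UserPrefs) -> str:
    return json.dumps({"theme": prefs.theme, "page_size": prefs.page_size})


def prefs_from_json(raw: str) -> UserPrefs:
    data = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on bad input
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    if set(data) != {"theme", "page_size"}:
        raise ValueError("unexpected or missing fields")
    theme, page_size = data["theme"], data["page_size"]
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if not isinstance(theme, str):
        raise ValueError("theme must be a string")
    if isinstance(page_size, bool) or not isinstance(page_size, int):
        raise ValueError("page_size must be an integer")
    return UserPrefs(theme=theme, page_size=page_size)


restored = prefs_from_json(prefs_to_json(UserPrefs(theme="dark", page_size=25)))
print(restored)
```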
3. Apply allowlist filtering when native deserialization is unavoidable
Restrict which classes can be deserialized using built-in filtering mechanisms. [src1]
// Java 9+: ObjectInputFilter (built-in)
ObjectInputFilter filter = ObjectInputFilter.Config.createFilter(
    "com.myapp.dto.*;!*" // Allow only com.myapp.dto, deny all else
);
ObjectInputStream ois = new ObjectInputStream(inputStream);
ois.setObjectInputFilter(filter);
Verify: Attempt to deserialize a non-allowlisted class -- should throw InvalidClassException.
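Python's closest counterpart to a class filter is overriding Unpickler.find_class, the hook the pickle module documents for restricting globals. The sketch below assumes a deliberately tiny allowlist of exact (module, name) pairs; ALLOWED_GLOBALS, AllowlistUnpickler, and restricted_loads are names chosen for this example. As this guide states elsewhere, pickle has no safe mode, so read this only as a defense-in-depth layer for data that cannot yet be migrated, not as a way to accept untrusted pickles.

```python
import io
import pickle
from collections import OrderedDict

# Assumption for this sketch: exact (module, name) pairs, never whole modules.
ALLOWED_GLOBALS = {("collections", "OrderedDict")}


class AllowlistUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the pickle stream asks to import a global.
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")


def restricted_loads(data: bytes):
    """Like pickle.loads(), but rejects any global outside ALLOWED_GLOBALS."""
    return AllowlistUnpickler(io.BytesIO(data)).load()


class Payload:
    """Hypothetical object whose pickle asks the loader to call int('42')."""

    def __reduce__(self):
        return (int, ("42",))


allowed = restricted_loads(pickle.dumps(OrderedDict(a=1)))

try:
    restricted_loads(pickle.dumps(Payload()))
    blocked = False
except pickle.UnpicklingError:
    blocked = True

print(allowed, blocked)
```

Keeping the allowlist to exact pairs matters: permitting a whole module such as builtins would re-admit callables like eval.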
4. Sign serialized data with HMAC
When you must serialize data that will be stored or transmitted, sign it to detect tampering. [src3]
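import hashlib
import hmac
import json

# Placeholder only: load the real key from the environment or a secret store
SECRET_KEY = b'your-secret-key-from-env'

def serialize_signed(data: dict) -> str:
    # sort_keys gives a stable byte representation to sign
    payload = json.dumps(data, sort_keys=True)
    sig = hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}|{sig}"

def deserialize_verified(signed_data: str) -> dict:
    # The signature is a hex digest, so the last '|' always separates it
    payload, sig = signed_data.rsplit('|', 1)
    expected = hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()
    # compare_digest avoids leaking the match position through timing
    if not hmac.compare_digest(sig, expected):
        raise ValueError("Tampered data detected")
    return json.loads(payload)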
Verify: Modify any byte in the signed payload -- should raise ValueError.
5. Monitor and log deserialization events
Add observability to detect exploitation attempts in production. [src2]
ObjectInputFilter loggingFilter = filterInfo -> {
    Class<?> clazz = filterInfo.serialClass();
    if (clazz != null) {
        logger.info("Deserialization attempt: {}", clazz.getName());
        if (!ALLOWED_CLASSES.contains(clazz.getName())) {
            logger.warn("BLOCKED: {}", clazz.getName());
            return ObjectInputFilter.Status.REJECTED;
        }
    }
    return ObjectInputFilter.Status.ALLOWED;
};
Verify: Check logs for BLOCKED entries after deploying the filter.
Code Examples
Python: Safe Data Exchange with JSON
# Input: Untrusted user data (API request body, file upload)
# Output: Validated Python dict
import json

def deserialize_user_data(raw: str) -> dict:
    try:
        data = json.loads(raw)  # Safe -- data only
    except json.JSONDecodeError as exc:
        raise ValueError("Invalid JSON input") from exc
    if not isinstance(data, dict):
        raise ValueError("Expected JSON object")
    return data
# UNSAFE -- NEVER use with untrusted data:
# pickle.loads(raw) -- arbitrary code execution
# yaml.load(raw) -- code exec via !!python/object
# jsonpickle.decode(raw) -- code exec via py/object
Java: Jackson with Strict Type Binding
// Input: Untrusted JSON string from HTTP request
// Output: Validated DTO object
import com.fasterxml.jackson.databind.ObjectMapper; // 2.15+
import com.fasterxml.jackson.databind.DeserializationFeature;
public class SafeDeserializer {
    private static final ObjectMapper mapper = new ObjectMapper()
        .disable(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES);

    // CRITICAL: never enable default typing
    // mapper.enableDefaultTyping() <-- NEVER DO THIS

    public static <T> T deserialize(String json, Class<T> type) {
        try {
            return mapper.readValue(json, type);
        } catch (Exception e) {
            throw new IllegalArgumentException("Invalid input", e);
        }
    }
}
Node.js: Safe JSON Parsing with Prototype Pollution Protection
// Input: Untrusted string from HTTP request body
// Output: Validated JavaScript object
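function safeDeserialize(raw) {
  // Keys that can lead to prototype pollution when the result is merged or copied
  const FORBIDDEN_KEYS = new Set(['__proto__', 'constructor', 'prototype']);
  try {
    // The reviver runs for every key at every nesting level;
    // returning undefined removes that property from the result.
    return JSON.parse(raw, (key, value) =>
      FORBIDDEN_KEYS.has(key) ? undefined : value);
  } catch (e) {
    throw new Error('Invalid JSON input');
  }
}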
// UNSAFE -- NEVER use with untrusted data:
// require('node-serialize').unserialize(raw) -- IIFE RCE
// eval('(' + raw + ')') -- direct RCE
Anti-Patterns
Wrong: Using pickle to deserialize user-uploaded files
# BAD -- pickle executes arbitrary code during deserialization
import pickle
def load_user_config(uploaded_file):
    return pickle.load(uploaded_file)
# Attacker crafts payload with __reduce__ running os.system("rm -rf /")
Correct: Use JSON for user data exchange
# GOOD -- JSON is data-only, no code execution possible
import json
def load_user_config(uploaded_file):
    return json.load(uploaded_file)
Wrong: Java ObjectInputStream without class filtering
// BAD -- any class on the classpath can be instantiated
ObjectInputStream ois = new ObjectInputStream(request.getInputStream());
Command cmd = (Command) ois.readObject();
// ysoserial gadget chains execute before the cast is checked
Correct: Java ObjectInputFilter with strict allowlist
// GOOD -- only explicitly allowed classes can deserialize
ObjectInputStream ois = new ObjectInputStream(request.getInputStream());
ois.setObjectInputFilter(ObjectInputFilter.Config.createFilter(
"com.myapp.dto.Command;!*"
));
Command cmd = (Command) ois.readObject();
Wrong: PHP unserialize on cookie data
// BAD -- user controls cookie content, enabling object injection
$preferences = unserialize($_COOKIE['prefs']);
// Attacker sets cookie to exploit __wakeup() or __destruct()
Correct: PHP json_decode for cookie data
// GOOD -- JSON cannot instantiate PHP objects
$preferences = json_decode($_COOKIE['prefs'], true);
if ($preferences === null && json_last_error() !== JSON_ERROR_NONE) {
    $preferences = [];
}
Wrong: .NET BinaryFormatter for session state
// BAD -- BinaryFormatter is inherently unsafe, removed in .NET 9
BinaryFormatter formatter = new BinaryFormatter();
object session = formatter.Deserialize(stream);
// Microsoft: "equivalent to launching an executable"
Correct: .NET System.Text.Json for data exchange
// GOOD -- System.Text.Json deserializes to known types only
using System.Text.Json;
var session = JsonSerializer.Deserialize<SessionData>(stream);
Wrong: Ruby Marshal.load on user input
# BAD -- Marshal can instantiate any Ruby class
data = Marshal.load(Base64.decode64(params[:data]))
# Universal RCE gadget chain exists for Ruby 2.x+
Correct: Ruby JSON.parse for user data
# GOOD -- JSON produces only primitive types
require 'json'
data = JSON.parse(params[:data])
Common Pitfalls
- Trusting internal services blindly: Deserialization between microservices is still dangerous if any service is compromised. Fix: Use JSON/Protobuf even for internal communication; apply zero-trust principles. [src3]
- Denylist-based class filtering: Blocking known gadget classes (e.g., InvokerTransformer) fails when new chains are discovered. Fix: Use strict allowlisting -- permit only the exact classes you need. [src4]
- Signing without encryption: HMAC prevents tampering but if the serialized format is native (pickle, Java serialization), a compromised key means full RCE. Fix: Sign JSON data, not native serialized objects. [src1]
- YAML safe_load misconception: yaml.safe_load() is safe, but yaml.load(data, Loader=yaml.Loader) is not. Fix: Always use yaml.safe_load() or yaml.CSafeLoader. [src1]
- JSON.Net TypeNameHandling: Setting TypeNameHandling to All, Auto, Objects, or Arrays enables type confusion attacks. Fix: Keep TypeNameHandling.None (the default). [src5]
- Django PickleSerializer for sessions: Deprecated in Django 4.1; allows RCE if session data is attacker-controlled. Fix: Use JSONSerializer (default since Django 1.6). [src1]
- Node.js eval-based parsing: Using eval('(' + data + ')') to parse "JSON-like" strings enables arbitrary code execution. Fix: Always use JSON.parse(). [src6]
- Deserializing ML model files (pickle): Loading .pkl model files from untrusted sources is a major supply chain risk. Fix: Use ONNX, SafeTensors, or PMML formats instead. [src2]
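For the Django pitfall in this list, the serializer is selected by the SESSION_SERIALIZER setting. A settings fragment, assuming a conventional settings.py, might pin it explicitly:

```python
# settings.py (fragment)
# JSON-based session serializer. This is already Django's default, so the
# line mainly guards against an older PickleSerializer override lingering
# in the project's settings.
SESSION_SERIALIZER = "django.contrib.sessions.serializers.JSONSerializer"
```

Session values that are not JSON-serializable (for example datetime objects) need converting to strings or numbers before they are stored.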
Diagnostic Commands
# Scan Java project for deserialization sinks
grep -rn 'ObjectInputStream\|readObject\|XMLDecoder\|XStream' --include="*.java" .
# Scan Python project for pickle and unsafe YAML
grep -rn 'pickle\.\|yaml\.load\|yaml\.unsafe_load\|jsonpickle' --include="*.py" .
# Scan PHP project for unserialize
grep -rn 'unserialize(' --include="*.php" .
# Scan .NET project for BinaryFormatter
grep -rn 'BinaryFormatter\|TypeNameHandling' --include="*.cs" .
# Scan Node.js project for dangerous patterns
grep -rn "node-serialize\|serialize-to-js\|eval(" --include="*.js" .
# Check Java dependencies for known gadget libraries
mvn dependency:tree | grep -E 'commons-collections|spring-beans|groovy'
# Run npm audit for deserialization vulnerabilities
npm audit 2>/dev/null | grep -i "deserializ\|prototype pollution"
# Test with ysoserial (Java -- penetration testing only)
java -jar ysoserial.jar CommonsCollections1 'id' | base64
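For Python, a pickle file can be inspected without loading it. The sketch below walks the opcode stream with the standard pickletools module and lists the globals the pickle would import. The function name list_pickle_globals is made up for this example, and the STACK_GLOBAL handling is a heuristic that assumes the module and name strings were pushed directly before the opcode, so a crafted pickle that fetches them from the memo can evade it. Treat a clean result as a hint, not as proof of safety.

```python
import pickle
import pickletools

# Opcodes that push a text string onto the pickle VM stack.
STRING_OPCODES = {"UNICODE", "BINUNICODE", "SHORT_BINUNICODE", "BINUNICODE8"}


def list_pickle_globals(data: bytes):
    """Statically list (module, name) imports in a pickle; never unpickles it."""
    found = []
    recent_strings = []
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name in STRING_OPCODES:
            recent_strings.append(arg)
        elif opcode.name == "GLOBAL":
            # Older protocols carry "module name" as a single argument.
            module, name = arg.split(" ", 1)
            found.append((module, name))
        elif opcode.name == "STACK_GLOBAL" and len(recent_strings) >= 2:
            # Heuristic: protocol 4+ takes module and name from the stack.
            found.append((recent_strings[-2], recent_strings[-1]))
    return found


class Suspicious:
    """Hypothetical object whose pickle asks the loader to call int('42')."""

    def __reduce__(self):
        return (int, ("42",))


print(list_pickle_globals(pickle.dumps(Suspicious())))
print(list_pickle_globals(pickle.dumps({"weights": [0.1, 0.2]})))
```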
Version History & Compatibility
| Standard/Tool | Version | Status | Key Feature |
|---|---|---|---|
| OWASP Top 10 | 2021 | Current | A08: Software and Data Integrity Failures |
| OWASP Top 10 | 2017 | Previous | A8: Insecure Deserialization (dedicated category) |
| CWE-502 | 4.19 | Current | Deserialization of Untrusted Data |
| .NET BinaryFormatter | .NET 9 | Removed | Throws PlatformNotSupportedException |
| .NET BinaryFormatter | .NET 7-8 | Obsolete | Warning on use; opt-in still available |
| Java ObjectInputFilter | Java 9+ | Current | Built-in class allowlist/denylist |
| Java Module System | Java 17+ | Current | Restricts reflection-based gadget chains |
| PyYAML | 6.0+ | Current | yaml.load() requires an explicit Loader argument; use yaml.safe_load() |
| XStream | 1.4.17+ | Current | Built-in allowlist via allowTypes() |
| Jackson | 2.10+ | Current | Default typing is off unless enabled; activateDefaultTyping() requires a PolymorphicTypeValidator |
When to Use / When Not to Use
| Use When | Don't Use When | Use Instead |
|---|---|---|
| Application accepts serialized objects from users (cookies, forms, APIs) | All data exchange uses JSON/XML/Protobuf exclusively | Standard input validation |
| Legacy system uses Java ObjectInputStream for RPC | Building a new service with modern frameworks | gRPC + Protobuf, REST + JSON |
| Session state stored in native serialization format | Sessions use signed JWTs or encrypted cookies | JWT/session library with JSON serializer |
| ML models are loaded from untrusted sources (pickle files) | Models from your own pipeline with integrity verification | SafeTensors, ONNX format |
Important Caveats
- Java ObjectInputFilter (Java 9+) is defense-in-depth, not a complete solution -- novel gadget chains can bypass filters if the allowlist is too broad
- Python pickle has NO safe mode -- RestrictedUnpickler is trivially bypassable and should not be relied upon for security
- .NET BinaryFormatter was removed in .NET 9 (Nov 2024) -- applications targeting .NET 9+ will get PlatformNotSupportedException at runtime
- HMAC signing protects integrity but not confidentiality -- sensitive data in serialized payloads should also be encrypted
- Data-only formats are not immune to all attacks -- prototype pollution (JSON in JavaScript) and billion laughs entity expansion (XML) require separate defenses
- AI/ML model files in pickle format (.pkl, .pt) are a growing supply chain attack vector -- CVE-2024-37052 demonstrated RCE via malicious pickled model files
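Data-only parsers still consume memory and stack in proportion to what they are given, so size and nesting limits are a common complement to the caveats above. A minimal sketch, assuming illustrative limits of 64 KiB and 20 levels; MAX_BYTES, MAX_DEPTH, and parse_bounded_json are names chosen for this example:

```python
import json

MAX_BYTES = 64 * 1024  # illustrative limit, tune per endpoint
MAX_DEPTH = 20         # illustrative limit


def parse_bounded_json(raw: str, max_bytes: int = MAX_BYTES, max_depth: int = MAX_DEPTH):
    """Parse JSON, rejecting oversized or excessively nested payloads."""
    if len(raw.encode("utf-8")) > max_bytes:
        raise ValueError("payload too large")
    try:
        data = json.loads(raw)
    except RecursionError:
        # Extremely deep nesting can exhaust the interpreter's recursion limit.
        raise ValueError("payload nested too deeply") from None
    except json.JSONDecodeError:
        raise ValueError("invalid JSON") from None

    # Iterative walk, so the depth check itself cannot overflow the stack.
    stack = [(data, 1)]
    while stack:
        value, depth = stack.pop()
        if isinstance(value, (dict, list)):
            if depth > max_depth:
                raise ValueError("payload nested too deeply")
            children = value.values() if isinstance(value, dict) else value
            stack.extend((child, depth + 1) for child in children)
    return data


print(parse_bounded_json('{"a": [1, 2, {"b": null}]}'))
```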