z.string().email().max(254) (Zod/TS), EmailStr + Field(max_length=254) (Pydantic/Python), @Email @Size(max=254) (Jakarta/Java), validate:"required,email,max=254" (Go validator).

| # | Validation Pattern | Description | When to Use | Risk if Skipped |
|---|---|---|---|---|
| 1 | Allowlist (accept known good) | Define exactly what is valid; reject everything else | All discrete inputs (enums, choices, known formats) | Injection attacks bypass incomplete checks |
| 2 | Type coercion | Convert input to expected type (int, float, bool, date) | Numeric fields, dates, booleans | Type confusion, integer overflow, NaN propagation |
| 3 | Length limits | Enforce min/max length on strings, arrays, file sizes | All string and collection inputs | Buffer overflow, DoS via oversized payloads, ReDoS |
| 4 | Range checks | Validate numeric values fall within expected bounds | Prices, quantities, ages, coordinates | Negative quantities, overflow, business logic abuse |
| 5 | Regex patterns | Match input against format patterns (anchored: ^...$) | Emails, phone numbers, postal codes, IDs | Malformed data, injection via unexpected characters |
| 6 | Encoding/normalization | Canonicalize Unicode, decode URL-encoding before validation | All text inputs, especially multi-byte | Double-encoding bypasses, homoglyph attacks |
| 7 | Schema validation | Validate entire request structure (JSON Schema, Zod, Pydantic) | API payloads, complex nested objects | Missing fields, extra fields, type mismatches |
| 8 | Semantic validation | Cross-field consistency (start < end, total = sum of parts) | Business logic, date ranges, financial data | Logic bugs, data corruption, fraud |
| 9 | Sanitization | Strip or encode dangerous characters AFTER validation | Rich text, HTML inputs (as defense-in-depth) | XSS, injection if validation alone is insufficient |
| 10 | File validation | Check MIME type, magic bytes, size, extension, filename | File uploads | Arbitrary code execution, path traversal |

| Language | Primary Library | Schema Example | Key Feature |
|---|---|---|---|
| Python | Pydantic v2 | class User(BaseModel): email: EmailStr | Type-safe, fast Rust core, auto-coercion |
| TypeScript | Zod 3.x | z.object({ email: z.string().email() }) | Static type inference from schema |
| Node.js | Joi 17.x | Joi.object({ email: Joi.string().email() }) | Fluent API, detailed error messages |
| Java | Jakarta Validation 3.1 | @Email @NotBlank String email; | Annotation-driven, framework-integrated |
| Go | validator v10 | Email string `validate:"required,email"` | Struct tag-based, Gin integration |
| C# | FluentValidation | RuleFor(x => x.Email).NotEmpty().EmailAddress() | LINQ-like, testable rules |

START: What kind of input are you validating?
├── Discrete/enumerated value (country, status, category)?
│   ├── YES → Allowlist: check against exact set of valid values
│   └── NO ↓
├── Structured data type (email, URL, phone, date, UUID)?
│   ├── YES → Use library-provided validators + length limits + semantic checks
│   └── NO ↓
├── Numeric value (price, quantity, age)?
│   ├── YES → Type coerce to number + range check (min/max) + reject NaN/Infinity
│   └── NO ↓
├── Free-form text (name, comment, description)?
│   ├── YES → Length limit + Unicode normalization + encoding on OUTPUT
│   └── NO ↓
├── File upload?
│   ├── YES → Validate extension + MIME type + magic bytes + size limit + rename
│   └── NO ↓
├── Complex nested object (API payload)?
│   ├── YES → Schema validation (Pydantic, Zod, JSON Schema) + semantic checks
│   └── NO ↓
└── DEFAULT → Type coerce + length limit + allowlist characters + server-side only

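
The file-upload branch of the tree above could look roughly like the following. This is a minimal sketch using only the standard library: the `ALLOWED_TYPES` table, the 5 MiB limit, and the `validate_upload` helper are assumptions made for illustration, and the client-declared MIME type is left out because it is attacker-controlled and would be checked separately.

```python
import os
import uuid

# Hypothetical policy: allowed extensions mapped to their leading magic bytes.
ALLOWED_TYPES = {
    ".png": b"\x89PNG\r\n\x1a\n",
    ".jpg": b"\xff\xd8\xff",
    ".pdf": b"%PDF-",
}
MAX_UPLOAD_BYTES = 5 * 1024 * 1024  # assumed limit, tune per application


def validate_upload(filename: str, content: bytes) -> str:
    """Return a freshly generated server-side filename, or raise ValueError."""
    # Drop client-supplied directory components (path traversal).
    base = os.path.basename(filename.replace("\\", "/"))
    ext = os.path.splitext(base)[1].lower()
    if ext not in ALLOWED_TYPES:
        raise ValueError(f"extension {ext!r} is not allowed")
    if not 0 < len(content) <= MAX_UPLOAD_BYTES:
        raise ValueError("file is empty or exceeds the size limit")
    if not content.startswith(ALLOWED_TYPES[ext]):
        raise ValueError("file content does not match its extension")
    # Never reuse the client's name on disk; generate a new one.
    return f"{uuid.uuid4().hex}{ext}"
```

Generating the stored name on the server means the original filename never reaches the filesystem, so traversal sequences or double extensions in it have no effect.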
Define validation schemas at the point where data enters your application (API endpoints, form handlers, CLI parsers). Never validate deep inside business logic. [src1]
# Python: Pydantic v2 -- define schema at API boundary
# EmailStr requires the optional email-validator package
from pydantic import BaseModel, EmailStr, Field

class CreateUserRequest(BaseModel):
    email: EmailStr
    name: str = Field(min_length=1, max_length=100)
    age: int = Field(ge=13, le=150)
    role: str = Field(pattern=r'^(admin|user|viewer)$')
Verify: CreateUserRequest(email="bad", name="", age=5, role="hacker") raises ValidationError with specific field errors.
For any input that should be one of a known set of values, validate against an explicit allowlist. Never use blocklists for enumerated data. [src1]
// TypeScript: Zod -- allowlist via enum
import { z } from 'zod'; // ^3.22.0

const RoleSchema = z.enum(['admin', 'user', 'viewer']);

const CreateUserSchema = z.object({
  email: z.string().email().max(254),
  name: z.string().trim().min(1).max(100), // trim first so whitespace-only names fail min(1)
  role: RoleSchema,
}).strict(); // Reject unknown keys
Verify: CreateUserSchema.safeParse({ role: 'superadmin' }) returns { success: false }.
Convert inputs to expected types early. Reject values that cannot be cleanly coerced. [src2]
// Go: validator v10 -- struct tag validation
type CreateUserRequest struct {
    Email string `json:"email" validate:"required,email,max=254"`
    Name  string `json:"name" validate:"required,min=1,max=100"`
    Age   int    `json:"age" validate:"required,gte=13,lte=150"`
    Role  string `json:"role" validate:"required,oneof=admin user viewer"`
}
Verify: validate.Struct(req) with invalid fields returns validator.ValidationErrors.
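
To make "reject values that cannot be cleanly coerced" concrete, here is a small standard-library sketch. The helper names `coerce_int` and `coerce_float` are hypothetical, and the policy shown (no booleans, ASCII digits only, no NaN or Infinity) is one reasonable choice rather than the only one.

```python
import math


def coerce_int(raw: object, *, minimum: int, maximum: int) -> int:
    """Return raw as an int within [minimum, maximum], or raise ValueError."""
    if isinstance(raw, bool):
        # bool is a subclass of int in Python, so reject it explicitly.
        raise ValueError("boolean is not an integer")
    if isinstance(raw, int):
        value = raw
    elif isinstance(raw, str):
        text = raw.strip()
        digits = text[1:] if text[:1] in ("+", "-") else text
        # Accept only ASCII digits, so "12abc", "1.5" and "" are rejected.
        if not (digits.isascii() and digits.isdigit()):
            raise ValueError(f"{raw!r} is not an integer")
        value = int(text)
    else:
        raise ValueError(f"cannot coerce {type(raw).__name__} to int")
    if not minimum <= value <= maximum:
        raise ValueError(f"{value} is outside [{minimum}, {maximum}]")
    return value


def coerce_float(raw: object, *, minimum: float, maximum: float) -> float:
    """Return raw as a finite float within [minimum, maximum], or raise ValueError."""
    if isinstance(raw, bool) or not isinstance(raw, (int, float, str)):
        raise ValueError(f"cannot coerce {type(raw).__name__} to float")
    try:
        value = float(raw)
    except (ValueError, OverflowError):
        raise ValueError(f"{raw!r} is not a number") from None
    if not math.isfinite(value):
        raise ValueError("NaN and Infinity are not accepted")
    if not minimum <= value <= maximum:
        raise ValueError(f"{value} is outside [{minimum}, {maximum}]")
    return value
```

The explicit `math.isfinite` check matters because `float("nan")` and `float("inf")` both parse successfully, and NaN fails every comparison, so a range check alone would reject it only by accident.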
After syntactic validation passes, check business rules: cross-field consistency, temporal logic, and domain constraints. [src7]
# Pydantic -- semantic (cross-field) validation
from datetime import date

from pydantic import BaseModel, model_validator

class BookingRequest(BaseModel):
    check_in: date
    check_out: date

    @model_validator(mode='after')
    def validate_dates(self):
        if self.check_out <= self.check_in:
            raise ValueError('check_out must be after check_in')
        return self
Verify: Reversed dates raise ValidationError.
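
The same idea applies to the "total = sum of parts" rule. The sketch below uses plain dataclasses so it runs without any third-party package; `LineItem` and `Invoice` are hypothetical types invented for this example, and `Decimal` is used on the assumption that amounts are monetary.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class LineItem:
    unit_price: Decimal
    quantity: int


@dataclass(frozen=True)
class Invoice:
    items: tuple[LineItem, ...]
    total: Decimal

    def __post_init__(self) -> None:
        # Cross-field rule: the declared total must equal the sum of the parts.
        expected = sum(
            (item.unit_price * item.quantity for item in self.items),
            Decimal("0"),
        )
        if self.total != expected:
            raise ValueError(
                f"total {self.total} does not match sum of items {expected}"
            )
```

Recomputing the total on the server, instead of trusting the submitted figure, is what closes the gap that price-tampering attacks rely on.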
Decode and normalize input before applying validation rules to prevent double-encoding attacks. [src2]
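import unicodedata

def canonicalize(value: str) -> str:
    # Compose decomposed characters into their canonical (NFC) form
    normalized = unicodedata.normalize('NFC', value)
    # Drop control characters, keeping newline and tab
    cleaned = ''.join(
        c for c in normalized
        if unicodedata.category(c) != 'Cc' or c in ('\n', '\t')
    )
    return cleaned.strip()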
Verify: Control characters like \x00 are stripped; decomposed Unicode is composed.
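
The decoding half of this step can be sketched as follows. This is one possible policy, not a standard: `fully_decode`, `validate_slug`, and the limit of three decode rounds are assumptions made for illustration. The point is that validation runs on the fully decoded value, so `..%252f` is seen as `../`.

```python
import re
from urllib.parse import unquote

MAX_DECODE_ROUNDS = 3  # assumed limit on nested percent-encoding
SLUG_RE = re.compile(r"[a-z0-9-]{1,64}")  # hypothetical allowlist pattern


def fully_decode(value: str) -> str:
    """Percent-decode until the value stops changing, within a bounded number of rounds."""
    current = value
    for _ in range(MAX_DECODE_ROUNDS):
        decoded = unquote(current)
        if decoded == current:
            return current
        current = decoded
    raise ValueError("input is percent-encoded too many times")


def validate_slug(raw: str) -> str:
    """Decode first, then apply the allowlist to the canonical form."""
    decoded = fully_decode(raw)
    if SLUG_RE.fullmatch(decoded) is None:
        raise ValueError(f"{decoded!r} is not a valid slug")
    return decoded
```

Bounding the number of rounds keeps a deliberately nested payload from turning the decoder itself into a denial-of-service vector.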
# Input: Raw JSON request body from HTTP POST
# Output: Validated, typed Python object or ValidationError
from enum import Enum

from pydantic import BaseModel, ConfigDict, EmailStr, Field

class UserRole(str, Enum):
    admin = "admin"
    user = "user"
    viewer = "viewer"

class CreateUserRequest(BaseModel):
    model_config = ConfigDict(str_strip_whitespace=True)

    email: EmailStr
    name: str = Field(min_length=1, max_length=100)
    age: int = Field(ge=13, le=150)
    role: UserRole = UserRole.user
    bio: str | None = Field(default=None, max_length=500)
// Input: Unknown data from request body or form submission
// Output: Typed object or ZodError with field-level details
import { z } from 'zod'; // ^3.22.0

const CreateUserSchema = z.object({
  email: z.string().email().max(254).toLowerCase(),
  name: z.string().trim().min(1).max(100), // trim first so whitespace-only names fail min(1)
  age: z.coerce.number().int().min(13).max(150),
  role: z.enum(['admin', 'user', 'viewer']).default('user'),
}).strict();

type CreateUser = z.infer<typeof CreateUserSchema>;
// Input: Request DTO from Spring MVC / JAX-RS
// Output: Validated bean or ConstraintViolationException
import jakarta.validation.constraints.*;

public record CreateUserRequest(
    @NotBlank @Email @Size(max = 254)
    String email,

    @NotBlank @Size(min = 1, max = 100)
    @Pattern(regexp = "^[\\p{L} '-]+$")
    String name,

    @NotNull @Min(13) @Max(150)
    Integer age,

    @NotNull @Pattern(regexp = "^(admin|user|viewer)$")
    String role
) {}
// Input: JSON-decoded struct from HTTP request
// Output: nil (valid) or validator.ValidationErrors
import (
    "strings"

    "github.com/go-playground/validator/v10"
)

type CreateUserRequest struct {
    Email string `json:"email" validate:"required,email,max=254"`
    Name  string `json:"name" validate:"required,min=1,max=100"`
    Age   int    `json:"age" validate:"required,gte=13,lte=150"`
    Role  string `json:"role" validate:"required,oneof=admin user viewer"`
    Bio   string `json:"bio" validate:"omitempty,max=500"`
}

var validate = validator.New()

func ValidateUser(req *CreateUserRequest) error {
    // Normalize first so the rules see the canonical form
    req.Email = strings.TrimSpace(strings.ToLower(req.Email))
    req.Name = strings.TrimSpace(req.Name)
    return validate.Struct(req)
}
# BAD -- blocklist filtering is trivially bypassed
def validate_input(value):
    dangerous = ['<script>', 'DROP TABLE', 'eval(', '../']
    for d in dangerous:
        if d.lower() in value.lower():
            raise ValueError('Dangerous input detected')
    return value
# Bypassed by: <img src=x onerror=alert(1)>, DROP/**/TABLE, ..%2f
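
A quick way to see the weakness is to run a few payloads through a substring blocklist of this kind. The sketch below re-declares the check under the hypothetical name `blocklist_filter` so it is self-contained; the payload list is illustrative, not exhaustive.

```python
def blocklist_filter(value: str) -> str:
    """Substring blocklist, reproduced here only to demonstrate its gaps."""
    dangerous = ['<script>', 'DROP TABLE', 'eval(', '../']
    for d in dangerous:
        if d.lower() in value.lower():
            raise ValueError('Dangerous input detected')
    return value


def slips_through(value: str) -> bool:
    """Return True when the blocklist accepts the value."""
    try:
        blocklist_filter(value)
    except ValueError:
        return False
    return True


# Each payload avoids every blocked substring but stays dangerous downstream.
payloads = [
    '<img src=x onerror=alert(1)>',  # XSS without a <script> tag
    'DROP/**/TABLE users',           # SQL comment in place of the space
    '..%2f..%2fetc/passwd',          # traversal hidden by URL-encoding
]
bypasses = [p for p in payloads if slips_through(p)]
print(f"{len(bypasses)} of {len(payloads)} payloads passed the blocklist")
```

Every payload passes because the filter only knows the exact strings it was given, while the set of dangerous spellings is open-ended.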
# GOOD -- define what is valid, reject everything else
from enum import Enum

from pydantic import BaseModel, Field

class Status(str, Enum):
    active = "active"
    inactive = "inactive"

class UpdateRequest(BaseModel):
    status: Status
    count: int = Field(ge=0, le=1000)
    name: str = Field(max_length=100)
// BAD -- client-side validation provides zero security
function submitForm() {
  const email = document.getElementById('email').value;
  if (!email.includes('@')) { alert('Invalid email'); return; }
  // Attacker bypasses with: curl -X POST -d 'email=<script>alert(1)</script>'
  fetch('/api/users', { method: 'POST', body: JSON.stringify({ email }) });
}
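// GOOD -- server-side validation is authoritative
// Assumes an Express app with the express.json() body parser registered
import { z } from 'zod';

const EmailSchema = z.string().email().max(254);

app.post('/api/users', (req, res) => {
  const result = EmailSchema.safeParse(req.body.email);
  if (!result.success) {
    return res.status(400).json({ error: result.error.flatten() });
  }
  // result.data holds the validated email; use it from here on
  return res.status(201).json({ email: result.data });
});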
# BAD -- regex without length limit enables ReDoS
import re

email_re = re.compile(r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$')

def validate_email(email):
    # No length check! A 10MB string is matched in full, and more
    # complex patterns can backtrack catastrophically on such input
    if email_re.match(email):
        return True
    return False
# GOOD -- check length BEFORE applying regex
import re

EMAIL_RE = re.compile(r'[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}')

def validate_email(email: str) -> bool:
    if not email or len(email) > 254:  # RFC 5321 limit
        return False
    if len(email.split('@')[0]) > 64:  # Local part limit
        return False
    # fullmatch instead of match + '$', which would accept a trailing newline
    return EMAIL_RE.fullmatch(email) is not None
# BAD -- deserializing untrusted input without validation
import json
data = json.loads(request.body)
user_id = data['user_id'] # No type check
quantity = data['quantity'] # No range check
db.execute(f"UPDATE orders SET qty={quantity} WHERE user={user_id}")
# GOOD -- validate schema, then use parameterized queries
from pydantic import BaseModel, Field

class UpdateOrderRequest(BaseModel):
    user_id: int = Field(gt=0)
    quantity: int = Field(ge=1, le=9999)

req = UpdateOrderRequest.model_validate_json(request.body)
db.execute("UPDATE orders SET qty=%s WHERE user=%s", (req.quantity, req.user_id))
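
For a version that runs end to end without a web framework, the following sketch pairs a hand-written payload check with a parameterized query against an in-memory SQLite database. The `parse_update` helper, the table layout, and the `user_id` column name are assumptions for illustration; in practice a schema library would replace the manual checks.

```python
import json
import sqlite3


def parse_update(body: str) -> tuple[int, int]:
    """Validate the payload by hand (a stand-in for a schema library)."""
    data = json.loads(body)
    if not isinstance(data, dict) or set(data) != {"user_id", "quantity"}:
        raise ValueError("unexpected payload shape")
    user_id, quantity = data["user_id"], data["quantity"]
    for field in (user_id, quantity):
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(field, bool) or not isinstance(field, int):
            raise ValueError("fields must be integers")
    if user_id <= 0 or not 1 <= quantity <= 9999:
        raise ValueError("value out of range")
    return user_id, quantity


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (user_id INTEGER PRIMARY KEY, qty INTEGER)")
conn.execute("INSERT INTO orders VALUES (1, 1)")

user_id, quantity = parse_update('{"user_id": 1, "quantity": 5}')
# Placeholders keep the values out of the SQL text entirely.
conn.execute("UPDATE orders SET qty = ? WHERE user_id = ?", (quantity, user_id))
```

Validation and parameterization cover different failures: the first rejects a payload such as `"quantity": "1; DROP TABLE orders"` before it is used, and the second ensures that even an unexpected value is treated as data, never as SQL.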
unicodedata.normalize('NFC', input) before validation. [src1]

@validator replaced by @field_validator; class Config replaced by model_config. Fix: Follow the Pydantic v2 migration guide; use bump-pydantic tool. [src3]

.parse() throws exceptions on invalid input, crashing Express if uncaught. Fix: Use .safeParse() in request handlers and check result.success. [src4]

Pydantic strict=True, z.number() without z.coerce). [src3]

max_length on list/array fields. [src2]

# Test Pydantic validation in Python REPL
python -c "
from pydantic import BaseModel, EmailStr, Field
class T(BaseModel):
    email: EmailStr
    age: int = Field(ge=0, le=150)
try: T(email='bad', age=-1)
except Exception as e: print(e)
"
# Test Zod validation in Node.js
node -e "
const {z} = require('zod');
const s = z.object({email: z.string().email(), age: z.number().int().min(0)});
console.log(s.safeParse({email:'bad', age:-1}));
"
# Find unvalidated request body usage (Node.js/Express)
grep -rn 'req\.body\.' --include="*.js" --include="*.ts" . | grep -v 'validate\|schema\|parse'
# Find raw SQL string interpolation (Python)
grep -rn 'f".*SELECT\|f".*INSERT\|f".*UPDATE\|f".*DELETE' --include="*.py" .
# Audit Java controllers for missing @Valid annotation
grep -rn '@RequestBody' --include="*.java" . | grep -v '@Valid'

| Library | Version | Status | Key Change |
|---|---|---|---|
| Pydantic | v2.x | Current | Rust-powered core, @field_validator, 5-50x faster |
| Pydantic | v1.x | EOL (2024) | @validator, class Config -- use bump-pydantic to migrate |
| Zod | 3.x | Current/Stable | z.coerce, .pipe(), .brand(), discriminated unions |
| Joi | 17.x | Current | ESM support, TypeScript types |
| Joi | 16.x | Deprecated | Package was @hapi/joi |
| Jakarta Validation | 3.1 | Current | jakarta.validation.* namespace |
| Jakarta Validation | 2.0 | Legacy | javax.validation.* -- requires namespace migration |
| go-playground/validator | v10 | Current | Custom validators, struct-level validation, dive support |
| FluentValidation (.NET) | 11.x | Current | .NET 8 support, async validators |

| Use When | Don't Use When | Use Instead |
|---|---|---|
| Any user-supplied data enters your application | Processing fully trusted internal system data | Basic type assertions may suffice |
| Building APIs that accept JSON/form payloads | Validating output for display | Output encoding (context-specific escaping) |
| File upload processing | Simple static config file parsing | Config libraries with built-in schema |
| CLI tools accepting user arguments | Data already validated by upstream service in same trust zone | Pass validated types between services |
| Preventing injection attacks at the boundary | Replacing parameterized queries for SQL | Parameterized queries + input validation together |
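
The "output encoding (context-specific escaping)" entry in the table above is the counterpart to input validation. As a rough illustration using only the standard library, the hypothetical helpers `render_comment` and `build_search_url` below escape the same kind of untrusted text differently depending on where it is emitted; `example.com` is a placeholder host.

```python
import html
from urllib.parse import quote


def render_comment(comment: str) -> str:
    """Escape untrusted text for use inside an HTML element body."""
    return f"<p>{html.escape(comment, quote=True)}</p>"


def build_search_url(term: str) -> str:
    """Escape untrusted text for use as a URL query-string value."""
    return f"https://example.com/search?q={quote(term, safe='')}"
```

Storing the comment unmodified and escaping at render time keeps the data reusable across contexts (HTML, URLs, JSON), each of which needs its own encoding.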