Lead Enrichment Pipeline
Purpose
This recipe takes a raw lead list and enriches it through a multi-provider waterfall — adding verified work emails, direct phone numbers, firmographic data, and technology stack information. Output is a fully enriched lead database with per-lead cost tracking.
Prerequisites
- Raw lead list (CSV) — minimum columns: first_name, last_name, company
- Hunter.io API key — hunter.io/api (free: 25 searches/mo)
- Lusha API key (phone) — lusha.com
- Python 3.10+ with requests and pandas
- Clearbit API key (optional) — clearbit.com
Constraints
- Hunter.io free: 25 searches/mo, 50 verifications/mo. Starter $49/mo = 500 searches. [src1]
- Lusha free: 5 credits/mo. Pro $49/mo = 40 credits. [src3]
- Clearbit: no free tier, starts $99/mo. $0.05-0.20/record at scale. [src4]
- Waterfall order: cheapest provider first, then fallback. [src2]
- GDPR: Enriching data on EU residents requires a documented lawful basis, typically legitimate interest.
Tool Selection Decision
| Path | Tools | Cost | Email Hit Rate | Phone Hit Rate |
|---|---|---|---|---|
| A: Email Only | Hunter.io Free | $0 | 50-65% | N/A |
| B: Email + Phone | Hunter + Lusha | $98/mo | 65-80% | 25-40% |
| C: Full Stack | Hunter + Lusha + Clearbit | $197/mo | 75-85% | 30-45% |
| D: Maximum | Clay or Apollo Pro | $149-500/mo | 85-95% | 40-60% |
Execution Flow
Step 1: Prepare Input and Waterfall Config
Duration: 5 min | Tool: Python
Load raw CSV, configure waterfall providers ordered by cost (cheapest first), initialize enrichment log.
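A minimal sketch of this step, assuming the standard-library csv module in place of pandas so it stays dependency-free. The `Provider` dataclass, the function names, and the per-call costs (plan price divided by monthly quota) are illustrative assumptions, not part of any provider's API.

```python
import csv
import io
from dataclasses import dataclass

REQUIRED_COLUMNS = {"first_name", "last_name", "company"}


@dataclass(frozen=True)
class Provider:
    name: str
    fills: str            # lead field this provider is used for
    cost_per_call: float  # assumed USD per lookup (plan price / monthly quota)


def load_leads(csv_file):
    """Read leads from an open CSV file object and check the minimum columns."""
    reader = csv.DictReader(csv_file)
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"CSV is missing required columns: {sorted(missing)}")
    return [dict(row) for row in reader]


def build_waterfall(providers):
    """Return providers ordered cheapest first."""
    return sorted(providers, key=lambda p: p.cost_per_call)


def new_enrichment_log():
    """Empty log that later steps append calls and cost to."""
    return {"calls": [], "total_cost": 0.0}


# Example with an in-memory CSV; in practice pass open("leads.csv", newline="").
sample = io.StringIO("first_name,last_name,company\nAda,Lovelace,Acme\n")
leads = load_leads(sample)
waterfall = build_waterfall([
    Provider("lusha", "phone", 49 / 40),    # $49/mo for 40 credits
    Provider("hunter", "email", 49 / 500),  # $49/mo for 500 searches
])
log = new_enrichment_log()
```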
Step 2: Email Enrichment (Hunter.io)
Duration: 10-30 min/500 leads | Tool: Hunter.io API
Use email-finder endpoint with first name, last name, and company domain. Store confidence score. Rate limit: 10 req/sec paid, 1/sec free. [src1]
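The lookup can be sketched as below. It assumes Hunter's v2 email-finder endpoint with `domain`, `first_name`, `last_name`, and `api_key` query parameters and a response of the form `{"data": {"email": ..., "score": ...}}`; check the current Hunter documentation before relying on it. The `domain` column and the `find_email` helper are assumptions of this sketch. The HTTP function is passed in so the code can be exercised without network access.

```python
import time

HUNTER_FINDER_URL = "https://api.hunter.io/v2/email-finder"


def find_email(lead, api_key, http_get, pause=1.0, sleep=time.sleep):
    """Look up one lead's work email and confidence score.

    http_get is any callable with the requests.get signature, e.g. requests.get.
    lead["domain"] is an assumed column holding the company domain.
    """
    params = {
        "domain": lead["domain"],
        "first_name": lead["first_name"],
        "last_name": lead["last_name"],
        "api_key": api_key,
    }
    response = http_get(HUNTER_FINDER_URL, params=params, timeout=10)
    sleep(pause)  # 1 request/sec matches the free-tier limit quoted above
    if response.status_code != 200:
        return {**lead, "email": None, "email_confidence": None, "email_provider": None}
    data = response.json().get("data") or {}
    email = data.get("email")
    return {
        **lead,
        "email": email,
        "email_confidence": data.get("score") if email else None,
        "email_provider": "hunter" if email else None,
    }
```

With the requests library installed, a call would look like `find_email(lead, HUNTER_API_KEY, requests.get)`; on a paid plan `pause` can drop to 0.1 to match the 10 req/sec limit.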
Step 3: Phone Enrichment (Lusha)
Duration: 10-20 min | Tool: Lusha API
Only enrich leads that already have verified email (prioritize hot leads). Returns direct phone with type classification. [src3]
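Because Lusha credits are scarce, it helps to pick the leads before making any call. The sketch below covers only that selection step and makes no Lusha request. It assumes "verified" means an email confidence of at least 70 and uses confidence as a stand-in for lead priority; both are assumptions to replace with your own scoring.

```python
def select_for_phone(leads, credits_available, min_confidence=70):
    """Return the leads worth a phone lookup, best first, capped by credits."""
    eligible = [
        lead for lead in leads
        if lead.get("email") and (lead.get("email_confidence") or 0) >= min_confidence
    ]
    # Highest-confidence emails first, as a proxy for lead priority (assumption).
    eligible.sort(key=lambda lead: lead["email_confidence"], reverse=True)
    return eligible[:credits_available]
```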
Step 4: Firmographic Enrichment (Clearbit)
Duration: 10-20 min | Tool: Clearbit Company API
Enrich unique company domains (not per lead) with employee count, revenue range, industry, tech stack, and founding year. [src4]
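Enriching per domain rather than per lead can be sketched as follows. `lookup_company` is a hypothetical callable standing in for the Clearbit request, and the `domain` column is an assumption; the point is only that each distinct domain is looked up once and the result is copied onto every matching lead.

```python
def enrich_companies(leads, lookup_company):
    """Call lookup_company once per unique domain and copy the result onto each lead.

    lookup_company is a hypothetical callable: domain -> dict of firmographic
    fields, or None when the provider has no record.
    Returns (enriched_leads, number_of_lookups).
    """
    def norm(domain):
        return (domain or "").strip().lower()

    unique_domains = sorted({norm(lead.get("domain")) for lead in leads} - {""})
    by_domain = {d: lookup_company(d) or {} for d in unique_domains}
    enriched = [{**lead, **by_domain.get(norm(lead.get("domain")), {})} for lead in leads]
    return enriched, len(unique_domains)
```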
Step 5: Generate Enrichment Audit and Export
Duration: 5-10 min
Calculate email/phone/firmographic coverage rates, total cost, cost per enriched lead. Export enriched CSV and JSON audit log.
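The audit figures reduce to a few ratios. This sketch assumes a lead counts as "enriched" when it gained at least one of email, phone, or a firmographic field, and that `total_cost` comes from the enrichment log; both definitions are assumptions of the example.

```python
import json

FIRMOGRAPHIC_FIELDS = ("company_size", "industry", "revenue_range")


def build_audit(leads, total_cost):
    """Compute coverage rates and cost per enriched lead."""
    total = len(leads)

    def rate(predicate):
        return sum(1 for lead in leads if predicate(lead)) / total if total else 0.0

    def has_firmographics(lead):
        return any(lead.get(field) for field in FIRMOGRAPHIC_FIELDS)

    enriched = sum(
        1 for lead in leads
        if lead.get("email") or lead.get("phone") or has_firmographics(lead)
    )
    return {
        "leads": total,
        "email_coverage": rate(lambda lead: bool(lead.get("email"))),
        "phone_coverage": rate(lambda lead: bool(lead.get("phone"))),
        "firmographic_coverage": rate(has_firmographics),
        "total_cost": round(total_cost, 2),
        "cost_per_enriched_lead": round(total_cost / enriched, 4) if enriched else None,
    }


def audit_to_json(audit):
    """Serialize the audit for the JSON log."""
    return json.dumps(audit, indent=2)
```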
Output Schema
CSV: first_name, last_name, company, email, email_confidence, email_provider, phone, company_size, industry, revenue_range. 200-5,000 rows, sorted by email_confidence descending, deduplicated on email.
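One way to produce this output is sketched below. It assumes leads without an email are kept at the bottom rather than dropped, and that duplicate emails are compared case-insensitively; neither choice is specified by the schema.

```python
import csv

OUTPUT_COLUMNS = [
    "first_name", "last_name", "company", "email", "email_confidence",
    "email_provider", "phone", "company_size", "industry", "revenue_range",
]


def export_rows(leads):
    """Sort by email_confidence descending and drop repeated emails."""
    ranked = sorted(
        leads, key=lambda lead: float(lead.get("email_confidence") or 0), reverse=True
    )
    seen, rows = set(), []
    for lead in ranked:
        email = (lead.get("email") or "").strip().lower()
        if email:
            if email in seen:
                continue  # keep only the highest-confidence copy
            seen.add(email)
        rows.append({col: lead.get(col, "") for col in OUTPUT_COLUMNS})
    return rows


def write_csv(rows, out_file):
    """Write rows to an open text file using the schema's column order."""
    writer = csv.DictWriter(out_file, fieldnames=OUTPUT_COLUMNS)
    writer.writeheader()
    writer.writerows(rows)
```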
Quality Benchmarks
| Metric | Minimum | Good | Excellent |
|---|---|---|---|
| Email coverage | > 50% | > 70% | > 85% |
| Email confidence avg | > 70 | > 80 | > 90 |
| Phone coverage | > 15% | > 30% | > 45% |
| Firmographic coverage | > 40% | > 65% | > 80% |
| Cost per enriched lead | < $1.00 | < $0.50 | < $0.20 |
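The table above can be applied mechanically to an audit. In this sketch coverage values are fractions (0.70 for 70%), confidence is on a 0-100 scale, and the metric keys and the `grade` helper are names chosen for illustration.

```python
# metric: (minimum, good, excellent, higher_is_better)
BENCHMARKS = {
    "email_coverage": (0.50, 0.70, 0.85, True),
    "email_confidence_avg": (70, 80, 90, True),
    "phone_coverage": (0.15, 0.30, 0.45, True),
    "firmographic_coverage": (0.40, 0.65, 0.80, True),
    "cost_per_enriched_lead": (1.00, 0.50, 0.20, False),
}


def grade(metric, value):
    """Return the benchmark tier a value reaches, using the table's strict inequalities."""
    minimum, good, excellent, higher_is_better = BENCHMARKS[metric]

    def beats(threshold):
        return value > threshold if higher_is_better else value < threshold

    if beats(excellent):
        return "excellent"
    if beats(good):
        return "good"
    if beats(minimum):
        return "minimum"
    return "below minimum"
```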
Error Handling
| Error | Cause | Recovery |
|---|---|---|
| Hunter 429 | Rate limit exceeded | Add 1s delay between requests |
| Hunter 401 | Invalid API key | Regenerate at hunter.io/api |
| Lusha 402 | Credits exhausted | Upgrade plan or wait for reset |
| Clearbit 404 | Domain not found | Skip, mark unenriched |
| Low hit rate | Bad company domains | Enrich domains via Clearbit first |
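The recoveries in the table can be wrapped around any provider call. The sketch below treats the status codes generically instead of per provider, retries a 429 with the fixed 1-second delay from the table, and takes the request as a callable; the function name and returned status labels are assumptions of the example.

```python
import time


def call_with_recovery(request_fn, max_retries=3, delay=1.0, sleep=time.sleep):
    """Run request_fn and map HTTP status codes to the recoveries in the table.

    request_fn is a zero-argument callable returning an object with .status_code.
    Returns (status_label, response_or_None).
    """
    for attempt in range(max_retries + 1):
        response = request_fn()
        status = response.status_code
        if status == 200:
            return "ok", response
        if status == 429 and attempt < max_retries:
            sleep(delay)  # rate limited: wait, then retry
            continue
        if status == 429:
            return "rate_limited", None
        if status == 401:
            # Retrying cannot fix a bad key, so stop the run.
            raise PermissionError("Invalid API key; regenerate it and restart")
        if status == 402:
            return "credits_exhausted", None
        if status == 404:
            return "not_found", None  # skip and mark the record unenriched
        return "error", None
```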
Cost Breakdown
| Component | Free Tier | Paid Tier | At Scale |
|---|---|---|---|
| Hunter.io (email) | 25 searches/mo | $49/mo (500) | $149/mo (5K) |
| Lusha (phone) | 5 credits/mo | $49/mo (40) | $79/mo (80) |
| Clearbit (firm.) | None | $99/mo | $0.05-0.20/rec |
| BuiltWith (tech) | Manual only | $295/mo | $295/mo |
| Total/500 leads | $0 | $197/mo (excl. BuiltWith) | $523+/mo (incl. BuiltWith; Clearbit per-record fees extra) |
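Dividing each plan price by its quota gives the per-lookup cost that decides waterfall order. This is plain arithmetic on the figures in the table; the plan keys are labels invented for the example.

```python
# plan label: (monthly price in USD, monthly quota), taken from the table above
PLANS = {
    "hunter_starter": (49, 500),
    "hunter_5k": (149, 5000),
    "lusha_pro": (49, 40),
    "lusha_80": (79, 80),
}


def unit_cost(plan):
    """Effective USD cost of one lookup if the whole quota is used."""
    price, quota = PLANS[plan]
    return price / quota


for plan in sorted(PLANS, key=unit_cost):
    print(f"{plan}: ${unit_cost(plan):.4f} per lookup")
```

Unused quota raises the effective cost, so these figures are a lower bound.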
Anti-Patterns
Wrong: Calling all providers for every lead simultaneously
Wastes credits and money when most leads can be enriched by the first provider. [src2]
Correct: Waterfall logic — cheapest first, fall through on miss
Try Hunter first. Only call Lusha if Hunter misses. Only use Clearbit for company data. [src2]
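The fall-through logic can be sketched for a single field as follows. Each provider is a `(name, cost_per_call, lookup)` tuple where `lookup` is a hypothetical callable, and the sketch assumes every call is billed whether or not it hits; adjust if your plan bills hits only.

```python
def run_waterfall(lead, providers, log):
    """Try each provider cheapest first and stop at the first hit.

    providers: list of (name, cost_per_call, lookup) tuples, where lookup is a
    hypothetical callable taking the lead and returning an email or None.
    log: dict with a "calls" list and a "total_cost" float, updated in place.
    """
    for name, cost, lookup in sorted(providers, key=lambda p: p[1]):
        email = lookup(lead)
        log["calls"].append({"provider": name, "hit": bool(email), "cost": cost})
        log["total_cost"] += cost  # assumes misses are billed too
        if email:
            return {**lead, "email": email, "email_provider": name}
    return {**lead, "email": None, "email_provider": None}
```

More expensive providers are only reached for leads the cheaper ones missed, which is what keeps per-lead cost down.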
Wrong: Enriching before deduplication
Every duplicate lead spends credits a second time on data you already have. [src3]
Correct: Deduplicate raw list before any enrichment calls
Remove duplicate name+company pairs first, then enrich the clean list.
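A minimal deduplication pass might look like this. It assumes two rows are the same lead when first name, last name, and company match after lowercasing and collapsing whitespace; fuzzier matching (nicknames, "Inc." suffixes) is out of scope for the sketch.

```python
def dedupe_leads(leads):
    """Keep the first occurrence of each normalized name+company combination."""
    seen, unique = set(), []
    for lead in leads:
        key = tuple(
            " ".join((lead.get(field) or "").lower().split())
            for field in ("first_name", "last_name", "company")
        )
        if key in seen:
            continue
        seen.add(key)
        unique.append(lead)
    return unique
```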
When This Matters
Use when the agent has a raw lead list that needs work emails, phone numbers, and firmographic data. The waterfall approach maximizes coverage while minimizing per-lead cost.