Signal Source Audit

Type: Execution Recipe · Confidence: 0.85 · Sources: 5 · Verified: 2026-03-29

Purpose

This recipe executes a systematic audit of all available data sources that could provide intent signals for a target industry vertical. It produces a scored inventory covering regulatory databases, behavioral data sources, visual signals, and unstructured media — enabling a go/no-go decision on vertical viability. [src1, src4]

Prerequisites

Constraints

Tool Selection Decision

Which audit depth?
├── Quick assessment (3-5 days)
│   └── PATH A: Desktop research only
├── Standard audit (5-10 days)
│   └── PATH B: Desktop + API testing
├── Deep audit (10-15 days)
│   └── PATH C: Desktop + API testing + vendor interviews
└── Competitive audit
    └── PATH D: Standard + competitor signal analysis

| Path | Scope | Cost | Speed | Confidence |
|---|---|---|---|---|
| A: Quick | Surface-level identification | $2K-$3K | 3-5 days | Moderate |
| B: Standard | Identification + quality verification | $3K-$5K | 5-10 days | High |
| C: Deep | Full evaluation + vendor negotiation | $5K-$8K | 10-15 days | Very high |
| D: Competitive | Standard + competitor analysis | $4K-$7K | 7-12 days | High |
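
As a rough illustration of how the path table can drive the depth decision, the sketch below encodes the four paths and filters them by budget and deadline. The names `AuditPath`, `PATHS`, and `paths_within` are hypothetical, and comparing against the upper end of each range is an assumption made here, not a rule stated in this recipe.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuditPath:
    name: str
    scope: str
    cost_usd: tuple[int, int]  # (low, high) from the path table
    days: tuple[int, int]      # (low, high) from the path table
    confidence: str


PATHS = {
    "A": AuditPath("Quick", "Surface-level identification", (2000, 3000), (3, 5), "Moderate"),
    "B": AuditPath("Standard", "Identification + quality verification", (3000, 5000), (5, 10), "High"),
    "C": AuditPath("Deep", "Full evaluation + vendor negotiation", (5000, 8000), (10, 15), "Very high"),
    "D": AuditPath("Competitive", "Standard + competitor analysis", (4000, 7000), (7, 12), "High"),
}


def paths_within(budget_usd: int, max_days: int) -> list[str]:
    """Return path letters whose upper-bound cost and duration fit the limits.

    ASSUMPTION: planning against the high end of each range is conservative;
    the recipe itself does not say which end to plan against.
    """
    return [
        letter
        for letter, path in PATHS.items()
        if path.cost_usd[1] <= budget_usd and path.days[1] <= max_days
    ]
```

For example, `paths_within(5000, 10)` keeps only paths A and B, since C and D can exceed either the budget or the deadline.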

Execution Flow

Step 1: Inventory Regulatory Databases

Duration: 1-2 days · Tool: Web research + government database directories

Identify all regulatory and government databases relevant to the target vertical: EPA, FDA, OSHA, SEC, state licensing boards, building permits, zoning databases. Document agency, URL, data format, update frequency, geographic coverage. Score each on accessibility (1-5), cost (1-5), refresh rate (1-5), signal-to-noise (1-5). [src1]

Verify: Minimum 5 regulatory sources identified and scored. · If failed: Vertical is lightly regulated — shift weight to behavioral sources.
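
One possible way to capture the documented fields and the four 1-5 scores is a small record type with range validation. This is a minimal sketch: the class name `RegulatorySource`, the field names, and the helper `regulatory_step_verified` are illustrative assumptions, not part of the recipe.

```python
from dataclasses import dataclass

# The four scoring dimensions named in Step 1, each rated 1-5.
SCORE_FIELDS = ("accessibility", "cost", "refresh_rate", "signal_to_noise")


@dataclass
class RegulatorySource:
    # Descriptive fields listed in Step 1.
    agency: str
    url: str
    data_format: str
    update_frequency: str
    geographic_coverage: str
    # Scores, each on the 1-5 scale.
    accessibility: int
    cost: int
    refresh_rate: int
    signal_to_noise: int

    def __post_init__(self) -> None:
        # Reject scores outside the 1-5 scale as soon as a record is created.
        for field_name in SCORE_FIELDS:
            value = getattr(self, field_name)
            if not 1 <= value <= 5:
                raise ValueError(f"{field_name} must be between 1 and 5, got {value}")


def regulatory_step_verified(sources: list[RegulatorySource], minimum: int = 5) -> bool:
    """Step 1 check: at least `minimum` regulatory sources identified and scored."""
    return len(sources) >= minimum
```

Because every record is validated on construction, a list that passes `regulatory_step_verified` contains only fully scored sources.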

Step 2: Map Behavioral Data Sources

Duration: 1-2 days · Tool: Web research + API documentation review

Identify behavioral sources: DNS/WHOIS changes, job board postings, review site activity, app store data, patent filings, press releases, conference speaker lists. Assess accessibility, cost, refresh rate, signal-to-noise for each. [src2, src5]

Verify: Minimum 5 behavioral sources, at least 2 with API access confirmed. · If failed: Vertical may lack digital footprint for automation.
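
The Step 2 gate can be expressed as a short check. The sketch below assumes each source is a dictionary with a hypothetical boolean key `api_access_confirmed`; that key and the function name are illustrative only.

```python
def behavioral_step_status(sources: list[dict]) -> tuple[bool, str]:
    """Apply the Step 2 check: at least 5 sources, at least 2 with confirmed API access.

    ASSUMPTION: each source dict carries a boolean "api_access_confirmed" key.
    """
    api_confirmed = sum(1 for s in sources if s.get("api_access_confirmed", False))
    if len(sources) >= 5 and api_confirmed >= 2:
        return True, "Step 2 verified"
    # Failure message taken from the "If failed" note for this step.
    return False, "Vertical may lack digital footprint for automation."
```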

Step 3: Assess Visual Signal Availability

Duration: 0.5-1 day · Tool: Satellite/street imagery platform evaluation

Evaluate visual signals: satellite imagery, street-level imagery, aerial photography. Highly vertical-dependent — skip for purely digital verticals. Note: visual processing requires specialized ML models ($2K-$10K development). [src4]

Verify: Visual relevance determined or documented as “not applicable.” · If failed: Visual signals are optional — continue.

Step 4: Identify Unstructured Media Sources

Duration: 1-2 days · Tool: Media monitoring platform evaluation

Identify text and media sources: industry publications, trade journals, conference proceedings, podcast transcripts, social media, forums. Assess volume, relevance density, extraction difficulty, timeliness. [src1, src2]

Verify: Minimum 5 unstructured sources, at least 2 text-based. · If failed: Budget additional transcription costs for audio/video sources.

Step 5: Score and Rank All Sources

Duration: 1 day · Tool: Spreadsheet + scoring framework

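Compile all sources into a single scored inventory. Composite = (Accessibility × 0.30) + (Cost × 0.20) + (Refresh Rate × 0.25) + (SNR × 0.25), where SNR is the signal-to-noise score and each input uses the 1-5 scale. Plot sources on a 2×2 priority matrix (quality vs. accessibility). Calculate the overall viability score (0.0-1.0). [src5]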

Verify: All sources scored. Priority matrix generated. Viability score calculated. · If failed: If viability < 0.60, recommend pivot.
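
The composite formula translates directly into code. The recipe does not define how composites roll up into the 0.0-1.0 viability score, so the sketch below assumes, purely for illustration, that viability is the mean composite rescaled from the 1-5 range. It also assumes that a higher score is better on every dimension, including cost. All function names are hypothetical.

```python
# Weights from the composite formula in Step 5.
WEIGHTS = {"accessibility": 0.30, "cost": 0.20, "refresh_rate": 0.25, "snr": 0.25}


def composite(scores: dict[str, float]) -> float:
    """Weighted composite of the four 1-5 dimension scores."""
    return sum(weight * scores[dim] for dim, weight in WEIGHTS.items())


def viability(composites: list[float]) -> float:
    """ASSUMPTION: viability = mean composite, rescaled from [1, 5] to [0, 1].

    The recipe does not specify this mapping; swap in the real rule if one exists.
    """
    if not composites:
        raise ValueError("at least one scored source is required")
    mean = sum(composites) / len(composites)
    return (mean - 1.0) / 4.0


def recommendation(score: float, threshold: float = 0.60) -> str:
    """Apply the 0.60 pivot threshold from the Step 5 failure note."""
    return "go" if score >= threshold else "pivot"
```

A source scored 5, 3, 4, 4 on accessibility, cost, refresh rate, and SNR has a composite of about 4.1 (1.5 + 0.6 + 1.0 + 1.0).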

Step 6: Deliver Audit Report

Duration: 0.5-1 day · Tool: Document generation

Produce report: executive summary, source inventory, priority matrix, cost projection, risk assessment, go/no-go recommendation.

Verify: Report reviewed, recommendation clearly stated. · If failed: Request domain expert input before finalizing.

Output Schema

{
  "output_type": "signal_source_audit",
  "format": "spreadsheet + document",
  "sections": [
    {"name": "source_inventory", "type": "array", "description": "All sources with 4-dimension scoring"},
    {"name": "priority_matrix", "type": "object", "description": "2x2 quality vs accessibility"},
    {"name": "viability_score", "type": "number", "description": "Overall vertical viability 0.0-1.0"},
    {"name": "cost_projection", "type": "object", "description": "Monthly cost for top 10 sources"},
    {"name": "risk_assessment", "type": "array", "description": "Legal, reliability, dependency risks"},
    {"name": "recommendation", "type": "string", "description": "Go/no-go with rationale"}
  ]
}
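
A deliverable can be checked against the section list above before it goes to review. The following is a minimal sketch: the mapping from the schema's type names to Python types and the function name `validate_audit` are assumptions, and only top-level sections are checked.

```python
# Section names and types taken from the output schema; Python types are an assumed mapping.
SCHEMA_SECTIONS = {
    "source_inventory": list,      # "array"
    "priority_matrix": dict,       # "object"
    "viability_score": (int, float),  # "number"
    "cost_projection": dict,       # "object"
    "risk_assessment": list,       # "array"
    "recommendation": str,         # "string"
}


def validate_audit(payload: dict) -> list[str]:
    """Return a list of problems; an empty list means the top-level shape is valid."""
    problems = []
    for name, expected in SCHEMA_SECTIONS.items():
        if name not in payload:
            problems.append(f"missing section: {name}")
        elif not isinstance(payload[name], expected):
            problems.append(f"wrong type for section: {name}")
    score = payload.get("viability_score")
    if isinstance(score, (int, float)) and not 0.0 <= score <= 1.0:
        problems.append("viability_score outside 0.0-1.0")
    return problems
```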

Quality Benchmarks

| Quality Metric | Minimum Acceptable | Good | Excellent |
|---|---|---|---|
| Total sources identified | > 15 | > 25 | > 40 |
| Sources with API access | > 3 | > 8 | > 15 |
| Signal categories covered | 3 of 4 | 4 of 4 | 4 of 4 + niche |
| Cost accuracy (vs actual) | Within 50% | Within 25% | Within 10% |
| Refresh rate verified (3 cycles) | > 50% | > 75% | > 90% |

If below minimum: Extend the audit by 2-3 days, or conclude that the vertical lacks sufficient signal density.
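
The numeric rows of the benchmark table can be graded mechanically. The sketch below covers only the three metrics with strict "greater than" thresholds; the metric keys and the function name `grade` are hypothetical.

```python
# (minimum acceptable, good, excellent) thresholds; a value must exceed each one.
THRESHOLDS = {
    "total_sources": (15, 25, 40),
    "sources_with_api": (3, 8, 15),
    "refresh_verified_pct": (50, 75, 90),
}
LABELS = ("below minimum", "minimum acceptable", "good", "excellent")


def grade(metric: str, value: float) -> str:
    """Map a measured value to its benchmark band using strict '>' comparisons."""
    cleared = sum(value > threshold for threshold in THRESHOLDS[metric])
    return LABELS[cleared]
```

Note that the comparison is strict: exactly 15 sources grades as "below minimum", while 16 reaches "minimum acceptable".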

Error Handling

| Error | Likely Cause | Recovery Action |
|---|---|---|
| No regulatory databases found | Lightly regulated vertical | Shift weight to behavioral/media sources |
| API access denied during testing | Rate limits or auth required | Contact vendor for eval access; estimate from docs |
| Inconsistent refresh rate | Irregular publication schedule | Use minimum observed frequency; flag reliability risk |
| Cost info unavailable | Enterprise pricing, requires sales call | Use comparable source pricing as estimate |
| Fewer than 15 sources total | Limited digital footprint | Recommend paid supplements or vertical pivot |

Cost Breakdown

| Component | Quick ($2K-$3K) | Standard ($3K-$5K) | Deep ($5K-$8K) |
|---|---|---|---|
| Regulatory inventory | $500-$800 | $800-$1.2K | $1.2K-$2K |
| Behavioral mapping | $500-$800 | $800-$1.2K | $1.2K-$2K |
| Visual + unstructured | $300-$500 | $500-$800 | $800-$1.2K |
| Scoring + ranking | $300-$500 | $500-$800 | $800-$1.2K |
| Report | $400 | $400-$800 | $800-$1.5K |
| Total | $2K-$3K | $3K-$5K | $5K-$8K |
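
As a quick arithmetic check, the component ranges can be summed per tier. In the sketch below (names hypothetical, with "$1.2K" read as 1200), the Quick components sum exactly to $2K-$3K, while Standard comes to $3.0K-$4.8K and Deep to $4.8K-$7.9K, which suggests the stated totals are rounded.

```python
# (low, high) USD per component, in table order: regulatory, behavioral,
# visual + unstructured, scoring + ranking, report.
COMPONENTS = {
    "Quick": [(500, 800), (500, 800), (300, 500), (300, 500), (400, 400)],
    "Standard": [(800, 1200), (800, 1200), (500, 800), (500, 800), (400, 800)],
    "Deep": [(1200, 2000), (1200, 2000), (800, 1200), (800, 1200), (800, 1500)],
}
STATED_TOTALS = {"Quick": (2000, 3000), "Standard": (3000, 5000), "Deep": (5000, 8000)}


def component_range(tier: str) -> tuple[int, int]:
    """Sum the low and high ends of every component for one tier."""
    low = sum(lo for lo, _ in COMPONENTS[tier])
    high = sum(hi for _, hi in COMPONENTS[tier])
    return low, high


def max_gap(tier: str) -> int:
    """Largest absolute difference between summed components and the stated total."""
    low, high = component_range(tier)
    stated_low, stated_high = STATED_TOTALS[tier]
    return max(abs(low - stated_low), abs(high - stated_high))
```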

Anti-Patterns

Wrong: Counting sources without scoring them

Listing 30 sources without quality assessment. Result: pipeline built on unreliable sources fails in month one. [src1]

Correct: Score every source on all 4 dimensions

Each source gets accessibility, cost, refresh rate, and signal-to-noise ratings. Composite scores drive prioritization.

Wrong: Ignoring legal accessibility constraints

Identifying sources requiring TOS-violating scraping. Result: cease-and-desist letters mid-engagement. [src3]

Correct: Verify legal access for every source

Confirm public access, official API, or commercial licensing for each source. Document access method and legal basis.

Wrong: Single-snapshot refresh rate assessment

Checking a source once and assuming consistent updates. Result: pipeline depends on irregularly updated source. [src4]

Correct: Verify 3 consecutive update cycles

Monitor top-priority sources across at least 3 update cycles before committing pipeline dependency.
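
A minimal sketch of the three-cycle check follows. It assumes a cycle is the gap between two consecutive observed updates, so three cycles need four observed update dates. The two-day tolerance and all names are illustrative assumptions. When gaps are irregular, it plans against the longest gap, following the "use minimum observed frequency" recovery action.

```python
from datetime import date


def update_intervals(update_dates: list[date]) -> list[int]:
    """Days between consecutive observed updates, oldest first."""
    ordered = sorted(update_dates)
    return [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]


def assess_refresh(update_dates: list[date], tolerance_days: int = 2) -> dict:
    """Check whether at least 3 update cycles were observed and were consistent.

    ASSUMPTION: "consistent" means the longest and shortest gaps differ by at
    most `tolerance_days`; the recipe does not define a tolerance.
    """
    intervals = update_intervals(update_dates)
    if len(intervals) < 3:
        return {"verified": False, "reason": "fewer than 3 update cycles observed"}
    longest_gap = max(intervals)  # longest gap = minimum observed frequency
    consistent = longest_gap - min(intervals) <= tolerance_days
    return {
        "verified": consistent,
        "planning_interval_days": longest_gap,
        "reliability_risk": not consistent,
    }
```

A source updated on four consecutive Mondays verifies with a 7-day planning interval. One that skips two weeks mid-series is flagged as a reliability risk and planned at 21 days.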

When This Matters

Use when evaluating whether a target vertical has sufficient signal density for automated intelligence. This is Phase 1 of the Signal Stack engagement — its output drives the go/no-go decision for taxonomy design and pipeline construction.

Related Units