OIA Network Analysis Execution

How do you execute an ONA using Viva Insights and custom graph analysis?

Purpose

This recipe executes an organizational network analysis (ONA) from communication metadata, producing a directed weighted graph with centrality scores, bottleneck identification, structural hole analysis, and a formal-vs-informal gap report. The output reveals who actually drives information flow versus who the org chart says should. [src1, src2]

Prerequisites

Communication metadata dataset from OIA Data Collection — minimum 90 days
Org structure dataset (anonymized org chart with reporting lines)
Python 3.10+ with NetworkX, pandas, matplotlib installed
Gephi 0.10+ installed for visualization (or Viva Insights access)
Org chart overlay data — formal reporting lines mapped to anonymized IDs

Constraints

Node count must match employee roster — missing nodes produce false structural holes. [src3]
Edge weight normalization: 1:1 messages 3x, thread replies 2x, channel broadcasts 1x, reactions 0.5x. [src2]
Flag nodes > 2 standard deviations above mean centrality as bottlenecks/hubs.
Anonymized IDs only in all shared outputs.
Minimum 50 nodes for meaningful centrality analysis.
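
The roster and minimum-size constraints can be checked mechanically before any analysis runs. The following is a minimal sketch, not a prescribed implementation: the function name `check_constraints` and its parameters are hypothetical, and it assumes both the graph nodes and the roster are available as collections of anonymized ID strings.

```python
def check_constraints(graph_nodes, roster_ids, min_nodes=50):
    """Return a list of human-readable constraint violations (empty if none)."""
    nodes, roster = set(graph_nodes), set(roster_ids)
    problems = []
    # Centrality on very small graphs is not meaningful.
    if len(nodes) < min_nodes:
        problems.append(f"only {len(nodes)} nodes; need at least {min_nodes}")
    # Roster members absent from the graph can show up as false structural holes.
    missing = roster - nodes
    if missing:
        problems.append(f"{len(missing)} roster IDs missing from graph")
    # Nodes not on the roster usually point to an ID-mapping problem.
    unknown = nodes - roster
    if unknown:
        problems.append(f"{len(unknown)} graph nodes not on roster")
    return problems
```

An empty return value means the graph passes these two checks; the weighting and anonymization constraints still need to be verified separately.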

Tool Selection Decision

Which tool path?
├── Enterprise client with Microsoft 365
│   └── PATH A: Viva Insights — automated ONA
├── Technical analyst + org size < 1000
│   └── PATH B: Python NetworkX + Gephi — full control
├── Technical analyst + org size > 1000
│   └── PATH C: Python igraph + D3.js — scalable
└── Non-technical, no Viva Insights
    └── PATH D: Gephi desktop — visual-first

Path	Tools	Cost	Speed	Output Quality
A: Viva Insights	Microsoft Viva Insights	$5K-$10K	1-2 days	Excellent
B: NetworkX + Gephi	Python, NetworkX, Gephi	$0	3-4 days	Excellent
C: igraph + D3.js	Python, igraph, D3.js	$0	3-5 days	Excellent
D: Gephi desktop	Gephi, spreadsheets	$0	4-5 days	Good
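
The decision tree above can be written as a small function, which is convenient when the choice has to be recorded or automated. This is only a sketch: the name `select_tool_path` and its parameters are hypothetical, and because the tree does not say what happens at exactly 1000 employees, routing that case to Path C is an assumption.

```python
def select_tool_path(has_m365, technical_analyst, org_size):
    """Return the tool path letter ("A" to "D") for a client environment."""
    if has_m365:
        return "A"  # Viva Insights, automated ONA
    if technical_analyst:
        # Assumption: an org of exactly 1000 is routed to the scalable path.
        return "B" if org_size < 1000 else "C"
    return "D"  # Gephi desktop, visual-first
```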

Execution Flow

Step 1: Tool Selection

Duration: 1-2 hours · Tool: Decision based on client environment

Choose tool path based on Microsoft 365 license, analyst skill, org size, and visualization requirements.

Verify: All dependencies for the chosen path are installed and import without errors. · If failed: If pip installation fails, install the packages with conda instead.

Step 2: Graph Construction

Duration: 4-8 hours · Tool: Python NetworkX or igraph

Build directed weighted graph from communication metadata. Nodes = anonymized employee IDs, edges = communication frequency weighted by interaction type. [src2]

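import networkx as nx

# Weight by interaction type
WEIGHTS = {"direct_message": 3.0, "thread_reply": 2.0, "channel_message": 1.0,
           "reaction": 0.5, "email_direct": 3.0, "email_cc": 0.5}
G = nx.DiGraph()
# Aggregate weighted edges from communication metadata into G,
# e.g. G.add_weighted_edges_from([(sender_id, recipient_id, total_weight), ...])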

Verify: Node count matches roster (+/- 5%). · If failed: Check data collection coverage.
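
The aggregation step is where most of the work in graph construction happens. The sketch below shows one way to do it in plain Python, assuming the metadata has already been reduced to `(sender_id, recipient_id, interaction_type)` tuples. That record layout and the name `aggregate_edges` are hypothetical, and dropping self-messages is an assumption; the weights are the ones listed in this step.

```python
from collections import defaultdict

INTERACTION_WEIGHTS = {
    "direct_message": 3.0, "thread_reply": 2.0, "channel_message": 1.0,
    "reaction": 0.5, "email_direct": 3.0, "email_cc": 0.5,
}

def aggregate_edges(events):
    """Sum interaction weights per directed (sender, recipient) pair.

    `events` is an iterable of (sender_id, recipient_id, interaction_type).
    An unknown interaction type raises KeyError so data issues surface early.
    """
    totals = defaultdict(float)
    for sender, recipient, kind in events:
        if sender == recipient:
            continue  # assumption: self-messages carry no network information
        totals[(sender, recipient)] += INTERACTION_WEIGHTS[kind]
    # Sorted output keeps runs reproducible.
    return [(s, r, w) for (s, r), w in sorted(totals.items())]
```

The returned triples can be loaded into a NetworkX directed graph with `add_weighted_edges_from`, which stores each total under the `weight` edge attribute.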

Step 3: Centrality Analysis

Duration: 4-8 hours · Tool: Python NetworkX

Calculate betweenness centrality (bottlenecks), degree centrality (hubs), eigenvector centrality (influence). Identify structural holes using Burt's constraint measure. Flag statistical outliers (> 2 SD). [src1, src3]

Verify: Top 10 bottlenecks identified and centrality distributions are heavy-tailed (roughly power-law shaped, with a few high scorers and a long tail of low scorers). · If failed: Re-check weight normalization.
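
A minimal sketch of the centrality calculations and the 2 SD flagging, assuming a NetworkX `DiGraph` whose edges carry a positive `weight` attribute. The function name `centrality_report` and the `distance` attribute are hypothetical. One detail worth noting: NetworkX shortest-path measures read edge weights as distances, so this sketch inverts communication strength before computing betweenness; whether to invert, and how, is a modelling choice rather than something this recipe specifies.

```python
from statistics import mean, pstdev

import networkx as nx

def centrality_report(G, n_sd=2.0):
    """Return (scores, flagged) for betweenness, degree, and eigenvector centrality.

    Note: adds a "distance" attribute to every edge of G in place.
    """
    for _, _, data in G.edges(data=True):
        data["distance"] = 1.0 / data["weight"]  # strong tie = short distance
    scores = {
        "betweenness": nx.betweenness_centrality(G, weight="distance"),
        "degree": nx.degree_centrality(G),
        "eigenvector": nx.eigenvector_centrality(G, max_iter=1000, weight="weight"),
    }
    flagged = {}
    for name, values in scores.items():
        vals = list(values.values())
        cutoff = mean(vals) + n_sd * pstdev(vals)
        # Flag nodes more than n_sd standard deviations above the mean.
        flagged[name] = sorted(node for node, s in values.items() if s > cutoff)
    return scores, flagged
```

If eigenvector centrality does not converge on a sparse graph, the Error Handling table lists the fallbacks.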

Step 4: Formal vs Informal Gap Analysis

Duration: 4-8 hours · Tool: Python + org chart data

Overlay formal org chart onto actual communication graph. Identify shadow influencers (high eigenvector, no formal authority) and bypassed managers (formal authority, low communication centrality). [src1]

Verify: Gap report generated with shadow influencer and bypassed manager lists. · If failed: Document gaps in org chart data.
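
The overlay can be sketched in plain Python once eigenvector scores and reporting lines are available as dictionaries. Everything named here is hypothetical: `gap_report`, the `reports_to` mapping (employee ID to manager ID, `None` at the top), and the rank cutoffs `top_fraction` and `bottom_fraction`. The sketch also assumes that "formal authority" means having at least one direct report, and it uses the same eigenvector score for both lists for brevity.

```python
def gap_report(eigenvector, reports_to, top_fraction=0.1, bottom_fraction=0.25):
    """List shadow influencers and bypassed managers from ranked scores."""
    # Assumption: anyone who appears as someone's manager has formal authority.
    managers = {m for m in reports_to.values() if m is not None}
    ranked = sorted(eigenvector, key=eigenvector.get, reverse=True)
    top_k = max(1, round(len(ranked) * top_fraction))
    bottom_k = max(1, round(len(ranked) * bottom_fraction))
    top, bottom = set(ranked[:top_k]), set(ranked[-bottom_k:])
    return {
        # High influence, no formal authority.
        "shadow_influencers": sorted(top - managers),
        # Formal authority, low communication centrality.
        "bypassed_managers": sorted(bottom & managers),
    }
```

The cutoffs are starting points only and would normally be tuned per client before the lists go into the gap report.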

Step 5: Visualization & Report

Duration: 4-8 hours · Tool: Gephi, D3.js, or matplotlib

Generate annotated network visualization. Produce written analysis with findings and recommendations. [src1, src3]

Verify: Client validates top 5 bottlenecks and shadow influencers. · If failed: Re-run with different edge weight calibration.
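
For the Gephi route, one option is to write the graph to GEXF with the scores attached as node attributes, so that Gephi can size or colour nodes by them. The sketch below uses NetworkX's `write_gexf`; the function name `export_for_gephi` and the attribute names `betweenness` and `flag` are hypothetical.

```python
import networkx as nx

def export_for_gephi(G, betweenness, bottlenecks, path):
    """Write a copy of G to a GEXF file with analysis results as node attributes."""
    H = G.copy()  # leave the analysis graph untouched
    for node in H.nodes:
        H.nodes[node]["betweenness"] = float(betweenness.get(node, 0.0))
        H.nodes[node]["flag"] = "bottleneck" if node in bottlenecks else "normal"
    nx.write_gexf(H, path)
    return path
```

Only anonymized IDs should be present in the exported file, in line with the constraints above.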

Output Schema

{
  "output_type": "ona_analysis_report",
  "format": "PDF + CSV + SVG",
  "key_metrics": [
    {"name": "bottleneck_count", "description": "Nodes > 2 SD above mean betweenness"},
    {"name": "shadow_influencer_count", "description": "High eigenvector, no formal authority"},
    {"name": "bypassed_manager_count", "description": "Formal authority, low centrality"},
    {"name": "structural_hole_count", "description": "Nodes spanning disconnected clusters"}
  ]
}

Quality Benchmarks

Quality Metric	Minimum Acceptable	Good	Excellent
Node coverage (% of roster)	> 80%	> 90%	> 98%
Bottleneck identification (client-validated)	> 50% confirmed	> 70%	> 85%
Visualization readability	Clusters visible	Annotated clusters	Interactive exploration
Gap analysis coverage	> 60%	> 80%	> 95%

If below minimum: Extend data collection, add missing platforms, or use interview-based network mapping.
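
As an illustration, the node coverage row of the table can be turned into a simple rating function. The name `rate_node_coverage` and the returned labels are hypothetical; the thresholds are the ones in the table.

```python
def rate_node_coverage(graph_nodes, roster_ids):
    """Rate the share of roster IDs present in the graph against the benchmarks."""
    roster = set(roster_ids)
    if not roster:
        raise ValueError("roster is empty")
    coverage = len(roster & set(graph_nodes)) / len(roster)
    if coverage > 0.98:
        return "excellent"
    if coverage > 0.90:
        return "good"
    if coverage > 0.80:
        return "minimum acceptable"
    return "below minimum"
```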

Error Handling

Error	Likely Cause	Recovery Action
NetworkX eigenvector centrality fails to converge (PowerIterationFailedConvergence)	Graph too sparse or disconnected	Use eigenvector_centrality_numpy or fall back to PageRank
Uniform centrality scores	Edge weights not normalized	Re-check weight calibration, verify interaction type distribution
Too many bottlenecks (> 20%)	Threshold too low	Increase to 2.5 or 3 standard deviations
Disconnected graph	Missing department data	Analyze largest connected component, document gaps
Visualization unreadable	Too many overlapping nodes	Use community detection to cluster
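
The disconnected-graph recovery can be sketched as follows. For a directed graph this version uses weakly connected components, which is an assumption (strong connectivity is stricter and may be preferable in some analyses); the function name `largest_component` is hypothetical, and the graph is assumed to be non-empty.

```python
import networkx as nx

def largest_component(G):
    """Return (subgraph of the largest weakly connected component, dropped node IDs)."""
    components = sorted(nx.weakly_connected_components(G), key=len, reverse=True)
    keep = components[0]
    # The dropped IDs are the gaps to document in the report.
    dropped = sorted(set(G.nodes) - keep)
    return G.subgraph(keep).copy(), dropped
```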

Cost Breakdown

Component	Free (NetworkX)	Mid (Gephi Pro)	Enterprise (Viva)
Graph analysis tool	$0	$0	$5K-$10K
Visualization	$0 (matplotlib)	$0 (Gephi)	Included
Developer time	3-5 days	4-5 days	1-2 days
Total	$0	$0	$5K-$10K

Anti-Patterns

Wrong: Treating all communication equally

Same edge weight for broadcasts and 1:1 messages. Result: channel spammers appear as influential hubs. [src2]

Correct: Weight by interaction intimacy

1:1 messages 3x weight, thread replies 2x, channel broadcasts 1x, reactions 0.5x. This reflects actual relationship strength.

Wrong: Ignoring structural holes

Focusing only on bottleneck centrality. Result: missing the people who bridge disconnected groups. [src3]

Correct: Measure structural holes explicitly

Calculate Burt's constraint measure. Low-constraint nodes span structural holes — the organization's innovation bridges.
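
A minimal sketch of the constraint calculation using NetworkX's `constraint` function. Collapsing the directed graph into an undirected one by summing the weights in both directions is an assumption made here for simplicity, not something the recipe prescribes; the function name `structural_hole_spanners` is hypothetical.

```python
import math

import networkx as nx

def structural_hole_spanners(G, k=10):
    """Return the k lowest-constraint nodes as (node, constraint) pairs."""
    # Assumption: tie strength between two people is the sum of both directions.
    U = nx.Graph()
    U.add_nodes_from(G.nodes)
    for u, v, w in G.edges(data="weight", default=1.0):
        if u == v:
            continue
        if U.has_edge(u, v):
            U[u][v]["weight"] += w
        else:
            U.add_edge(u, v, weight=w)
    constraint = nx.constraint(U, weight="weight")
    # Isolated nodes have undefined (NaN) constraint and are skipped.
    ranked = sorted((c, n) for n, c in constraint.items() if not math.isnan(c))
    return [(n, c) for c, n in ranked[:k]]
```

In a graph where one person is the only link between two otherwise separate teams, that person should appear at the top of the returned list.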

Wrong: Presenting without validation

Showing raw network graph without first validating findings. Result: false positives undermine credibility. [src1]

Correct: Validate before presenting

Share top 10 findings with 2-3 internal stakeholders before formal presentation.

When This Matters

Use when an agent needs to execute organizational network analysis from communication metadata. This is Step 3 of the OIA engagement lifecycle. Requires the combined communication metadata dataset from the data collection recipe. Output feeds into autoimmune scan and stress test recipes.