ADR-003: Social Graph Storage Strategy

Status

Accepted

Context

The CRM needs to store and query complex relationship networks efficiently. We need to support shortest path queries, influence scoring, and community detection for millions of contacts while maintaining sub-second query performance.

Decision

Use PostgreSQL with recursive CTEs for graph storage, enhanced with Redis caching for hot paths and materialized views for analytics.

Storage Architecture

-- Primary storage in PostgreSQL
relationships (
    from_contact_id,
    to_contact_id,
    strength,
    relationship_type
)

-- Query with recursive CTEs
WITH RECURSIVE paths AS (...)

-- Cache hot paths in Redis
graph:path:{from}:{to} → JSON

Rationale

PostgreSQL Maturity: Proven, ACID compliant, familiar
Recursive CTEs: Native graph traversal support
No Additional Database: Reduces operational complexity
Good Enough Performance: <500ms for 3-degree searches
Redis Caching: Solves hot path performance

Implementation Strategy

PostgreSQL Optimizations

-- Bidirectional index for fast traversal
CREATE INDEX idx_relationships_bidirectional ON relationships(from_contact_id, to_contact_id);

-- Covering index for common queries
CREATE INDEX idx_relationships_strength ON relationships(from_contact_id, to_contact_id) INCLUDE (strength);

-- Materialized view for influence scores
CREATE MATERIALIZED VIEW contact_influence AS
    WITH RECURSIVE influence_calc AS (...);

Redis Cache Structure

# Path cache (TTL: 1 hour)
graph:path:{user}:{from}:{to} → {"nodes": [...], "distance": 3}

# Network stats (TTL: 30 minutes)  
graph:stats:{user}:{contact} → {"degree": 156, "influence": 0.73}

# Hot contacts (TTL: 5 minutes)
graph:hot:{user} → [contact_ids]

Consequences

Positive

Single database technology
ACID compliance maintained
Familiar SQL queries
Built-in backup/recovery
Cost effective

Negative

Performance limits at 10M+ edges
Complex recursive queries
No native graph algorithms
Cache invalidation complexity

Migration Path

If performance becomes an issue:

Phase 1: Current PostgreSQL + Redis
Phase 2: Add read-only Neo4j replica
Phase 3: Gradual migration to Neo4j
Phase 4: PostgreSQL for OLTP, Neo4j for OLAP

Alternatives Considered

Neo4j: Rejected - operational complexity for MVP
DGraph: Rejected - immature ecosystem
Amazon Neptune: Rejected - vendor lock-in
Pure Redis: Rejected - no durability
FoundationDB: Rejected - overkill for graph data

Status​

Context​

Decision​

Storage Architecture​

Rationale​

Implementation Strategy​

PostgreSQL Optimizations​

Redis Cache Structure​

Consequences​

Positive​

Negative​

Migration Path​

Alternatives Considered​