ADR-205: MCP Git Status — Submodule Health Monitoring

Document: ADR-205-mcp-git-status-submodule-health
Version: 1.0.0
Purpose: Document architectural decision for submodule health monitoring MCP server
Audience: Framework contributors, DevOps engineers
Date Created: 2026-02-02
Last Updated: 2026-02-15
Status: ACCEPTED
Task ID: H.12.4
Related ADRs: ADR-114 (Protected Installation), ADR-204 (Unified Gateway)
Related Documents:
  - tools/mcp-git-status/server.py
  - tools/mcp-git-status/README.md

Context and Problem Statement

Background

The CODITECT rollout-master repository manages 97 git submodules across categories (core, dev, cloud, labs, products, ventures, etc.). Each submodule represents an independent repository that must be:

Clean: No uncommitted changes
Synced: Not behind remote
Properly linked: .coditect symlink pointing to protected installation
On correct branch: Typically main or master

Manual monitoring of 97 submodules is impractical. Developers need a programmatic way to assess repository health during AI-assisted sessions.

Problem Statement

How do we provide real-time, quantified health assessment of the multi-submodule repository structure to AI agents and developers, enabling proactive identification of drift, staleness, and configuration issues?

Requirements

Must-Have:

List all submodules with category, branch, and status
Individual submodule health scoring (0-100)
Aggregate health summary with distribution metrics
Detection of dirty, stale, and unpushed submodules
MCP protocol compliance

Should-Have:

Symlink verification (.coditect pointing correctly)
Staleness detection (days since last commit)
Health score distribution (excellent/good/fair/poor)

Nice-to-Have:

Automatic fix suggestions
Historical health trends
Integration with CI/CD for gating

Decision Drivers

Operational Need

O1: 97 submodules create significant surface area for drift
O2: Manual git status across all submodules takes 5+ minutes
O3: AI agents need structured health data, not raw git output

Architecture Alignment

A1: Must work within the Protected Installation Directive (ADR-114)
A2: Symlink verification validates .coditect → protected installation
A3: Health scoring enables automated decision-making

Simplicity

S1: Pure Python + stdlib (no external dependencies beyond optional mcp)
S2: No database required — reads git state directly
S3: Stateless — every call reflects current reality

Decision Outcome

CHOSEN: Dedicated MCP Server with Health Scoring Algorithm

Rationale

Quantified health: Numerical scores (0-100) enable automated thresholds
Structured output: JSON responses usable by AI agents and scripts
Symlink awareness: Validates CODITECT-specific configuration
Zero required dependencies: The server uses only the Python stdlib (mcp is optional), so there is no installation friction

Technical Implementation

Architecture

┌─────────────────────────────────────────────────────────────────┐
│                   MCP GIT STATUS SERVER                           │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│  MCP Client (Claude Code)                                        │
│       │                                                          │
│       ▼                                                          │
│  ┌─────────────────────────────────────────────────────────────┐│
│  │ MCP Tools                                                   ││
│  │ • list_submodules     - All submodules with full status      ││
│  │ • get_submodule_status - Single submodule deep-dive         ││
│  │ • get_health_summary  - Aggregate health with distribution  ││
│  │ • get_dirty_submodules - Only uncommitted/untracked         ││
│  │ • get_stale_submodules - Behind remote or old commits       ││
│  └─────────────────────────────────────────────────────────────┘│
│       │                                                          │
│       ▼                                                          │
│  ┌─────────────────────────────────────────────────────────────┐│
│  │ SubmoduleStatus Collector                                   ││
│  │                                                              ││
│  │  For each submodule:                                         ││
│  │  1. git status --porcelain → uncommitted count               ││
│  │  2. git log --oneline @{u}..HEAD → unpushed count            ││
│  │  3. git log --oneline HEAD..@{u} → behind count              ││
│  │  4. git rev-parse --abbrev-ref HEAD → branch                 ││
│  │  5. readlink .coditect → symlink verification                ││
│  │  6. calculate_health_score() → 0-100                         ││
│  └─────────────────────────────────────────────────────────────┘│
│       │                                                          │
│       ▼                                                          │
│  ┌─────────────────────────────────────────────────────────────┐│
│  │ Git CLI (subprocess)                                        ││
│  │                                                              ││
│  │ No database. No external libraries. Pure git + Python.      ││
│  └─────────────────────────────────────────────────────────────┘│
│                                                                  │
└─────────────────────────────────────────────────────────────────┘
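The collector steps in the diagram can be sketched with the standard library alone. This is a minimal illustration, not the code in server.py: the helper names (run_git, count_lines, collect_counts), the 30-second timeout, and the returned dictionary keys are assumptions. Note that `@{u}..HEAD` lists commits present locally but not upstream (unpushed), while `HEAD..@{u}` lists commits present upstream but not locally (behind).

```python
import subprocess
from pathlib import Path


def run_git(repo: Path, *args: str) -> str:
    """Run a git command inside `repo`; return stdout, or "" on any failure."""
    try:
        result = subprocess.run(
            ["git", "-C", str(repo), *args],
            capture_output=True,
            text=True,
            timeout=30,  # assumed limit so one hung repo cannot stall the scan
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        return ""
    return result.stdout if result.returncode == 0 else ""


def count_lines(output: str) -> int:
    """Count non-empty lines, e.g. one per changed file or per commit."""
    return sum(1 for line in output.splitlines() if line.strip())


def collect_counts(repo: Path) -> dict:
    """Gather the raw numbers that feed the health score (hypothetical shape)."""
    branch = run_git(repo, "rev-parse", "--abbrev-ref", "HEAD").strip()
    return {
        "uncommitted": count_lines(run_git(repo, "status", "--porcelain")),
        "unpushed": count_lines(run_git(repo, "log", "--oneline", "@{u}..HEAD")),
        "behind": count_lines(run_git(repo, "log", "--oneline", "HEAD..@{u}")),
        "branch": branch,
        # git prints the literal string "HEAD" when the checkout is detached
        "detached": branch == "HEAD",
    }
```

Returning an empty string on failure keeps the scan going when a submodule has no upstream configured, at the cost of reporting zero for that count.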

Health Scoring Algorithm

Each submodule starts at 100 and receives deductions:

def calculate_health_score(status: SubmoduleStatus) -> int:
    """
    Health score: 100 (perfect) → 0 (critical issues).

    Deductions:
    ─────────────────────────────────────────────
    Uncommitted changes     -5 each, up to -20
    Unpushed commits        -5 each, up to -20
    Behind remote           -10
    Detached HEAD           -15
    Non-main branch         -5
    Missing .coditect       -20
    Broken .coditect link   -20
    Framework inaccessible  -10
    ─────────────────────────────────────────────
    Maximum deduction: -120 (clamped to 0)
    """
    score = 100

    # Uncommitted changes (-5 each, cap -20)
    score -= min(status.uncommitted * 5, 20)

    # Unpushed commits (-5 each, cap -20)
    score -= min(status.unpushed * 5, 20)

    # Behind remote
    if status.behind > 0:
        score -= 10

    # Detached HEAD
    if status.detached:
        score -= 15

    # Non-main branch
    if status.branch not in ('main', 'master'):
        score -= 5

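    # Symlink verification: this excerpt applies a single -20 when the
    # .coditect check fails; it does not distinguish the missing, broken,
    # and framework-inaccessible cases listed in the docstring.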
    if not status.symlink_ok:
        score -= 20

    return max(0, score)
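A worked example makes the arithmetic concrete. The sketch below uses a hypothetical stand-in for SubmoduleStatus (StatusStub) that carries only the six fields the excerpt reads, and a condensed copy of the same deductions; the real class likely has more fields.

```python
from dataclasses import dataclass


@dataclass
class StatusStub:
    """Hypothetical stand-in for SubmoduleStatus with only the scored fields."""
    uncommitted: int = 0
    unpushed: int = 0
    behind: int = 0
    detached: bool = False
    branch: str = "main"
    symlink_ok: bool = True


def score(status: StatusStub) -> int:
    """Apply the same six deductions as the excerpt, in condensed form."""
    total = 100
    total -= min(status.uncommitted * 5, 20)
    total -= min(status.unpushed * 5, 20)
    total -= 10 if status.behind > 0 else 0
    total -= 15 if status.detached else 0
    total -= 5 if status.branch not in ("main", "master") else 0
    total -= 20 if not status.symlink_ok else 0
    return max(0, total)


# 3 uncommitted (-15), 1 unpushed (-5), behind remote (-10), feature branch (-5)
drifted = StatusStub(uncommitted=3, unpushed=1, behind=2, branch="feature/x")
print(score(drifted))  # 65

# Every check in the excerpt fails at once: 20 + 20 + 10 + 15 + 5 + 20 = 90
worst = StatusStub(uncommitted=10, unpushed=10, behind=1, detached=True,
                   branch="HEAD", symlink_ok=False)
print(score(worst))  # 10
```

With only the six checks shown in the excerpt, the lowest reachable score is 10, so the clamp to 0 matters only if the additional deductions listed in the docstring are applied separately.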

Health Distribution

Score Range	Category	Meaning
90-100	Excellent	Clean, synced, properly configured
70-89	Good	Minor issues (few uncommitted changes)
50-69	Fair	Needs attention (behind remote, unpushed)
0-49	Poor	Critical issues (detached HEAD, broken symlinks)
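The table maps directly onto a lookup over the three lower bounds. The following is one possible sketch; the function name categorize and the lowercase labels are assumptions.

```python
from bisect import bisect_right

# Lower bounds of the fair, good, and excellent bands from the table above.
_BOUNDS = [50, 70, 90]
_LABELS = ["poor", "fair", "good", "excellent"]


def categorize(score: int) -> str:
    """Map a 0-100 health score to its distribution category."""
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    # bisect_right counts how many lower bounds the score has reached.
    return _LABELS[bisect_right(_BOUNDS, score)]


print(categorize(65))  # fair
```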

Aggregate Health Summary

from dataclasses import dataclass

@dataclass
class HealthSummary:
    total: int              # Total submodules
    clean: int              # No issues
    dirty: int              # Uncommitted changes
    stale: int              # Behind remote
    unpushed: int           # Unpushed commits
    avg_score: float        # Average health score
    excellent: int          # 90-100
    good: int               # 70-89
    fair: int               # 50-69
    poor: int               # 0-49
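As an illustration of how these fields could be populated, the sketch below aggregates a list of per-submodule dictionaries. The input keys (uncommitted, unpushed, behind, health_score), the definition of "clean" as all three counts being zero, and the rounding to one decimal place are assumptions rather than confirmed behavior of server.py.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class HealthSummary:
    total: int
    clean: int
    dirty: int
    stale: int
    unpushed: int
    avg_score: float
    excellent: int
    good: int
    fair: int
    poor: int


def summarize(statuses: List[Dict]) -> HealthSummary:
    """Aggregate per-submodule status dictionaries into one summary."""
    scores = [s["health_score"] for s in statuses]

    def in_band(low: int, high: int) -> int:
        return sum(1 for value in scores if low <= value <= high)

    return HealthSummary(
        total=len(statuses),
        # Assumed definition: clean means nothing uncommitted, unpushed, or behind.
        clean=sum(1 for s in statuses
                  if not (s["uncommitted"] or s["unpushed"] or s["behind"])),
        dirty=sum(1 for s in statuses if s["uncommitted"] > 0),
        stale=sum(1 for s in statuses if s["behind"] > 0),
        unpushed=sum(1 for s in statuses if s["unpushed"] > 0),
        avg_score=round(sum(scores) / len(scores), 1) if scores else 0.0,
        excellent=in_band(90, 100),
        good=in_band(70, 89),
        fair=in_band(50, 69),
        poor=in_band(0, 49),
    )


example = summarize([
    {"uncommitted": 0, "unpushed": 0, "behind": 0, "health_score": 100},
    {"uncommitted": 3, "unpushed": 0, "behind": 0, "health_score": 85},
    {"uncommitted": 0, "unpushed": 2, "behind": 1, "health_score": 40},
])
print(example.avg_score)  # 75.0
```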

Symlink Verification

The server verifies each submodule's .coditect symlink per ADR-114:

from typing import List, Tuple

def verify_symlink(submodule_path: str) -> Tuple[bool, List[str]]:
    """
    Verify .coditect symlink integrity.

    Checks:
    1. .coditect exists in submodule root
    2. It is a symlink (not a regular file/directory)
    3. Symlink target exists and is accessible
    4. Target contains expected framework files

    Returns (ok, issues) tuple.
    """

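The four checks in the docstring could be implemented along the following lines. This is a sketch under stated assumptions: the function name verify_symlink_sketch, the issue messages, and the expected parameter (with README.md as a placeholder for "expected framework files") are hypothetical, since the ADR does not list which files the real check looks for.

```python
import os
from pathlib import Path
from typing import List, Sequence, Tuple


def verify_symlink_sketch(
    submodule_path: str,
    expected: Sequence[str] = ("README.md",),  # placeholder, not the real list
) -> Tuple[bool, List[str]]:
    """Check that .coditect exists, is a symlink, resolves, and has content."""
    link = Path(submodule_path) / ".coditect"

    # 1. lexists is True even for a dangling symlink, so it tests presence only.
    if not os.path.lexists(link):
        return False, [".coditect is missing"]

    # 2. A regular file or directory named .coditect does not count.
    if not link.is_symlink():
        return False, [".coditect is not a symlink"]

    # 3. exists() follows the link and is False when the target is gone.
    if not link.exists():
        return False, [f".coditect target does not exist: {os.readlink(link)}"]

    # 4. Look for the expected entries inside the resolved target.
    issues = [
        f"target is missing expected entry: {name}"
        for name in expected
        if not (link / name).exists()
    ]
    return (not issues), issues
```

Returning early on the first three checks keeps the issue list focused on the root cause, because the later checks cannot succeed once an earlier one has failed.
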
MCP Tool Definitions

@mcp_tool
def list_submodules() -> Dict:
    """List all submodules with full status information.
    Returns name, path, category, branch, health_score, issues for each."""

@mcp_tool
def get_submodule_status(name: str) -> Dict:
    """Get detailed status for a single submodule.
    Includes uncommitted count, unpushed count, behind count,
    symlink status, last commit info, and health score."""

@mcp_tool
def get_health_summary() -> Dict:
    """Get aggregate health summary across all submodules.
    Returns total, clean, dirty, stale counts and score distribution."""

@mcp_tool
def get_dirty_submodules() -> Dict:
    """Get only submodules with uncommitted changes.
    Useful for pre-commit checks and sync operations."""

@mcp_tool
def get_stale_submodules() -> Dict:
    """Get submodules behind their remote.
    Useful for identifying repos that need pulling."""
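The two filtering tools can be thought of as the same selection applied to different counters. The sketch below assumes each status is a dictionary with name, uncommitted, and behind keys, and that responses use a count/submodules envelope; neither shape is confirmed by this ADR.

```python
from typing import Dict, List


def select_nonzero(statuses: List[Dict], field: str) -> Dict:
    """Return the statuses whose `field` counter is above zero, sorted by name."""
    matches = [s for s in statuses if s.get(field, 0) > 0]
    matches.sort(key=lambda s: s["name"])
    return {"count": len(matches), "submodules": matches}


statuses = [
    {"name": "core/b", "uncommitted": 2, "behind": 0},
    {"name": "core/a", "uncommitted": 1, "behind": 3},
    {"name": "labs/c", "uncommitted": 0, "behind": 0},
]

dirty = select_nonzero(statuses, "uncommitted")  # get_dirty_submodules analogue
stale = select_nonzero(statuses, "behind")       # get_stale_submodules analogue
print(dirty["count"], stale["count"])  # 2 1
```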

Value Proposition

Operational Intelligence

Before this server, understanding repository health required:

# Manual approach: ~5 minutes for 97 submodules
for dir in submodules/*/*; do
  # Subshell keeps the working directory intact even when a git command fails
  (cd "$dir" && git status && git log --oneline HEAD..@{u})
done

After:

# Single MCP call: ~10 seconds
# Returns structured JSON with health scores, issues, and distribution
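For reference, an MCP tool invocation travels as a JSON-RPC 2.0 request with the method tools/call. The sketch below only builds that request body; the helper name build_tool_call is an assumption, and transport setup and the initialization handshake are omitted.

```python
import json
from typing import Any, Dict, Optional


def build_tool_call(
    name: str,
    arguments: Optional[Dict[str, Any]] = None,
    request_id: int = 1,
) -> str:
    """Serialize a JSON-RPC 2.0 request that asks an MCP server to run a tool."""
    payload = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }
    return json.dumps(payload)


request = build_tool_call("get_health_summary")
print(request)
```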

AI Agent Integration

AI agents use health data for:

Pre-sync checks: "Are there dirty submodules that need attention?"
Staleness alerts: "Which repos haven't been updated in 30+ days?"
Configuration validation: "Are all symlinks pointing correctly?"
Workflow decisions: "Is the codebase clean enough for deployment?"
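The "30+ days" question reduces to comparing the last commit timestamp with the current time. The sketch below assumes the timestamp comes from `git log -1 --format=%ct`, which prints the committer date as a Unix epoch; the function names and the default threshold parameter are illustrative.

```python
from datetime import datetime, timezone
from typing import Optional


def days_since_commit(commit_epoch: int, now: Optional[datetime] = None) -> int:
    """Whole days elapsed between a commit's Unix timestamp and `now` (UTC)."""
    now = now or datetime.now(timezone.utc)
    committed = datetime.fromtimestamp(commit_epoch, tz=timezone.utc)
    # Clamp at zero so a clock skewed into the future never yields a negative age.
    return max(0, (now - committed).days)


def is_stale(commit_epoch: int, threshold_days: int = 30,
             now: Optional[datetime] = None) -> bool:
    """True when the last commit is at least `threshold_days` old."""
    return days_since_commit(commit_epoch, now) >= threshold_days


reference = datetime(2026, 2, 15, tzinfo=timezone.utc)
last_commit = int(datetime(2026, 1, 1, tzinfo=timezone.utc).timestamp())
print(days_since_commit(last_commit, reference))  # 45
```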

Symlink Compliance

The .coditect symlink is critical infrastructure (ADR-114). This server is the only automated tool that verifies symlink integrity across all 97 submodules.

How coditect-core Utilizes This Server

Direct MCP Usage

Claude Code invokes these tools during /orient sessions:

get_health_summary provides quick overview at session start
get_dirty_submodules identifies what needs attention
get_stale_submodules flags repos that may need pulling

Integration with Other Tools

Tool/Script	Integration
mcp-unified-gateway (ADR-204)	Routes git status tools through single endpoint
`/submodule-status` command	CLI wrapper around these MCP tools
`/orient` command	Includes health summary in session orientation
`sync-all-submodules.sh`	Uses dirty detection for safety checks

Automated Workflows

Session start: /orient checks health summary for anomalies
Pre-sync: /sync verifies no dirty submodules before pushing
Health dashboards: Trajectory dashboard sources submodule metrics
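A session-start anomaly check could be as small as a few threshold comparisons over the summary. In the sketch below, the function name find_anomalies, the summary dictionary keys, and the 80.0 average-score threshold are all assumptions chosen for illustration.

```python
from typing import Dict, List


def find_anomalies(summary: Dict, min_avg_score: float = 80.0) -> List[str]:
    """Return human-readable findings; an empty list means nothing to flag."""
    findings: List[str] = []
    poor = summary.get("poor", 0)
    dirty = summary.get("dirty", 0)
    avg = summary.get("avg_score", 100.0)

    if poor > 0:
        findings.append(f"{poor} submodule(s) in the poor band (0-49)")
    if dirty > 0:
        findings.append(f"{dirty} submodule(s) with uncommitted changes")
    if avg < min_avg_score:
        findings.append(f"average score {avg:.1f} is below {min_avg_score:.1f}")
    return findings


healthy = {"total": 97, "dirty": 0, "poor": 0, "avg_score": 96.4}
drifting = {"total": 97, "dirty": 1, "poor": 2, "avg_score": 72.5}
print(find_anomalies(healthy))        # []
print(len(find_anomalies(drifting)))  # 3
```

A pre-sync gate could reuse the same function and refuse to push whenever the list is non-empty.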

Consequences

Positive

P1: Quantified health enables automated thresholds and alerts
P2: Symlink verification catches configuration drift early
P3: Zero external Python dependencies; runs wherever Python and the git CLI are available
P4: Structured JSON output usable by AI agents and scripts

Negative

N1: Health scoring is opinionated (deduction weights are hardcoded)
N2: Subprocess calls to git are slower than libgit2 bindings
N3: Not currently monitored by mcp-health-check.py
N4: Staleness detection compares against the local remote-tracking ref (@{u}), so results are only current after a fetch, which requires network access

Dependencies

# Zero external dependencies (pure Python + stdlib)
# Optional: mcp>=0.9.0 for MCP server mode
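One way to keep mcp optional is to probe for the package at startup and fall back when it is absent. The sketch below is an assumption about how that could look: the function names and the "standalone" fallback label are hypothetical and not taken from server.py.

```python
import importlib.util


def mcp_available() -> bool:
    """True when the optional `mcp` package can be imported."""
    return importlib.util.find_spec("mcp") is not None


def select_mode() -> str:
    """Pick MCP server mode when possible, else a hypothetical standalone mode."""
    return "mcp-server" if mcp_available() else "standalone"


print(select_mode())
```

Probing with find_spec avoids importing the package just to test for it, so startup stays cheap when mcp is not installed.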

Root Discovery

The server discovers the rollout-master root by searching upward from CORE_ROOT:

from pathlib import Path
from typing import Optional

def find_rollout_master_root() -> Optional[Path]:
    """Find rollout-master root by looking for submodules/ directory.

    Starts from CORE_ROOT (the coditect-core directory) and
    searches parent directories for a 'submodules/' directory.
    """
    current = Path(CORE_ROOT)
    for _ in range(5):  # Checks CORE_ROOT itself, then up to 4 parent levels
        if (current / "submodules").is_dir():
            return current
        current = current.parent
    return None

Files

File	Purpose
`tools/mcp-git-status/server.py`	MCP server implementation (686 lines)
`tools/mcp-git-status/README.md`	Tool documentation

Validation

MCP server starts and registers 5 tools
list_submodules returns all submodules with status
get_submodule_status returns deep-dive for single submodule
get_health_summary returns aggregate metrics with distribution
get_dirty_submodules filters to only dirty repos
get_stale_submodules filters to repos behind remote
Health scoring produces consistent 0-100 scores
Symlink verification detects missing/broken .coditect links
Pure Python — no external dependencies required

Status: ACCEPTED
Decision Date: 2026-02-02
Implementation Status: COMPLETE
Task ID: H.12.4
Maintainer: CODITECT Core Team
Review Date: 2026-05-02 (quarterly review)
