ADR-205: MCP Git Status — Submodule Health Monitoring

Document: ADR-205-mcp-git-status-submodule-health
Version: 1.0.0
Purpose: Document architectural decision for submodule health monitoring MCP server
Audience: Framework contributors, DevOps engineers
Date Created: 2026-02-02
Last Updated: 2026-02-15
Status: ACCEPTED
Task ID: H.12.4
Related ADRs: ADR-114 (Protected Installation), ADR-204 (Unified Gateway)
Related Documents:
- tools/mcp-git-status/server.py
- tools/mcp-git-status/README.md

Context and Problem Statement

Background

The CODITECT rollout-master repository manages 97 git submodules across categories (core, dev, cloud, labs, products, ventures, etc.). Each submodule represents an independent repository that must be:

  1. Clean: No uncommitted changes
  2. Synced: Not behind remote
  3. Properly linked: .coditect symlink pointing to protected installation
  4. On correct branch: Typically main or master

Manual monitoring of 97 submodules is impractical. Developers need a programmatic way to assess repository health during AI-assisted sessions.

Problem Statement

How do we provide real-time, quantified health assessment of the multi-submodule repository structure to AI agents and developers, enabling proactive identification of drift, staleness, and configuration issues?

Requirements

Must-Have:

  • List all submodules with category, branch, and status
  • Individual submodule health scoring (0-100)
  • Aggregate health summary with distribution metrics
  • Detection of dirty, stale, and unpushed submodules
  • MCP protocol compliance

Should-Have:

  • Symlink verification (.coditect pointing correctly)
  • Staleness detection (days since last commit)
  • Health score distribution (excellent/good/fair/poor)

Nice-to-Have:

  • Automatic fix suggestions
  • Historical health trends
  • Integration with CI/CD for gating

Decision Drivers

Operational Need

  • O1: 97 submodules creates significant surface area for drift
  • O2: Manual git status across all submodules takes 5+ minutes
  • O3: AI agents need structured health data, not raw git output

Architecture Alignment

  • A1: Must work within the Protected Installation Directive (ADR-114)
  • A2: Symlink verification validates .coditect → protected installation
  • A3: Health scoring enables automated decision-making

Simplicity

  • S1: Pure Python + stdlib (no external dependencies beyond optional mcp)
  • S2: No database required — reads git state directly
  • S3: Stateless — every call reflects current reality

Decision Outcome

CHOSEN: Dedicated MCP Server with Health Scoring Algorithm

Rationale

  1. Quantified health: Numerical scores (0-100) enable automated thresholds
  2. Structured output: JSON responses usable by AI agents and scripts
  3. Symlink awareness: Validates CODITECT-specific configuration
  4. Zero dependencies: Pure stdlib means no installation friction

Technical Implementation

Architecture

┌─────────────────────────────────────────────────────────────────┐
│                      MCP GIT STATUS SERVER                      │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  MCP Client (Claude Code)                                       │
│         │                                                       │
│         ▼                                                       │
│  ┌─────────────────────────────────────────────────────────────┐│
│  │  MCP Tools                                                  ││
│  │  • list_submodules      - All submodules with full status   ││
│  │  • get_submodule_status - Single submodule deep-dive        ││
│  │  • get_health_summary   - Aggregate health with distribution││
│  │  • get_dirty_submodules - Only uncommitted/untracked        ││
│  │  • get_stale_submodules - Behind remote or old commits      ││
│  └─────────────────────────────────────────────────────────────┘│
│         │                                                       │
│         ▼                                                       │
│  ┌─────────────────────────────────────────────────────────────┐│
│  │  SubmoduleStatus Collector                                  ││
│  │                                                             ││
│  │  For each submodule:                                        ││
│  │  1. git status --porcelain          → uncommitted count     ││
│  │  2. git log --oneline @{u}..HEAD    → unpushed count        ││
│  │  3. git log --oneline HEAD..@{u}    → behind count          ││
│  │  4. git rev-parse --abbrev-ref HEAD → branch                ││
│  │  5. readlink .coditect              → symlink verification  ││
│  │  6. calculate_health_score()        → 0-100                 ││
│  └─────────────────────────────────────────────────────────────┘│
│         │                                                       │
│         ▼                                                       │
│  ┌─────────────────────────────────────────────────────────────┐│
│  │  Git CLI (subprocess)                                       ││
│  │                                                             ││
│  │  No database. No external libraries. Pure git + Python.     ││
│  └─────────────────────────────────────────────────────────────┘│
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
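
The collector steps in the diagram could be sketched roughly as follows. This is an illustration under stated assumptions, not the contents of server.py: `GitCounts`, `run_git`, `count_lines`, `collect_counts`, and the injectable `run` parameter are hypothetical names, and the sketch assumes every submodule has an upstream configured. In git's two-dot range syntax, `@{u}..HEAD` lists local commits that the upstream lacks (unpushed), while `HEAD..@{u}` lists upstream commits that the local branch lacks (behind).

```python
import subprocess
from dataclasses import dataclass
from typing import Callable, List


def run_git(repo: str, args: List[str]) -> str:
    """Run a git command inside `repo`; return stdout, or '' if git fails."""
    result = subprocess.run(
        ["git", "-C", repo, *args],
        capture_output=True, text=True, check=False,
    )
    return result.stdout if result.returncode == 0 else ""


@dataclass
class GitCounts:  # hypothetical container, not the server's SubmoduleStatus
    uncommitted: int
    unpushed: int
    behind: int
    branch: str


def count_lines(output: str) -> int:
    """Count non-empty lines of command output."""
    return sum(1 for line in output.splitlines() if line.strip())


def collect_counts(
    repo: str,
    run: Callable[[str, List[str]], str] = run_git,  # injectable for testing
) -> GitCounts:
    return GitCounts(
        # One porcelain line per changed or untracked path.
        uncommitted=count_lines(run(repo, ["status", "--porcelain"])),
        # Local commits missing from the upstream (ahead / unpushed).
        unpushed=count_lines(run(repo, ["log", "--oneline", "@{u}..HEAD"])),
        # Upstream commits missing locally (behind).
        behind=count_lines(run(repo, ["log", "--oneline", "HEAD..@{u}"])),
        branch=run(repo, ["rev-parse", "--abbrev-ref", "HEAD"]).strip(),
    )
```

Passing the command runner as a parameter keeps the parsing logic testable without a real repository, which is one possible way to honor the stateless, stdlib-only constraints listed under Simplicity.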

Health Scoring Algorithm

Each submodule starts at 100 and receives deductions:

def calculate_health_score(status: SubmoduleStatus) -> int:
    """
    Health score: 100 (perfect) → 0 (critical issues).

    Deductions:
    ─────────────────────────────────────────────
    Uncommitted changes      -5 each, up to -20
    Unpushed commits         -5 each, up to -20
    Behind remote            -10
    Detached HEAD            -15
    Non-main branch          -5
    Missing .coditect        -20
    Broken .coditect link    -20
    Framework inaccessible   -10
    ─────────────────────────────────────────────
    Maximum deduction: -120 (clamped to 0)
    """
    score = 100

    # Uncommitted changes (-5 each, cap -20)
    score -= min(status.uncommitted * 5, 20)

    # Unpushed commits (-5 each, cap -20)
    score -= min(status.unpushed * 5, 20)

    # Behind remote
    if status.behind > 0:
        score -= 10

    # Detached HEAD
    if status.detached:
        score -= 15

    # Non-main branch
    if status.branch not in ('main', 'master'):
        score -= 5

    # Symlink verification
    if not status.symlink_ok:
        score -= 20

    return max(0, score)
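
A short worked example may make the deductions concrete. The sketch below repeats the arithmetic of the function above against a hypothetical `Status` dataclass; the field names mirror the attributes the function reads, but the class itself and the sample values are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Status:  # hypothetical stand-in for SubmoduleStatus
    uncommitted: int = 0
    unpushed: int = 0
    behind: int = 0
    detached: bool = False
    branch: str = "main"
    symlink_ok: bool = True


def score(status: Status) -> int:
    """Apply the same deductions as calculate_health_score above."""
    total = 100
    total -= min(status.uncommitted * 5, 20)
    total -= min(status.unpushed * 5, 20)
    if status.behind > 0:
        total -= 10
    if status.detached:
        total -= 15
    if status.branch not in ("main", "master"):
        total -= 5
    if not status.symlink_ok:
        total -= 20
    return max(0, total)


clean = Status()
drifting = Status(uncommitted=3, unpushed=1, behind=2, branch="feature/x")
many_changes = Status(uncommitted=50)

print(score(clean))         # 100
print(score(drifting))      # 100 - 15 - 5 - 10 - 5 = 65
print(score(many_changes))  # 80: the uncommitted deduction is capped at 20
```

Note that with only the six deductions applied in the function body, the lowest reachable score is 10 (100 - 20 - 20 - 10 - 15 - 5 - 20). The docstring's -120 total counts the three symlink-related deductions separately, which the excerpt folds into a single `symlink_ok` check.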

Health Distribution

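| Score Range | Category | Meaning |
|-------------|-----------|--------------------------------------------------|
| 90-100 | Excellent | Clean, synced, properly configured |
| 70-89 | Good | Minor issues (few uncommitted changes) |
| 50-69 | Fair | Needs attention (behind remote, unpushed) |
| 0-49 | Poor | Critical issues (detached HEAD, broken symlinks) |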
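
The score ranges above map to categories with a simple threshold check. The helper name `categorize` is hypothetical; only the boundaries come from the table.

```python
def categorize(score: int) -> str:
    """Map a 0-100 health score to its distribution category."""
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    if score >= 90:
        return "excellent"
    if score >= 70:
        return "good"
    if score >= 50:
        return "fair"
    return "poor"


for sample in (100, 89, 50, 49):
    print(sample, categorize(sample))
```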

Aggregate Health Summary

from dataclasses import dataclass


@dataclass
class HealthSummary:
    total: int        # Total submodules
    clean: int        # No issues
    dirty: int        # Uncommitted changes
    stale: int        # Behind remote
    unpushed: int     # Unpushed commits
    avg_score: float  # Average health score
    excellent: int    # 90-100
    good: int         # 70-89
    fair: int         # 50-69
    poor: int         # 0-49
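
One way such a summary could be aggregated from per-submodule results is sketched below. The `Row` record and `summarize` function are hypothetical, and treating "clean" as "no uncommitted, unpushed, or behind commits" is an assumption; the server may define it differently.

```python
from dataclasses import dataclass
from typing import Iterable, NamedTuple


class Row(NamedTuple):  # hypothetical per-submodule record
    score: int
    uncommitted: int
    unpushed: int
    behind: int


@dataclass
class HealthSummary:
    total: int
    clean: int
    dirty: int
    stale: int
    unpushed: int
    avg_score: float
    excellent: int
    good: int
    fair: int
    poor: int


def summarize(rows: Iterable[Row]) -> HealthSummary:
    rows = list(rows)
    total = len(rows)
    return HealthSummary(
        total=total,
        # Assumption: "clean" means nothing uncommitted, unpushed, or behind.
        clean=sum(1 for r in rows if not (r.uncommitted or r.unpushed or r.behind)),
        dirty=sum(1 for r in rows if r.uncommitted > 0),
        stale=sum(1 for r in rows if r.behind > 0),
        unpushed=sum(1 for r in rows if r.unpushed > 0),
        # Guard against division by zero when there are no submodules.
        avg_score=round(sum(r.score for r in rows) / total, 1) if total else 0.0,
        excellent=sum(1 for r in rows if r.score >= 90),
        good=sum(1 for r in rows if 70 <= r.score < 90),
        fair=sum(1 for r in rows if 50 <= r.score < 70),
        poor=sum(1 for r in rows if r.score < 50),
    )


rows = [Row(100, 0, 0, 0), Row(85, 3, 0, 0), Row(65, 3, 1, 2), Row(40, 4, 4, 1)]
print(summarize(rows))
```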

Symlink Verification

The server verifies each submodule's .coditect symlink per ADR-114:

from typing import List, Tuple


def verify_symlink(submodule_path: str) -> Tuple[bool, List[str]]:
    """
    Verify .coditect symlink integrity.

    Checks:
    1. .coditect exists in submodule root
    2. It is a symlink (not a regular file/directory)
    3. Symlink target exists and is accessible
    4. Target contains expected framework files

    Returns (ok, issues) tuple.
    """
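
The four checks in the docstring could be implemented along these lines. This is a sketch, not the server's code: the function name `verify_symlink_sketch` and the `expected_entries` parameter are hypothetical, since the source does not say which framework files the server looks for.

```python
import os
from pathlib import Path
from typing import List, Sequence, Tuple


def verify_symlink_sketch(
    submodule_path: str,
    expected_entries: Sequence[str] = (),  # hypothetical: names expected in the target
) -> Tuple[bool, List[str]]:
    link = Path(submodule_path) / ".coditect"

    # Check 1: lexists() is True even for a dangling symlink.
    if not os.path.lexists(link):
        return False, [".coditect is missing"]

    # Check 2: must be a symlink, not a regular file or directory.
    if not link.is_symlink():
        return False, [".coditect is not a symlink"]

    # Check 3: exists() follows the link, so a dangling link fails here.
    if not link.exists():
        return False, [f".coditect target does not exist: {os.readlink(link)}"]

    # Check 4: the target contains the expected entries.
    issues: List[str] = []
    for name in expected_entries:
        if not (link / name).exists():
            issues.append(f"target is missing expected entry: {name}")
    return (not issues), issues
```

Checking `lexists` before `is_symlink` and `exists` lets the sketch distinguish a missing link from a broken one, which matches the separate "Missing .coditect" and "Broken .coditect link" deductions in the scoring docstring.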

MCP Tool Definitions

@mcp_tool
def list_submodules() -> Dict:
    """List all submodules with full status information.
    Returns name, path, category, branch, health_score, issues for each."""

@mcp_tool
def get_submodule_status(name: str) -> Dict:
    """Get detailed status for a single submodule.
    Includes uncommitted count, unpushed count, behind count,
    symlink status, last commit info, and health score."""

@mcp_tool
def get_health_summary() -> Dict:
    """Get aggregate health summary across all submodules.
    Returns total, clean, dirty, stale counts and score distribution."""

@mcp_tool
def get_dirty_submodules() -> Dict:
    """Get only submodules with uncommitted changes.
    Useful for pre-commit checks and sync operations."""

@mcp_tool
def get_stale_submodules() -> Dict:
    """Get submodules behind their remote.
    Useful for identifying repos that need pulling."""
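
To illustrate what a filtered tool response might look like, the sketch below builds a JSON-serializable result for the dirty-submodule case. The response keys (`count`, `submodules`), the helper name `dirty_response`, and the sample submodule names are all hypothetical; the actual response schema is defined in server.py.

```python
import json
from typing import Dict, List


def dirty_response(statuses: List[Dict]) -> Dict:
    """Keep only submodules with uncommitted changes, sorted by name."""
    dirty = [s for s in statuses if s["uncommitted"] > 0]
    return {
        "count": len(dirty),
        "submodules": sorted(dirty, key=lambda s: s["name"]),
    }


# Hypothetical sample data; real entries would come from the collector.
statuses = [
    {"name": "example-b", "category": "dev", "uncommitted": 2, "health_score": 90},
    {"name": "example-a", "category": "core", "uncommitted": 0, "health_score": 100},
    {"name": "example-c", "category": "cloud", "uncommitted": 5, "health_score": 80},
]

print(json.dumps(dirty_response(statuses), indent=2))
```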

Value Proposition

Operational Intelligence

Before this server, understanding repository health required:

# Manual approach: ~5 minutes for 97 submodules
# The subshell keeps the working directory unchanged even if a command fails
for dir in submodules/*/*; do
    (cd "$dir" && git status && git log --oneline HEAD..@{u})
done

After:

# Single MCP call: ~10 seconds
# Returns structured JSON with health scores, issues, and distribution

AI Agent Integration

AI agents use health data for:

  • Pre-sync checks: "Are there dirty submodules that need attention?"
  • Staleness alerts: "Which repos haven't been updated in 30+ days?"
  • Configuration validation: "Are all symlinks pointing correctly?"
  • Workflow decisions: "Is the codebase clean enough for deployment?"
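
A workflow decision such as "clean enough for deployment" could be expressed as a threshold check over the health summary. In this sketch the function name `deployment_gate` and the default thresholds are hypothetical; only the summary field names (`dirty`, `poor`, `avg_score`) come from the HealthSummary definition.

```python
from typing import Dict, List, Tuple


def deployment_gate(
    summary: Dict,
    min_avg_score: float = 90.0,  # hypothetical threshold
    max_poor: int = 0,            # hypothetical threshold
) -> Tuple[bool, List[str]]:
    """Return (ok, reasons); ok is True only when no rule is violated."""
    reasons: List[str] = []
    if summary["dirty"] > 0:
        reasons.append(f"{summary['dirty']} submodule(s) have uncommitted changes")
    if summary["poor"] > max_poor:
        reasons.append(f"{summary['poor']} submodule(s) score below 50")
    if summary["avg_score"] < min_avg_score:
        reasons.append(f"average score {summary['avg_score']} is below {min_avg_score}")
    return (not reasons), reasons


ok, reasons = deployment_gate({"dirty": 2, "poor": 1, "avg_score": 72.5})
print(ok)
for reason in reasons:
    print("-", reason)
```

Returning the reasons alongside the boolean gives an agent something concrete to report or act on, rather than a bare pass/fail.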

The .coditect symlink is critical infrastructure (ADR-114). This server is the only automated tool that verifies symlink integrity across all 97 submodules.


How coditect-core Utilizes This Server

Direct MCP Usage

Claude Code invokes these tools during /orient sessions:

  • get_health_summary provides quick overview at session start
  • get_dirty_submodules identifies what needs attention
  • get_stale_submodules flags repos that may need pulling

Integration with Other Tools

| Tool/Script | Integration |
|-------------------------------|-------------------------------------------------|
| mcp-unified-gateway (ADR-204) | Routes git status tools through single endpoint |
| /submodule-status command | CLI wrapper around these MCP tools |
| /orient command | Includes health summary in session orientation |
| sync-all-submodules.sh | Uses dirty detection for safety checks |

Automated Workflows

  • Session start: /orient checks health summary for anomalies
  • Pre-sync: /sync verifies no dirty submodules before pushing
  • Health dashboards: Trajectory dashboard sources submodule metrics

Consequences

Positive

  • P1: Quantified health enables automated thresholds and alerts
  • P2: Symlink verification catches configuration drift early
  • P3: Zero external dependencies — works immediately on any system
  • P4: Structured JSON output usable by AI agents and scripts

Negative

  • N1: Health scoring is opinionated (deduction weights are hardcoded)
  • N2: Subprocess calls to git are slower than libgit2 bindings
  • N3: Not currently monitored by mcp-health-check.py
  • N4: Staleness detection compares against @{u}, which is only as current as the last git fetch, so accurate results require network access

Dependencies

# Zero external dependencies (pure Python + stdlib)
# Optional: mcp>=0.9.0 for MCP server mode

Root Discovery

The server discovers the rollout-master root by searching upward from CORE_ROOT:

from pathlib import Path
from typing import Optional

# CORE_ROOT is assumed to be defined elsewhere (the coditect-core directory).


def find_rollout_master_root() -> Optional[Path]:
    """Find rollout-master root by looking for submodules/ directory.

    Starts from CORE_ROOT (the coditect-core directory) and
    searches parent directories for a 'submodules/' directory.
    """
    current = Path(CORE_ROOT)
    for _ in range(5):  # Check CORE_ROOT itself and up to 4 ancestors
        if (current / "submodules").is_dir():
            return current
        current = current.parent
    return None

Files

| File | Purpose |
|--------------------------------|---------------------------------------|
| tools/mcp-git-status/server.py | MCP server implementation (686 lines) |
| tools/mcp-git-status/README.md | Tool documentation |

Validation

  • MCP server starts and registers 5 tools
  • list_submodules returns all submodules with status
  • get_submodule_status returns deep-dive for single submodule
  • get_health_summary returns aggregate metrics with distribution
  • get_dirty_submodules filters to only dirty repos
  • get_stale_submodules filters to repos behind remote
  • Health scoring produces consistent 0-100 scores
  • Symlink verification detects missing/broken .coditect links
  • Pure Python — no external dependencies required

Status: ACCEPTED
Decision Date: 2026-02-02
Implementation Status: COMPLETE
Task ID: H.12.4
Maintainer: CODITECT Core Team
Review Date: 2026-05-02 (quarterly review)