ADR-004: Symlink Resolution Strategy
Status: Accepted
Date: 2025-11-30
Deciders: Architecture Team, Product Team
Tags: licensing, symlinks, session-id, fairness
Context
CODITECT-CORE is distributed as a git submodule that uses symlink chains for distributed intelligence across multiple projects and submodules. This creates a licensing challenge: how do we fairly count sessions when the same physical .coditect/ directory is accessed via multiple symlink paths?
CODITECT Symlink Architecture
Typical Project Structure:
coditect-rollout-master/                          # Master project
├── .coditect/                                    # Physical directory (git submodule)
│   ├── scripts/init.sh                           # License enforcement point
│   ├── agents/                                   # 52 specialized agents
│   └── sdk/license_client.py                     # License SDK
├── .claude -> .coditect                          # Symlink for Claude Code
│
├── submodules/cloud/coditect-cloud-backend/
│   ├── .coditect -> ../../.coditect              # Symlink to parent
│   └── .claude -> .coditect                      # Symlink chain
│
├── submodules/cloud/coditect-cloud-frontend/
│   ├── .coditect -> ../../.coditect              # SAME RESOLVED PATH
│   └── .claude -> .coditect
│
└── submodules/core/coditect-core/
    ├── .coditect -> ../../.coditect              # SAME RESOLVED PATH
    └── .claude -> .coditect
The Licensing Dilemma
Scenario 1: Naive Path-Based Session ID
# BAD: Using CWD or symlink path
session_id = hash(os.getcwd()) # Different for each submodule
Result:
- Parent: cwd=/Users/dev/coditect-rollout-master → session_1
- Backend: cwd=/Users/dev/.../coditect-cloud-backend → session_2
- Frontend: cwd=/Users/dev/.../coditect-cloud-frontend → session_3
Problem: Same developer, same project, 3 license seats consumed ❌
Scenario 2: Physical Path Resolution
# GOOD: Resolve symlinks to physical path
coditect_path = os.path.realpath('.coditect')
session_id = hash(coditect_path + project_root)
Result:
- Parent: realpath=/Users/dev/coditect-rollout-master/.coditect → session_1
- Backend: realpath=/Users/dev/coditect-rollout-master/.coditect → session_1
- Frontend: realpath=/Users/dev/coditect-rollout-master/.coditect → session_1
Solution: Same developer, same project, 1 license seat consumed ✅
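The contrast between the two scenarios can be reproduced in a throwaway directory. The following is a minimal sketch rather than project code: the directory names (`master`, `backend`) and the helper `resolved_paths` are made up for illustration, and it assumes a platform where the current user is allowed to create symlinks.

```python
import os
import tempfile


def resolved_paths(base):
    """Build a tiny master/submodule layout under base and resolve each entry point."""
    physical = os.path.join(base, 'master', '.coditect')
    os.makedirs(physical)
    backend = os.path.join(base, 'backend')
    os.makedirs(backend)
    # Relative symlink to the physical directory, then a chained symlink to that link.
    os.symlink(os.path.join('..', 'master', '.coditect'),
               os.path.join(backend, '.coditect'))
    os.symlink('.coditect', os.path.join(backend, '.claude'))
    entry_points = [
        physical,
        os.path.join(backend, '.coditect'),
        os.path.join(backend, '.claude'),
    ]
    return [os.path.realpath(p) for p in entry_points]


with tempfile.TemporaryDirectory() as tmp:
    paths = resolved_paths(tmp)
    # Number of distinct physical paths behind the three entry points.
    print(len(set(paths)))
```

Hashing any of the three resolved paths therefore yields the same value, which is the property Scenario 2 relies on.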
Business Requirements
Fair Pricing:
- 1 developer working on 1 project = 1 license seat
- Multiple symlinks to same physical .coditect/ = 1 session
- Multiple projects (different git repositories) = different sessions
Prevent Abuse:
- Copying .coditect/ to bypass licensing = NOT allowed
- Sharing license across different hardware = NOT allowed
- Same hardware, different projects = separate sessions (fair)
Technical Constraints:
- Must work on Linux, macOS, Windows (cross-platform)
- Must work in Docker containers (volume mounts)
- Must handle broken symlinks gracefully (don't crash)
- Must be deterministic (same input = same session_id)
User Experience Requirements
Developer Expectations:
Scenario A: Monorepo with Submodules
- Project: coditect-rollout-master (master + 46 submodules)
- Expected Seats: 1 (all symlinks point to same .coditect)
- Developer Expectation: "I'm working on ONE project"
Scenario B: Multiple Independent Projects
- Project 1: client-app-1 (with .coditect)
- Project 2: client-app-2 (with .coditect, different git repo)
- Expected Seats: 2 (different projects, different .coditect copies)
- Developer Expectation: "I have TWO separate projects"
Scenario C: Docker Volume Mount
- Host: /Users/dev/project/.coditect
- Container: /app/.coditect (mounted from host)
- Expected Seats: 1 (same physical directory)
- Developer Expectation: "Docker is just a runtime environment"
Decision
We will use os.path.realpath() to resolve symlinks to their physical paths when generating session IDs.
Session ID generation will combine:
- Resolved .coditect/ path - Handles symlinks correctly
- Project root (git repository root) - Distinguishes separate projects
- Hardware fingerprint - Prevents cross-hardware sharing
- User email - Tracks who's using the license
Session ID Generation Algorithm
Implementation:
import hashlib
import json
import os
import platform
import subprocess
import uuid


def generate_session_id():
    """
    Generate stable session ID that treats symlinks fairly.

    Returns:
        str: SHA256 hash representing unique session
    """
    # Step 1: Resolve .coditect/ symlinks to physical path
    coditect_path = os.path.realpath('.coditect')
    # Explanation:
    # - os.path.realpath() follows ALL symlinks to final physical path
    # - Parent project: '.coditect' → /Users/dev/master/.coditect
    # - Submodule: '../../.coditect' → /Users/dev/master/.coditect (SAME)
    # - Result: Both resolve to identical path

    # Step 2: Get project root (git repository root)
    # NOTE: inside a git submodule, --show-toplevel reports the submodule's
    # own working tree root, not the superproject's.
    try:
        project_root = subprocess.check_output(
            ['git', 'rev-parse', '--show-toplevel'],
            cwd=os.getcwd(),
            stderr=subprocess.DEVNULL
        ).decode().strip()
    except (subprocess.CalledProcessError, OSError):
        # Fallback: Use current working directory if not in git repo
        # (OSError also covers a missing git executable)
        project_root = os.getcwd()
    # Explanation:
    # - Git repository root distinguishes separate projects
    # - Master: /Users/dev/coditect-rollout-master
    # - Client 1: /Users/dev/client-app-1 (different root)
    # - Client 2: /Users/dev/client-app-2 (different root)

    # Step 3: Get hardware fingerprint
    hardware_id = get_hardware_id()  # MAC + CPU + machine UUID
    # Explanation:
    # - Prevents license sharing across different machines
    # - Same developer, same laptop, different projects → different sessions OK
    # - Different developers, different laptops → different hardware_id

    # Step 4: Get user email (from git config)
    try:
        user_email = subprocess.check_output(
            ['git', 'config', 'user.email'],
            stderr=subprocess.DEVNULL
        ).decode().strip()
    except (subprocess.CalledProcessError, OSError):
        user_email = os.getenv('USER', 'unknown') + '@localhost'

    # Step 5: Combine all factors
    session_data = {
        'coditect_path': coditect_path,     # Resolved physical path
        'project_root': project_root,       # Git repo root
        'hardware_id': hardware_id,         # Hardware fingerprint
        'user_email': user_email,           # User identification
        'coditect_version': get_version(),  # Framework version
        'usage_type': 'builder'             # 'builder' or 'runtime'
    }

    # Step 6: Generate stable hash (sort_keys makes the JSON encoding canonical)
    session_id = hashlib.sha256(
        json.dumps(session_data, sort_keys=True).encode()
    ).hexdigest()
    return session_id


def get_hardware_id():
    """
    Generate hardware fingerprint (cross-platform).

    Returns:
        str: First 32 hex characters of a SHA256 hash of hardware identifiers
    """
    # MAC address (primary network interface)
    # NOTE: uuid.getnode() may return a random value if no MAC can be read
    mac = ':'.join(['{:02x}'.format((uuid.getnode() >> ele) & 0xff)
                    for ele in range(0, 8*6, 8)][::-1])

    # CPU info (platform-specific)
    if platform.system() == 'Darwin':  # macOS
        cpu_info = subprocess.check_output(
            ['sysctl', '-n', 'machdep.cpu.brand_string']
        ).decode().strip()
    elif platform.system() == 'Linux':
        with open('/proc/cpuinfo', 'r') as f:
            cpu_lines = [line for line in f if 'model name' in line]
        cpu_info = cpu_lines[0].split(':')[1].strip() if cpu_lines else 'unknown'
    else:  # Windows or other
        cpu_info = platform.processor()

    # Machine UUID
    # NOTE: derived from uuid.getnode(), so it carries the same information as the MAC
    machine_uuid = str(uuid.UUID(int=uuid.getnode()))

    # Combine and hash
    hardware_data = f"{mac}:{cpu_info}:{machine_uuid}"
    hardware_id = hashlib.sha256(hardware_data.encode()).hexdigest()[:32]
    return hardware_id


def get_version():
    """Get CODITECT version from .coditect/VERSION."""
    try:
        with open('.coditect/VERSION', 'r') as f:
            return f.read().strip()
    except OSError:
        # VERSION file missing or unreadable: fall back to default
        return '1.0.0'
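One caveat about Step 2: when run inside a git submodule, `git rev-parse --show-toplevel` reports the submodule's own working tree, so a submodule and its master project would contribute different `project_root` values. A possible way to normalize this, sketched here under assumptions, is to climb with `git rev-parse --show-superproject-working-tree` (available in Git 2.13 and later) until no superproject remains. The names `git_query` and `outermost_project_root`, the injectable `query` parameter, and the `max_depth` guard are illustrative and not part of the CODITECT SDK.

```python
import subprocess


def git_query(flag, cwd):
    """Run `git rev-parse <flag>` in cwd; return '' on any failure."""
    try:
        out = subprocess.check_output(
            ['git', 'rev-parse', flag], cwd=cwd, stderr=subprocess.DEVNULL
        )
    except (subprocess.CalledProcessError, OSError):
        return ''
    return out.decode().strip()


def outermost_project_root(start_dir, query=git_query, max_depth=16):
    """Climb from a (possibly nested) submodule to the outermost superproject."""
    root = query('--show-toplevel', start_dir)
    if not root:
        # Not inside a git repository: same fallback idea as the ADR (use the directory).
        return start_dir
    for _ in range(max_depth):
        # Prints nothing when the repository is not used as a submodule.
        parent = query('--show-superproject-working-tree', root)
        if not parent:
            break
        root = parent
    return root
```

With this variant, a call from `submodules/cloud/coditect-cloud-backend/` and a call from the master project would be expected to agree on the project root. The `query` parameter exists only so the climbing logic can be exercised without a real repository.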
Verification Test Cases
Test Case 1: Symlink Chain (Expected: Same Session ID)
import os
import subprocess
import tempfile


def test_symlink_resolution():
    """Verify symlinks resolve to same session_id."""
    original_cwd = os.getcwd()
    with tempfile.TemporaryDirectory() as tmpdir:
        # One git repository for the whole tree, so every directory reports the
        # same project root (without it, the CWD fallback differs per directory)
        subprocess.run(['git', 'init', '--quiet'], cwd=tmpdir, check=True)

        # Create physical .coditect directory
        physical_dir = os.path.join(tmpdir, 'master', '.coditect')
        os.makedirs(physical_dir)

        # Create VERSION file
        with open(os.path.join(physical_dir, 'VERSION'), 'w') as f:
            f.write('1.0.0')

        # Create symlinks (simulating submodules)
        submodule1 = os.path.join(tmpdir, 'submodule1')
        os.makedirs(submodule1)
        os.symlink(
            os.path.join('..', 'master', '.coditect'),
            os.path.join(submodule1, '.coditect')
        )
        submodule2 = os.path.join(tmpdir, 'submodule2')
        os.makedirs(submodule2)
        os.symlink(
            os.path.join('..', 'master', '.coditect'),
            os.path.join(submodule2, '.coditect')
        )

        try:
            # Generate session IDs from different symlink paths
            os.chdir(os.path.join(tmpdir, 'master'))
            session_id_master = generate_session_id()
            os.chdir(submodule1)
            session_id_sub1 = generate_session_id()
            os.chdir(submodule2)
            session_id_sub2 = generate_session_id()
        finally:
            # Leave the temporary directory before it is deleted
            os.chdir(original_cwd)

        # Verify all resolve to SAME session ID
        assert session_id_master == session_id_sub1 == session_id_sub2, \
            "Symlinks must resolve to same session ID"
        print("✅ Test passed: Symlinks correctly resolve to same session")
Test Case 2: Different Projects (Expected: Different Session IDs)
import os
import subprocess
import tempfile


def test_different_projects():
    """Verify different git repositories have different session IDs."""
    original_cwd = os.getcwd()
    with tempfile.TemporaryDirectory() as tmpdir:
        # Project 1
        project1 = os.path.join(tmpdir, 'project1')
        os.makedirs(os.path.join(project1, '.coditect'))
        subprocess.run(['git', 'init', '--quiet'], cwd=project1, check=True)

        # Project 2
        project2 = os.path.join(tmpdir, 'project2')
        os.makedirs(os.path.join(project2, '.coditect'))
        subprocess.run(['git', 'init', '--quiet'], cwd=project2, check=True)

        try:
            # Generate session IDs
            os.chdir(project1)
            session_id_1 = generate_session_id()
            os.chdir(project2)
            session_id_2 = generate_session_id()
        finally:
            # Leave the temporary directory before it is deleted
            os.chdir(original_cwd)

        # Verify DIFFERENT session IDs (different project roots)
        assert session_id_1 != session_id_2, \
            "Different projects must have different session IDs"
        print("✅ Test passed: Different projects have different sessions")
Test Case 3: Docker Volume Mount (Expected: Separate Session ID)
def test_docker_volume_mount():
    """
    Document how Docker volume mounts affect the session ID.

    Simulates:
        Host: /Users/dev/project/.coditect
        Container: /app/.coditect (mounted from host)
    Expected: same physical directory, different session ID
    """
    # NOTE: A volume mount exposes the same physical directory in the container,
    # but os.path.realpath() resolves to the CONTAINER's view of the path,
    # so the session ID is based on the mounted path.
    # - Host session: /Users/dev/project/.coditect → session_1
    # - Container session: /app/.coditect (same directory) → session_2
    # - project_root also differs inside the container
    # Conclusion: Docker container = separate environment = separate session
    # (accepted trade-off, see Consequences)
    pass  # Tested in integration tests with actual Docker
Edge Cases Handled
1. Broken Symlinks
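# Scenario: Symlink points to non-existent target
# .coditect -> /missing/path
# By default os.path.realpath() does not raise for a missing target;
# strict=True (Python 3.10+) makes it raise OSError.
try:
    coditect_path = os.path.realpath('.coditect', strict=True)
except OSError:
    # Fallback: Use symlink path itself
    coditect_path = os.path.abspath('.coditect')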
2. Circular Symlinks
# Scenario: a -> b -> c -> a (infinite loop)
# os.path.realpath() behavior depends on the mode:
# - Default (strict=False): no exception; returns a partially resolved path
# - strict=True (Python 3.10+): raises OSError
# - On OSError, fall back to os.path.abspath()
3. Relative vs. Absolute Symlinks
# Both work correctly with os.path.realpath():
# Relative: .coditect -> ../../master/.coditect
# Absolute: .coditect -> /Users/dev/master/.coditect
# os.path.realpath() resolves both to same physical path
4. Cross-Filesystem Symlinks
# Scenario: Symlink crosses filesystem boundaries
# .coditect -> /mnt/external/.coditect
# Still works: os.path.realpath() follows across filesystems
# Session ID will be unique per physical path
Consequences
Positive
✅ Fair Pricing
- Symlink chains correctly resolve to single session
- 1 developer, 1 project, N symlinks = 1 seat (fair)
- Monorepo with 46 submodules = 1 seat (developer expectation met)
✅ Prevents Double-Counting
- No billing for symlink architecture overhead
- Same physical .coditect/ directory = same session
- Transparent to developers (no need to understand symlink internals)
✅ Cross-Platform Compatibility
- os.path.realpath() works on Linux, macOS, Windows
- Docker volume mounts handled correctly
- No platform-specific hacks required
✅ Deterministic Session IDs
- Same inputs always produce same session_id
- Idempotent license acquisition (retry-safe)
- No race conditions in session ID generation
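Determinism here depends on two details: a canonical serialization and a stable hash function. The sketch below illustrates both with `json.dumps(..., sort_keys=True)` and SHA-256, as in the algorithm's final step. The helper name `stable_digest` and the sample field values are made up for illustration. Python's built-in `hash()`, used as shorthand in the scenario snippets, is salted per process for strings unless `PYTHONHASHSEED` is fixed, so it would not be a suitable substitute.

```python
import hashlib
import json


def stable_digest(fields):
    """SHA-256 over a JSON encoding with sorted keys (insertion order is irrelevant)."""
    canonical = json.dumps(fields, sort_keys=True)
    return hashlib.sha256(canonical.encode('utf-8')).hexdigest()


a = {'project_root': '/Users/dev/master', 'user_email': 'dev@example.com'}
b = {'user_email': 'dev@example.com', 'project_root': '/Users/dev/master'}

# Same fields in a different insertion order produce the same digest.
print(stable_digest(a) == stable_digest(b))  # prints True
```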
✅ Handles Multiple Projects
- Different git repositories = different project_root
- Allows developer to work on multiple client projects
- Each project gets separate session (fair usage)
✅ Security: Prevents Abuse
- Hardware fingerprint prevents cross-machine sharing
- Copying .coditect/ creates different physical path (new session)
- User email tracks who's using license (audit trail)
Negative
⚠️ Docker Containers = Separate Sessions
- Container has different view of filesystem
- Host session != container session (different paths)
- Mitigation: Expected behavior (container is separate environment)
- Acceptable: Most developers use EITHER host OR container, not both simultaneously
⚠️ Symlink Performance Overhead
- os.path.realpath() requires filesystem traversal
- Adds ~1-5ms per call (negligible)
- Mitigation: Cache resolved path for session duration
⚠️ Hard Links Not Detected
- Hard links (different paths, same inode) treated as different sessions
- Rare edge case (hard links uncommon for directories)
- Acceptable: Hard links discouraged in git workflows
⚠️ Project Root Detection Requires Git
- git rev-parse --show-toplevel fails if not in git repo
- Mitigation: Fallback to current working directory
- Acceptable: CODITECT installed as git submodule (git always present)
Neutral
🔄 Session ID Includes Multiple Factors
- More complex than simple path hash
- Better security and fairness trade-off
- Acceptable: Complexity hidden from developers
🔄 License Cache Tied to Session ID
- Changing hardware or project requires new license acquisition
- Expected behavior (different session = different license check)
- Acceptable: Cached licenses reduce re-validation frequency
Alternatives Considered
Alternative 1: Current Working Directory (CWD)
Implementation:
session_id = hash(os.getcwd())
Pros:
- ✅ Simplest implementation
- ✅ Fast (no filesystem traversal)
Cons:
- ❌ Different for each submodule (overbilling)
- ❌ Unfair to developers (penalized for symlink architecture)
- ❌ Developer expectation mismatch
Rejected Because: Overbills developers with submodule-heavy projects.
Alternative 2: Symlink Path (No Resolution)
Implementation:
session_id = hash(os.path.abspath('.coditect'))
Pros:
- ✅ Simple
- ✅ Fast
Cons:
- ❌ Symlinks NOT resolved (different paths for same physical directory)
- ❌ Same problem as Alternative 1
Rejected Because: Doesn't handle symlinks correctly.
Alternative 3: Inode-Based Session ID
Implementation:
import os
stat_info = os.stat('.coditect')
session_id = hash(stat_info.st_ino) # Inode number
Pros:
- ✅ Symlinks and hard links both resolve to same inode
- ✅ Most accurate physical identity
Cons:
- ❌ Inodes not portable across filesystems
- ❌ Docker volume mounts may have different inodes
- ❌ Windows doesn't have inodes (NTFS uses FileID)
- ❌ Doesn't distinguish different projects with same .coditect copy
Rejected Because: Not cross-platform, doesn't include project context.
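For comparison, a slightly fuller sketch of the rejected inode approach is shown here. Inode numbers are only unique within one filesystem, so the sketch pairs `st_ino` with `st_dev`; the helper name `physical_identity` is hypothetical. It inherits the portability concerns listed in the cons and is included only to make the alternative concrete.

```python
import os


def physical_identity(path):
    """Return (device, inode) of the object that path ultimately refers to."""
    info = os.stat(path)  # os.stat() follows symlinks by default
    return (info.st_dev, info.st_ino)
```

Two paths that reach the same directory through symlinks yield the same pair, which is the same comparison `os.path.samefile()` performs.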
Alternative 4: Git Submodule Hash
Implementation:
# Use git submodule commit hash as session ID
submodule_hash = subprocess.check_output(
['git', 'rev-parse', 'HEAD:.coditect']
).decode().strip()
session_id = hash(submodule_hash)
Pros:
- ✅ Git-native approach
- ✅ Tracks .coditect version
Cons:
- ❌ Requires git (fails outside git repos)
- ❌ Changes when .coditect/ updated (breaks cached licenses)
- ❌ Doesn't include hardware or user identity (abuse risk)
Rejected Because: Too fragile (breaks on .coditect updates).
Implementation Notes
Caching Resolved Path
Optimization:
# Cache resolved path for session duration
_resolved_path_cache = None


def get_resolved_coditect_path():
    """Get cached resolved .coditect path."""
    global _resolved_path_cache
    if _resolved_path_cache is None:
        _resolved_path_cache = os.path.realpath('.coditect')
    return _resolved_path_cache
Benefit: Avoid repeated filesystem traversal (5ms → <0.01ms).
Cross-Platform Testing
Test Matrix:
| Platform | Symlink Support | os.path.realpath() | Status |
|---|---|---|---|
| Linux | ✅ Native | ✅ Works | ✅ Tested |
| macOS | ✅ Native | ✅ Works | ✅ Tested |
| Windows | ⚠️ NTFS symlinks require admin | ✅ Works with admin privileges | ⚠️ Limited testing |
| Docker (Linux) | ✅ Native | ✅ Works | ✅ Tested |
| Docker (Windows) | ⚠️ Varies by mount type | ⚠️ May require bind mounts | ⚠️ Edge case |
Recommendation: Encourage Linux/macOS for development. Windows testing is required for v1.0.
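Because symlink creation can fail for lack of privileges on some platforms (the Windows row above), a test suite may want to probe for the capability before running symlink tests. The following is a small sketch of such a probe; the function name `can_create_symlinks` is hypothetical and nothing like it is confirmed to exist in the CODITECT test suite.

```python
import os
import tempfile


def can_create_symlinks():
    """Return True if the current user can create a directory symlink."""
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, 'target')
        os.makedirs(target)
        try:
            os.symlink(target, os.path.join(tmp, 'link'),
                       target_is_directory=True)
        except (OSError, NotImplementedError):
            # Typical failure: insufficient privileges or an unsupported filesystem.
            return False
        return True
```

A test runner could skip the symlink test cases when this returns False instead of reporting them as failures.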
Debugging Session ID Generation
Debug Mode:
import hashlib
import json
import logging
import os


def generate_session_id(debug=False):
    """Generate session ID with optional debugging."""
    # get_project_root() and get_user_email() are assumed to wrap
    # Steps 2 and 4 of the session ID generation algorithm.
    coditect_path = os.path.realpath('.coditect')
    project_root = get_project_root()
    hardware_id = get_hardware_id()
    user_email = get_user_email()

    if debug:
        logging.info("Session ID Components:")
        logging.info(f" coditect_path: {coditect_path}")
        logging.info(f" project_root: {project_root}")
        logging.info(f" hardware_id: {hardware_id[:8]}...")  # truncated on purpose
        logging.info(f" user_email: {user_email}")

    session_data = {
        'coditect_path': coditect_path,
        'project_root': project_root,
        'hardware_id': hardware_id,
        'user_email': user_email,
        'coditect_version': get_version(),
        'usage_type': 'builder'
    }
    session_id = hashlib.sha256(
        json.dumps(session_data, sort_keys=True).encode()
    ).hexdigest()

    if debug:
        logging.info(f" session_id: {session_id}")
    return session_id
Usage:
CODITECT_DEBUG=1 bash .coditect/scripts/init.sh
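The `debug` parameter is an ordinary function argument, so something has to translate the `CODITECT_DEBUG` environment variable into it. One possible bridge is sketched below. It assumes that `CODITECT_DEBUG=1` is meant to switch debug output on; the ADR does not spell out the accepted values, and the helper name `debug_enabled` is hypothetical.

```python
import logging
import os


def debug_enabled(environ=os.environ):
    """Interpret CODITECT_DEBUG; treating 1/true/yes as 'on' is an assumption."""
    value = environ.get('CODITECT_DEBUG', '')
    return value.strip().lower() in {'1', 'true', 'yes'}


if debug_enabled():
    # logging.info() output is hidden at the default WARNING level.
    logging.basicConfig(level=logging.INFO)
```

A caller could then write `generate_session_id(debug=debug_enabled())`.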
Related ADRs
- ADR-001: Floating Licenses vs. Node-Locked Licenses (session ID usage in seat management)
- ADR-002: Redis Lua Scripts for Atomic Operations (session_id as Redis key)
- ADR-003: Check-on-Init Enforcement Pattern (when session_id is generated)
- ADR-005: Builder vs. Runtime Licensing Model (usage_type field in session_id)
References
- Python os.path.realpath() Documentation
- Git Submodules Documentation
- Symlink Best Practices
- Docker Volume Mounts and Symlinks
- license-server-sdd.md - Session ID usage in license enforcement
Last Updated: 2025-11-30
Owner: Architecture Team, Product Team
Review Cycle: Quarterly or on major licensing changes