File-Monitor Purpose and Integration

What is the File-Monitor?

The file-monitor is a production-grade Rust library for real-time file system monitoring. On top of native change notifications it provides rate limiting, debouncing, checksums, and observability.

Core Purpose

Track every file change in the workspace and provide a detailed audit trail that answers six questions:

The 5 W's + H

WHO changed it?
- User ID
- Process name
- Agent ID (for automated changes)
WHAT changed?
- File path
- Event type (create, modify, delete, rename, move)
- File size
- SHA-256 checksum (for integrity verification)
WHEN did it change?
- UTC timestamp
- ISO 8601 format
WHERE did it happen?
- Absolute file path
- Workspace context
WHY did it change?
- Linked to user action (commit message)
- Linked to agent task (task description)
- Parent event tracking
HOW did it change?
- Before/after states (via diffs)
- Modification type (content, metadata, permissions)
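
One way to make the six questions concrete is a record type with one group of fields per question. The sketch below is illustrative only: the camelCase field names, the grouping, and the `unansweredQuestions` helper are assumptions for this example, not the library's actual event schema.

```typescript
// Hypothetical record shape: one group of fields per question.
type FileEventType = 'create' | 'modify' | 'delete' | 'rename' | 'move';
type ModificationType = 'content' | 'metadata' | 'permissions';

interface AuditAnswers {
  who: { userId: string | null; processName: string | null; agentId: string | null };
  what: { filePath: string; eventType: FileEventType; fileSize: number; checksum: string | null };
  when: { timestampUtc: string }; // ISO 8601, UTC
  where: { absolutePath: string; workspace: string };
  why: { reason: string | null; parentEventId: string | null };
  how: { modificationType: ModificationType | null; diff: string | null };
}

// Lists the questions a record cannot answer yet, so gaps are visible
// before the record is written to the audit log.
function unansweredQuestions(a: AuditAnswers): string[] {
  const missing: string[] = [];
  if (a.who.userId === null && a.who.agentId === null) missing.push('who');
  if (a.why.reason === null) missing.push('why');
  if (a.what.eventType === 'modify' && a.how.diff === null) missing.push('how');
  return missing;
}
```

A raw file system event typically fills in WHAT, WHEN, and WHERE; the WHY and HOW groups stay empty until the event is enriched with task context and a diff.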

Integration with AZ1.AI IDE

Architecture Context

According to ADR-022 (Audit Logging Architecture), the file-monitor is one component of a comprehensive audit system:

┌────────────────────────────────────────────────────────────────┐
│                      Application Layer                         │
│  (Theia IDE, Agent System, User Actions)                      │
└────────────────────┬───────────────────────────────────────────┘
                     │
                     ▼
┌────────────────────────────────────────────────────────────────┐
│                  AuditLogService (TypeScript)                  │
│  • logFileChange()                                             │
│  • logUserAction()                                             │
│  • logEvent()                                                  │
└────────────────────┬───────────────────────────────────────────┘
                     │
         ┌───────────┼───────────┐
         │           │           │
         ▼           ▼           ▼
┌────────────┐  ┌────────────┐  ┌────────────┐
│    Diff    │  │Foundation  │  │   Cloud    │
│ Generator  │  │    DB      │  │  Logging   │
└────────────┘  └────────────┘  └────────────┘
                     ▲
                     │
┌────────────────────┴───────────────────────────────────────────┐
│           File Change Monitors (File-Monitor Layer)            │
│                                                                │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐        │
│  │ file-monitor │  │   chokidar   │  │    OPFS      │        │
│  │   (Rust)     │  │ (TypeScript) │  │   Watcher    │        │
│  │              │  │              │  │ (Browser)    │        │
│  │ • inotify    │  │ • Fallback   │  │ • OPFS API   │        │
│  │ • FSEvents   │  │ • Cross-FS   │  │ • IndexedDB  │        │
│  │ • Checksums  │  │              │  │              │        │
│  └──────────────┘  └──────────────┘  └──────────────┘        │
└────────────────────────────────────────────────────────────────┘

Why Rust Instead of Pure TypeScript?

The file-monitor is written in Rust (not TypeScript/Node.js) because:

- Performance: Native OS APIs (inotify, FSEvents, ReadDirectoryChangesW) via the notify crate
- Resource Efficiency: Bounded memory usage, no garbage collection pauses
- Checksums: Streaming SHA-256 calculation without loading entire files into memory
- Reliability: Compile-time type and memory safety, with explicit error handling instead of unchecked exceptions
- Rate Limiting: Semaphore-based backpressure to prevent event floods
- Observability: Built-in Prometheus metrics for monitoring

Integration Strategy

Option 1: IPC (Recommended for MVP)

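// TypeScript (Node.js) spawns the Rust binary as a subprocess
import { spawn } from 'child_process';
import { createInterface } from 'readline';

const monitor = spawn('./file-monitor', ['/workspace'], {
  stdio: ['pipe', 'pipe', 'pipe']
});

// stdout chunks are not aligned to event boundaries, so read line by line.
// Assumes the binary prints one JSON event per line.
createInterface({ input: monitor.stdout }).on('line', (line) => {
  try {
    auditLogService.logFileChange(JSON.parse(line));
  } catch (err) {
    console.error('Skipping malformed event line:', err);
  }
});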
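
Parsing each stdout chunk directly is fragile, because a chunk can end in the middle of an event or contain several events. The sketch below shows the framing logic in isolation. It assumes the monitor writes newline-delimited JSON, which this document does not confirm; `LineFramer` and `parseEvents` are hypothetical names.

```typescript
// Splits a character stream into complete lines, keeping any trailing
// partial line buffered until the next chunk arrives.
class LineFramer {
  private buffer = '';

  // Returns the complete, non-empty lines contained in buffer + chunk.
  push(chunk: string): string[] {
    this.buffer += chunk;
    const parts = this.buffer.split('\n');
    // The last element is '' (chunk ended on a newline) or a partial line.
    this.buffer = parts.pop() ?? '';
    return parts.filter((line) => line.trim().length > 0);
  }
}

// Parses each complete line as JSON; malformed lines are reported, not thrown,
// so one bad line does not stop the event stream.
function parseEvents(
  framer: LineFramer,
  chunk: string,
  onError: (line: string) => void = () => {},
): unknown[] {
  const events: unknown[] = [];
  for (const line of framer.push(chunk)) {
    try {
      events.push(JSON.parse(line));
    } catch {
      onError(line);
    }
  }
  return events;
}
```

When feeding this from a subprocess, calling `setEncoding('utf8')` on the stdout stream avoids splitting multi-byte characters across chunks.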

Option 2: FFI (Better Performance)

// Load file-monitor as a Node.js native module (built with napi-rs)
import { FileMonitor } from '@az1ai/file-monitor';

const monitor = new FileMonitor('/workspace', {
  recursive: true,
  debounce: 500,
  checksums: true
});

monitor.on('event', (event) => {
  auditLogService.logFileChange(event);
});

Option 3: WebAssembly (Browser-Only)

- Compile to WASM for browser environments
- Limited file system access (requires OPFS or the File System Access API)
- Best suited to watching OPFS inside the browser

Use Cases in AZ1.AI

1. Agent File Operations Tracking

Scenario: Code Generation Agent creates new files

// Agent creates file
await fs.writeFile('/workspace/src/newFeature.ts', code);

// File-monitor detects change
// Event emitted:
{
  event_type: 'Created',
  file_path: '/workspace/src/newFeature.ts',
  user_id: null,
  process_name: 'node',
  checksum: 'sha256:abc123...',
  file_size: 2048,
  timestamp_utc: '2025-10-06T12:34:56Z'
}

// AuditLogService records:
{
  event_id: 'uuid-123',
  event_type: 'FILE_CREATE',
  actor_type: 'agent',
  actor_id: 'code-generation-agent',
  resource_path: '/workspace/src/newFeature.ts',
  reason: 'Implementing user story #42: Add login feature',
  metadata: {
    agent_task_id: 'task-456',
    llm_model: 'qwen/qwq-32b',
    checksum: 'sha256:abc123...'
  }
}
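
The step from a monitor event to an audit record can be sketched as a small mapping function. This is a hedged illustration: `toAuditRecord`, `ActorContext`, and the `Deleted` mapping are assumptions, and only the `Created` and `Modified` mappings are taken from the examples in this document.

```typescript
interface MonitorEvent {
  event_type: string; // 'Created' and 'Modified' appear in this document
  file_path: string;
  checksum?: string;
}

// Context the monitor cannot know by itself: who acted and why.
interface ActorContext {
  actor_type: 'agent' | 'user';
  actor_id: string;
  reason: string;
  metadata?: Record<string, string>;
}

interface AuditRecord {
  event_id: string;
  event_type: string;
  actor_type: 'agent' | 'user';
  actor_id: string;
  resource_path: string;
  reason: string;
  metadata: Record<string, string>;
}

const AUDIT_EVENT_TYPES: Record<string, string> = {
  Created: 'FILE_CREATE',
  Modified: 'FILE_MODIFY',
  Deleted: 'FILE_DELETE', // assumption: not shown in this document
};

function toAuditRecord(evt: MonitorEvent, ctx: ActorContext, eventId: string): AuditRecord {
  const auditType = AUDIT_EVENT_TYPES[evt.event_type];
  if (auditType === undefined) {
    throw new Error(`Unmapped monitor event type: ${evt.event_type}`);
  }
  // Copy caller metadata, then attach the checksum reported by the monitor.
  const metadata: Record<string, string> = { ...ctx.metadata };
  if (evt.checksum !== undefined) metadata['checksum'] = evt.checksum;
  return {
    event_id: eventId,
    event_type: auditType,
    actor_type: ctx.actor_type,
    actor_id: ctx.actor_id,
    resource_path: evt.file_path,
    reason: ctx.reason,
    metadata,
  };
}
```

Failing loudly on an unmapped event type keeps unknown events from being silently recorded under the wrong category.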

2. User File Edits

Scenario: User manually edits a file in the Theia editor

// File-monitor detects modification
{
  event_type: 'Modified',
  modification_type: 'Content',
  file_path: '/workspace/src/auth.ts',
  user_id: 'user-789',
  checksum: 'sha256:def456...',
  timestamp_utc: '2025-10-06T12:35:01Z'
}

// AuditLogService generates diff and records
{
  event_type: 'FILE_MODIFY',
  actor_type: 'user',
  actor_id: 'user-789',
  before_state: { checksum: 'sha256:abc123...' },
  after_state: { checksum: 'sha256:def456...' },
  diff: '@@ -10,7 +10,7 @@\n-const secret = "old"\n+const secret = process.env.SECRET\n',
  reason: 'Fixed hardcoded secret vulnerability'
}

3. Multi-Agent Collaboration

Scenario: Code Review Agent detects a security issue in a file created by the Code Generation Agent

// Code Gen Agent creates file (event 1)
// File-monitor emits create event

// Code Review Agent reviews file (triggered by event 1)
// Finds security issue

// Code Review Agent modifies file (event 2)
// File-monitor emits modify event

// AuditLogService links events
{
  event_id: 'event-2',
  parent_event_id: 'event-1',  // Links to creation
  event_type: 'FILE_MODIFY',
  actor_type: 'agent',
  actor_id: 'code-review-agent',
  reason: 'Security fix: Removed hardcoded API key',
  metadata: {
    triggered_by: 'code-generation-agent',
    security_issue: 'CWE-798: Hardcoded Credentials'
  }
}
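
Event linking can be approximated by remembering the most recent event per file. The sketch below is one possible heuristic, not the documented AuditLogService behavior: `EventLinker` is a hypothetical class, and treating "previous event on the same path" as the parent is an assumption.

```typescript
interface LinkedEvent {
  event_id: string;
  parent_event_id: string | null;
  resource_path: string;
  actor_id: string;
  triggered_by: string | null; // actor of the parent event, if any
}

class EventLinker {
  private lastByPath = new Map<string, LinkedEvent>();
  private byId = new Map<string, LinkedEvent>();

  // Records an event and links it to the previous event on the same file.
  link(eventId: string, resourcePath: string, actorId: string): LinkedEvent {
    if (this.byId.has(eventId)) throw new Error(`Duplicate event id: ${eventId}`);
    const previous = this.lastByPath.get(resourcePath) ?? null;
    const linked: LinkedEvent = {
      event_id: eventId,
      parent_event_id: previous ? previous.event_id : null,
      resource_path: resourcePath,
      actor_id: actorId,
      triggered_by: previous ? previous.actor_id : null,
    };
    this.lastByPath.set(resourcePath, linked);
    this.byId.set(eventId, linked);
    return linked;
  }

  // Walks parent links back to the first recorded event; returns ids oldest first.
  chain(eventId: string): string[] {
    const ids: string[] = [];
    let current = this.byId.get(eventId);
    while (current !== undefined) {
      ids.unshift(current.event_id);
      current = current.parent_event_id === null ? undefined : this.byId.get(current.parent_event_id);
    }
    return ids;
  }
}
```

A production system would more likely carry the parent id explicitly from the agent task that reacted to the first event; the per-path lookup is only a fallback when that context is missing.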

4. Compliance & Audit Trail

Scenario: GDPR audit requires proof of data handling

// Query audit logs for all file operations on user data
const auditTrail = await auditLogService.query({
  resource_path: '/workspace/src/users/*.ts',
  start_time: '2025-01-01',
  end_time: '2025-12-31'
});

// Returns complete chain of custody:
// - When: Every timestamp
// - Who: Every user/agent that touched the files
// - What: Every change (with diffs)
// - Why: Every reason (commit messages, task descriptions)
// - How: Every checksum verifying integrity
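
To show what such a query has to do, here is a minimal in-memory filter. It is a sketch under stated assumptions: `filterAuditTrail` and `globToRegExp` are hypothetical helpers, `*` is assumed to match within a single path segment, and both time bounds are treated as inclusive instants.

```typescript
interface StoredAuditRecord {
  resource_path: string;
  timestamp_utc: string; // ISO 8601
  actor_id: string;
}

interface AuditQuery {
  resource_path: string; // glob; '*' matches within one path segment (assumed)
  start_time: string;
  end_time: string;
}

// Escapes regex metacharacters, then turns '*' into "any characters except '/'".
function globToRegExp(glob: string): RegExp {
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, '\\$&').replace(/\*/g, '[^/]*');
  return new RegExp(`^${escaped}$`);
}

function filterAuditTrail(records: StoredAuditRecord[], query: AuditQuery): StoredAuditRecord[] {
  const pathPattern = globToRegExp(query.resource_path);
  const start = Date.parse(query.start_time);
  const end = Date.parse(query.end_time);
  return records.filter((r) => {
    const t = Date.parse(r.timestamp_utc);
    return pathPattern.test(r.resource_path) && t >= start && t <= end;
  });
}
```

Note that a date-only bound such as '2025-12-31' parses as midnight UTC, so events later on that day fall outside the range; a full timestamp for `end_time` avoids the surprise.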

Key Features for AZ1.AI

1. Rate Limiting

Prevents event floods when an agent makes bulk changes:

- Semaphore-based concurrency control
- Configurable max concurrent tasks
- Backpressure handling (drop vs. wait)
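
The two backpressure policies differ only in what happens when no permit is free. The following is a minimal TypeScript sketch of the idea, not the Rust implementation; `EventSemaphore` is a hypothetical name.

```typescript
class EventSemaphore {
  private available: number;
  private waiters: Array<() => void> = [];

  constructor(maxConcurrent: number) {
    this.available = maxConcurrent;
  }

  // "Drop" policy: take a permit if one is free, otherwise fail immediately
  // so the caller can discard the event and count it as dropped.
  tryAcquire(): boolean {
    if (this.available > 0) {
      this.available -= 1;
      return true;
    }
    return false;
  }

  // "Wait" policy: resolve as soon as a permit becomes free.
  acquire(): Promise<void> {
    if (this.tryAcquire()) return Promise.resolve();
    return new Promise<void>((resolve) => {
      this.waiters.push(resolve);
    });
  }

  // Returns a permit, handing it directly to the oldest waiter if there is one.
  release(): void {
    const next = this.waiters.shift();
    if (next) {
      next();
    } else {
      this.available += 1;
    }
  }
}
```

Dropping keeps latency bounded at the cost of lost events, while waiting preserves every event but lets the queue grow; for an audit trail, waiting with a bounded buffer is usually the safer default.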

2. Debouncing

Reduces duplicate events when an editor auto-saves:

- Time-window deduplication (500ms default)
- 70-90% event reduction
- Configurable per-file or globally
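
Time-window deduplication can be sketched as a map from event key to the time it was last forwarded. This is an assumption-laden illustration: `DedupWindow` is hypothetical, and the choice to forward the first event in a window (rather than the last) may differ from what the library does.

```typescript
class DedupWindow {
  private readonly windowMs: number;
  private lastEmitted = new Map<string, number>();

  constructor(windowMs: number = 500) {
    this.windowMs = windowMs;
  }

  // Returns true if the event should be forwarded, false if an event with the
  // same type and path was already forwarded within the window.
  shouldEmit(filePath: string, eventType: string, nowMs: number): boolean {
    const key = `${eventType}:${filePath}`;
    const last = this.lastEmitted.get(key);
    if (last !== undefined && nowMs - last < this.windowMs) {
      return false;
    }
    // Only forwarded events reset the window, so a file under sustained
    // writes still produces one event per window instead of none.
    this.lastEmitted.set(key, nowMs);
    return true;
  }
}
```

Passing the clock in as `nowMs` keeps the logic deterministic and easy to test; a caller would normally supply `Date.now()`.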

3. Checksums

Verify file integrity and detect tampering:

- Streaming SHA-256 (constant memory)
- Configurable size limit (100MB default)
- Used for deduplication and audit trails
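
For comparison on the TypeScript side, the same streaming idea can be expressed with Node's built-in `crypto` module. This is a sketch only: `checksumChunks` is a hypothetical helper, and the `sha256:` prefix follows the sample events in this document.

```typescript
import { createHash } from 'node:crypto';

// Hashes content chunk by chunk, so memory use does not grow with file size.
// Returns null when the declared size exceeds the limit (checksum skipped).
function checksumChunks(
  chunks: Iterable<string | Uint8Array>,
  sizeBytes: number,
  limitBytes: number = 100 * 1024 * 1024, // 100MB default, as listed above
): string | null {
  if (sizeBytes > limitBytes) return null;
  const hash = createHash('sha256');
  for (const chunk of chunks) {
    hash.update(chunk); // incremental update: only one chunk is held at a time
  }
  return `sha256:${hash.digest('hex')}`;
}
```

Checking the size before hashing means large binaries cost one metadata lookup instead of a full read.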

4. Observability

Monitor system health and performance:

- Prometheus metrics (events/sec, latency, dropped events)
- Structured tracing (debug, info, warn, error)
- Health checks for alerting
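
A health check can be derived from the exported metrics by comparing dropped events to total events. The sketch below parses the Prometheus text exposition format in its simplest form. The metric names are hypothetical, and the parser assumes samples without labels or trailing timestamps.

```typescript
// Parses "name value" sample lines; comment lines starting with '#' are skipped.
function parseMetrics(text: string): Map<string, number> {
  const values = new Map<string, number>();
  for (const rawLine of text.split('\n')) {
    const line = rawLine.trim();
    if (line === '' || line.startsWith('#')) continue;
    const idx = line.lastIndexOf(' ');
    if (idx < 0) continue;
    const value = Number(line.slice(idx + 1));
    if (!Number.isNaN(value)) values.set(line.slice(0, idx), value);
  }
  return values;
}

// Fraction of events that were dropped; 0 when nothing has been processed yet.
function dropRatio(metrics: Map<string, number>, totalName: string, droppedName: string): number {
  const total = metrics.get(totalName) ?? 0;
  const dropped = metrics.get(droppedName) ?? 0;
  return total === 0 ? 0 : dropped / total;
}

function isHealthy(ratio: number, maxDropRatio: number): boolean {
  return ratio <= maxDropRatio;
}
```

An alerting rule would then fire when `isHealthy` returns false for several consecutive scrapes rather than on a single spike.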

5. Graceful Shutdown

Ensure no events are lost during restarts:

- Coordinated lifecycle with drain timeout
- Complete in-flight events before shutdown
- Signal handling (SIGTERM, SIGINT)
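
The drain step can be sketched as a counter of in-flight events plus a timeout. This is an illustration of the pattern for a TypeScript wrapper, not the library's shutdown code; `DrainTracker` and its method names are assumptions.

```typescript
type DrainResult = 'drained' | 'timeout';

class DrainTracker {
  private inFlight = 0;
  private shuttingDown = false;
  private onDrained: (() => void) | null = null;

  // Call before handling an event. Returns false once shutdown has begun,
  // so no new work is accepted while draining.
  begin(): boolean {
    if (this.shuttingDown) return false;
    this.inFlight += 1;
    return true;
  }

  // Call when an event has been fully handled.
  end(): void {
    this.inFlight -= 1;
    if (this.inFlight === 0 && this.onDrained) this.onDrained();
  }

  pending(): number {
    return this.inFlight;
  }

  // Resolves 'drained' when in-flight work completes, or 'timeout' if it
  // has not completed after drainTimeoutMs.
  shutdown(drainTimeoutMs: number): Promise<DrainResult> {
    this.shuttingDown = true;
    if (this.inFlight === 0) return Promise.resolve<DrainResult>('drained');
    return new Promise<DrainResult>((resolve) => {
      const timer = setTimeout(() => resolve('timeout'), drainTimeoutMs);
      this.onDrained = () => {
        clearTimeout(timer);
        resolve('drained');
      };
    });
  }
}
```

In a Node.js wrapper, `shutdown` would typically be called from `process.on('SIGTERM', ...)` and `process.on('SIGINT', ...)` handlers before the process exits.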

Performance Characteristics

From production benchmarks (docs/file-monitor/production.md):

| Scenario | Events/sec | CPU | Memory | Latency (p99) |
|---|---|---|---|---|
| Idle monitoring | 0 | <1% | 5 MB | N/A |
| Light (10 evt/s) | 10 | 2% | 10 MB | 2 ms |
| Moderate (100 evt/s) | 100 | 8% | 25 MB | 5 ms |
| Heavy (1000 evt/s) | 1000 | 25% | 60 MB | 15 ms |
| With checksums (100 evt/s) | 100 | 15% | 30 MB | 25 ms |

Key Insight: The monitor handles 1000 events/sec (enough for bulk agent operations) at a p99 latency of 15 ms, well under 100 ms.

Configuration for AZ1.AI

Recommended Settings

let config = MonitorConfig::new("/workspace")
    .recursive(true)                          // Watch all subdirectories
    .debounce(500)                           // 500ms debounce window
    .concurrency(100, 1000)                  // Max 100 concurrent, buffer 1000
    .with_checksums(Some(50 * 1024 * 1024)) // 50MB checksum limit
    .ignore_patterns(vec![
        "*.tmp".to_string(),
        "*.swp".to_string(),
        ".git/*".to_string(),
        "node_modules/*".to_string(),
        "__pycache__/*".to_string(),
        ".DS_Store".to_string(),
        "*.log".to_string(),
    ]);

Why These Settings?

- recursive: true - Monitor the entire workspace, including agent-created subdirectories
- debounce: 500ms - Balance responsiveness with deduplication (editor auto-save ~300ms)
- concurrency: 100/1000 - Handle agent bulk operations (create 50 files at once)
- checksums: 50MB limit - Calculate for source files, skip for large binaries/videos
- ignore_patterns - Skip temp files, version control, dependencies
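
To make the effect of the ignore patterns concrete, the sketch below applies them to workspace-relative paths. The matching semantics are assumed, since this document does not specify them: `*` matches any run of characters, and a pattern applies if it matches the whole path or any trailing run of path segments. `shouldIgnore` is a hypothetical helper.

```typescript
// Escapes regex metacharacters, then turns '*' into "any run of characters".
function patternToRegExp(pattern: string): RegExp {
  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, '\\$&').replace(/\*/g, '.*');
  return new RegExp(`^${escaped}$`);
}

// Tests every pattern against the path and each of its segment-aligned suffixes,
// so 'node_modules/*' also catches a nested 'packages/a/node_modules/...'.
function shouldIgnore(relativePath: string, patterns: string[]): boolean {
  const segments = relativePath.split('/');
  const suffixes = segments.map((_, i) => segments.slice(i).join('/'));
  return patterns.some((pattern) => {
    const re = patternToRegExp(pattern);
    return suffixes.some((suffix) => re.test(suffix));
  });
}

const RECOMMENDED_IGNORES = [
  '*.tmp', '*.swp', '.git/*', 'node_modules/*', '__pycache__/*', '.DS_Store', '*.log',
];
```

Running candidate paths through a check like this before deployment is a cheap way to confirm that source files are watched and generated files are not.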

Next Steps for Integration

Phase 1: IPC Integration (MVP)

- ✅ Build Rust binary (DONE)
- Create TypeScript wrapper service
- Spawn file-monitor as subprocess
- Parse JSON events from stdout
- Forward to AuditLogService

Phase 2: Event Enrichment

- Link file events to user sessions
- Link to agent tasks
- Generate diffs (before/after states)
- Store in FoundationDB

Phase 3: FFI Module (Performance)

- Create napi-rs bindings
- Compile as Node.js native module
- Direct function calls (no IPC overhead)
- Shared memory for high throughput

Phase 4: Cloud Sync

- Replicate audit logs to cloud storage
- Enable cross-device audit trails
- Compliance reporting dashboards

Summary

File-Monitor provides the foundation for comprehensive audit logging in AZ1.AI:

✅ Tracks WHO (user/agent attribution)
✅ Tracks WHAT (file paths, event types, checksums)
✅ Tracks WHEN (timestamps)
✅ Tracks WHERE (workspace paths)
⏳ Needs WHY (integration with task system)
⏳ Needs HOW (diff generation)

The Rust implementation provides production-grade reliability and performance that TypeScript alone cannot match, especially for high-frequency file operations from multi-agent systems.

Related Documents:

- ADR-022: Audit Logging Architecture
- ADR-023: File Change Tracking
- ADR-018: Local Filesystem Integration
- docs/file-monitor/README.md - Usage guide
- docs/file-monitor/production.md - Deployment guide
