File-Monitor Purpose and Integration

What is the File-Monitor?

The file-monitor is a Rust library for real-time file system monitoring, with production features such as rate limiting, debouncing, checksums, and observability.

Core Purpose

Track every file change in the workspace and record an audit trail that answers six questions:

The 5 W's + H

  1. WHO changed it?

    • User ID
    • Process name
    • Agent ID (for automated changes)
  2. WHAT changed?

    • File path
    • Event type (create, modify, delete, rename, move)
    • File size
    • SHA-256 checksum (for integrity verification)
  3. WHEN did it change?

    • UTC timestamp
    • ISO 8601 format
  4. WHERE did it happen?

    • Absolute file path
    • Workspace context
  5. WHY did it change?

    • Linked to user action (commit message)
    • Linked to agent task (task description)
    • Parent event tracking
  6. HOW did it change?

    • Before/after states (via diffs)
    • Modification type (content, metadata, permissions)
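The six questions map naturally onto a single record type. The following is a minimal sketch, assuming field names modeled on the example events later in this document; `FileAuditEvent` and `describeEvent` are hypothetical names for illustration and are not part of the file-monitor API.

```typescript
// Hypothetical record type covering the six audit questions.
// The grouping mirrors the list above; the type itself is an assumption.
type EventType = "create" | "modify" | "delete" | "rename" | "move";
type ModificationType = "content" | "metadata" | "permissions";

interface FileAuditEvent {
  // WHO
  userId: string | null;
  processName: string | null;
  agentId: string | null;
  // WHAT
  filePath: string; // absolute path
  eventType: EventType;
  fileSize: number | null;
  checksum: string | null; // e.g. "sha256:<hex>"
  // WHEN (UTC, ISO 8601)
  timestampUtc: string;
  // WHERE
  workspace: string;
  // WHY
  reason: string | null;
  parentEventId: string | null;
  // HOW
  modificationType: ModificationType | null;
  diff: string | null;
}

// Render a one-line summary: actor, action, path, time.
// Agent attribution wins over user, which wins over the bare process name.
function describeEvent(e: FileAuditEvent): string {
  const actor = e.agentId ?? e.userId ?? e.processName ?? "unknown";
  return `${actor} ${e.eventType} ${e.filePath} at ${e.timestampUtc}`;
}
```

Nullable fields reflect that the monitor alone cannot always answer WHO, WHY, or HOW; those are filled in later by the audit layer.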

Integration with AZ1.AI IDE

Architecture Context

Per ADR-022 (Audit Logging Architecture), the file-monitor is one layer of a larger audit system:

┌────────────────────────────────────────────────────────────────┐
│                       Application Layer                        │
│            (Theia IDE, Agent System, User Actions)             │
└────────────────────┬───────────────────────────────────────────┘
                     │
                     ▼
┌────────────────────────────────────────────────────────────────┐
│                  AuditLogService (TypeScript)                  │
│  • logFileChange()                                             │
│  • logUserAction()                                             │
│  • logEvent()                                                  │
└────────────────────┬───────────────────────────────────────────┘
                     │
      ┌──────────────┼──────────────┐
      │              │              │
      ▼              ▼              ▼
┌────────────┐ ┌────────────┐ ┌────────────┐
│    Diff    │ │ Foundation │ │   Cloud    │
│ Generator  │ │     DB     │ │  Logging   │
└────────────┘ └────────────┘ └────────────┘
                     │
                     │
┌────────────────────┴───────────────────────────────────────────┐
│           File Change Monitors (File-Monitor Layer)            │
│                                                                │
│      ┌──────────────┐  ┌──────────────┐  ┌──────────────┐      │
│      │ file-monitor │  │ chokidar     │  │ OPFS         │      │
│      │ (Rust)       │  │ (TypeScript) │  │ Watcher      │      │
│      │              │  │              │  │ (Browser)    │      │
│      │ • inotify    │  │ • Fallback   │  │ • OPFS API   │      │
│      │ • FSEvents   │  │ • Cross-FS   │  │ • IndexedDB  │      │
│      │ • Checksums  │  │              │  │              │      │
│      └──────────────┘  └──────────────┘  └──────────────┘      │
└────────────────────────────────────────────────────────────────┘

Why Rust Instead of Pure TypeScript?

The file-monitor is written in Rust rather than TypeScript/Node.js for six reasons:

  1. Performance: Direct access to native OS APIs (inotify, FSEvents, ReadDirectoryChangesW) via the notify crate
  2. Resource Efficiency: Bounded memory usage, no garbage collection pauses
  3. Checksums: Streaming SHA-256 calculation without loading entire files into memory
  4. Reliability: Compile-time type and memory safety, with failures surfaced as explicit error values rather than uncaught exceptions
  5. Rate Limiting: Semaphore-based backpressure to prevent event floods
  6. Observability: Built-in Prometheus metrics for monitoring

Integration Strategy

Option 1: IPC (Recommended for MVP)

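// TypeScript (Node.js) spawns the Rust binary as a subprocess
import { spawn } from 'child_process';
import { createInterface } from 'readline';

const monitor = spawn('./file-monitor', ['/workspace'], {
  stdio: ['pipe', 'pipe', 'pipe']
});

// A 'data' chunk can hold a partial event or several events, so read stdout
// line by line (assumes the binary prints one JSON event per line)
createInterface({ input: monitor.stdout }).on('line', (line) => {
  if (!line.trim()) return; // skip blank lines
  const event = JSON.parse(line);
  auditLogService.logFileChange(event);
});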

Option 2: FFI (Better Performance)

// Use napi-rs to create a Node.js native module
import { FileMonitor } from '@az1ai/file-monitor';

const monitor = new FileMonitor('/workspace', {
  recursive: true,
  debounce: 500,
  checksums: true
});

monitor.on('event', (event) => {
  auditLogService.logFileChange(event);
});

Option 3: WebAssembly (Browser-Only)

  • Compile to WASM for browser environments
  • Limited file system access (requires OPFS/File System Access API)
  • Best suited to OPFS watching inside the browser

Use Cases in AZ1.AI

1. Agent File Operations Tracking

Scenario: Code Generation Agent creates new files

// Agent creates file
await fs.writeFile('/workspace/src/newFeature.ts', code);

// File-monitor detects change
// Event emitted:
{
  event_type: 'Created',
  file_path: '/workspace/src/newFeature.ts',
  user_id: null,
  process_name: 'node',
  checksum: 'sha256:abc123...',
  file_size: 2048,
  timestamp_utc: '2025-10-06T12:34:56Z'
}

// AuditLogService records:
{
  event_id: 'uuid-123',
  event_type: 'FILE_CREATE',
  actor_type: 'agent',
  actor_id: 'code-generation-agent',
  resource_path: '/workspace/src/newFeature.ts',
  reason: 'Implementing user story #42: Add login feature',
  metadata: {
    agent_task_id: 'task-456',
    llm_model: 'qwen/qwq-32b',
    checksum: 'sha256:abc123...'
  }
}
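The step between the two objects above is an enrichment: the monitor reports what happened on disk, and the audit layer adds who did it and why. A minimal sketch of that mapping follows, assuming the actor context is supplied by the agent runtime. `DetectedEvent`, `ActorContext`, `toAuditRecord`, and the `FILE_DELETE` name are illustrative assumptions; only `FILE_CREATE` and `FILE_MODIFY` appear in this document.

```typescript
// Subset of the raw event fields shown in the example above.
interface DetectedEvent {
  event_type: "Created" | "Modified" | "Deleted";
  file_path: string;
  checksum?: string;
  timestamp_utc: string;
}

// Who caused the change and why; hypothetical, supplied by the IDE or agent runtime.
interface ActorContext {
  actor_type: "user" | "agent";
  actor_id: string;
  reason: string;
  agent_task_id?: string;
}

interface AuditRecord {
  event_type: string;
  actor_type: "user" | "agent";
  actor_id: string;
  resource_path: string;
  reason: string;
  timestamp_utc: string;
  metadata: Record<string, string>;
}

// FILE_DELETE is an assumed name, chosen by analogy with the other two.
const AUDIT_EVENT_TYPE: Record<DetectedEvent["event_type"], string> = {
  Created: "FILE_CREATE",
  Modified: "FILE_MODIFY",
  Deleted: "FILE_DELETE",
};

// Combine a raw monitor event with actor context into one audit record.
function toAuditRecord(raw: DetectedEvent, ctx: ActorContext): AuditRecord {
  const metadata: Record<string, string> = {};
  if (ctx.agent_task_id) metadata.agent_task_id = ctx.agent_task_id;
  if (raw.checksum) metadata.checksum = raw.checksum;
  return {
    event_type: AUDIT_EVENT_TYPE[raw.event_type],
    actor_type: ctx.actor_type,
    actor_id: ctx.actor_id,
    resource_path: raw.file_path,
    reason: ctx.reason,
    timestamp_utc: raw.timestamp_utc,
    metadata,
  };
}
```

Keeping the mapping a pure function makes it straightforward to unit test without a running monitor.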

2. User File Edits

Scenario: User manually edits a file in the Theia editor

// File-monitor detects modification
{
  event_type: 'Modified',
  modification_type: 'Content',
  file_path: '/workspace/src/auth.ts',
  user_id: 'user-789',
  checksum: 'sha256:def456...',
  timestamp_utc: '2025-10-06T12:35:01Z'
}

// AuditLogService generates diff and records
{
  event_type: 'FILE_MODIFY',
  actor_type: 'user',
  actor_id: 'user-789',
  before_state: { checksum: 'sha256:abc123...' },
  after_state: { checksum: 'sha256:def456...' },
  diff: '@@ -10,7 +10,7 @@\n-const secret = "old"\n+const secret = process.env.SECRET\n',
  reason: 'Fixed hardcoded secret vulnerability'
}

3. Multi-Agent Collaboration

Scenario: Code Review Agent detects a security issue in a file created by the Code Generation Agent

// Code Gen Agent creates file (event 1)
// File-monitor emits create event

// Code Review Agent reviews file (triggered by event 1)
// Finds security issue

// Code Review Agent modifies file (event 2)
// File-monitor emits modify event

// AuditLogService links events
{
  event_id: 'event-2',
  parent_event_id: 'event-1', // Links to creation
  event_type: 'FILE_MODIFY',
  actor_type: 'agent',
  actor_id: 'code-review-agent',
  reason: 'Security fix: Removed hardcoded API key',
  metadata: {
    triggered_by: 'code-generation-agent',
    security_issue: 'CWE-798: Hardcoded Credentials'
  }
}
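Once events carry a `parent_event_id`, the causal history of a change can be reconstructed by walking the links back to the root. The sketch below assumes the events are already loaded into memory; `LinkedEvent` and `traceEventChain` are hypothetical names, not an existing AuditLogService method.

```typescript
interface LinkedEvent {
  event_id: string;
  parent_event_id?: string;
  actor_id: string;
}

// Follow parent_event_id links from an event back to the root.
// Returns the chain ordered root-first; stops at a missing parent or a cycle.
function traceEventChain(events: LinkedEvent[], eventId: string): LinkedEvent[] {
  const byId = new Map<string, LinkedEvent>();
  for (const e of events) byId.set(e.event_id, e);

  const chain: LinkedEvent[] = [];
  const seen = new Set<string>();
  let current = byId.get(eventId);
  while (current && !seen.has(current.event_id)) {
    seen.add(current.event_id);
    chain.unshift(current); // prepend so the root ends up first
    current = current.parent_event_id ? byId.get(current.parent_event_id) : undefined;
  }
  return chain;
}
```

For the scenario above, tracing `event-2` yields the creation by the Code Generation Agent followed by the fix from the Code Review Agent.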

4. Compliance & Audit Trail

Scenario: GDPR audit requires proof of data handling

// Query audit logs for all file operations on user data
const auditTrail = await auditLogService.query({
  resource_path: '/workspace/src/users/*.ts',
  start_time: '2025-01-01',
  end_time: '2025-12-31'
});

// Returns complete chain of custody:
// - When: Every timestamp
// - Who: Every user/agent that touched the files
// - What: Every change (with diffs)
// - Why: Every reason (commit messages, task descriptions)
// - How: Every checksum verifying integrity
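The query above combines a path glob with a time range. How `auditLogService.query` is implemented is not specified here, so the following is only an in-memory sketch of those semantics, assuming `*` matches within a single path segment; `StoredRecord`, `AuditQuery`, `globToRegExp`, and `filterRecords` are hypothetical names.

```typescript
interface StoredRecord {
  resource_path: string;
  timestamp_utc: string; // ISO 8601
}

interface AuditQuery {
  resource_path: string; // glob; "*" matches within one path segment
  start_time: string;
  end_time: string;
}

// Convert a simple glob to an anchored RegExp ("*" does not cross "/").
function globToRegExp(glob: string): RegExp {
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp("^" + escaped.replace(/\*/g, "[^/]*") + "$");
}

// Keep records whose path matches the glob and whose timestamp falls
// inside [start_time, end_time], both bounds inclusive.
function filterRecords<T extends StoredRecord>(records: T[], q: AuditQuery): T[] {
  const pattern = globToRegExp(q.resource_path);
  const start = Date.parse(q.start_time);
  const end = Date.parse(q.end_time);
  return records.filter((r) => {
    const t = Date.parse(r.timestamp_utc);
    return pattern.test(r.resource_path) && t >= start && t <= end;
  });
}
```

One detail worth noting: a date-only string such as `'2025-12-31'` parses as midnight UTC, so an end bound written that way excludes events later on that day. Passing a full timestamp avoids the surprise.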

Key Features for AZ1.AI

1. Rate Limiting

Prevents event floods when an agent makes bulk changes:

  • Semaphore-based concurrency control
  • Configurable max concurrent tasks
  • Backpressure handling (drop vs. wait)
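The two backpressure policies differ only in what happens when no permit is free. The sketch below illustrates the idea with a small counting semaphore; it is not the Rust library's implementation, and the `Semaphore` class and its method names are assumptions made for illustration.

```typescript
// Counting semaphore with two backpressure policies:
//   tryAcquire() -> "drop": returns false immediately when no permit is free
//   acquire()    -> "wait": resolves once a permit has been released
class Semaphore {
  private available: number;
  private readonly waiters: Array<() => void> = [];

  constructor(maxConcurrent: number) {
    this.available = maxConcurrent;
  }

  tryAcquire(): boolean {
    if (this.available === 0) return false;
    this.available -= 1;
    return true;
  }

  acquire(): Promise<void> {
    if (this.tryAcquire()) return Promise.resolve();
    return new Promise<void>((resolve) => {
      this.waiters.push(() => resolve());
    });
  }

  release(): void {
    const next = this.waiters.shift();
    if (next) {
      next(); // hand the permit directly to the oldest waiter
    } else {
      this.available += 1;
    }
  }
}
```

With the drop policy, a caller that gets `false` from `tryAcquire()` discards the event and would typically increment a dropped-events counter; with the wait policy, the caller awaits `acquire()` and the event is delayed instead of lost.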

2. Debouncing

Reduces duplicate events when an editor auto-saves:

  • Time-window deduplication (500ms default)
  • 70-90% event reduction
  • Configurable per-file or globally
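Time-window deduplication can be sketched in a few lines. The version below forwards the first event for a path and suppresses repeats until the window has elapsed since the last forwarded event. This is one plausible reading of the behavior described above, not the library's actual algorithm; `EventDeduplicator` is a hypothetical name, and the clock is passed in explicitly so the logic is easy to test.

```typescript
// Time-window deduplication: forward the first event for a path, then
// suppress further events for that path until windowMs has elapsed
// since the last forwarded one.
class EventDeduplicator {
  private readonly lastEmitted = new Map<string, number>();
  private readonly windowMs: number;

  constructor(windowMs: number = 500) {
    this.windowMs = windowMs;
  }

  // Returns true if the event should be forwarded, false if it is a duplicate.
  shouldEmit(filePath: string, nowMs: number): boolean {
    const last = this.lastEmitted.get(filePath);
    if (last !== undefined && nowMs - last < this.windowMs) return false;
    this.lastEmitted.set(filePath, nowMs);
    return true;
  }
}
```

A trailing-edge debounce (emit only after the file has been quiet for the whole window) is the other common design; it reports the final state of a burst at the cost of added latency.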

3. Checksums

Verify file integrity and detect tampering:

  • Streaming SHA-256 (constant memory)
  • Configurable size limit (100MB default)
  • Used for deduplication and audit trails
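Constant-memory hashing means feeding the file to the hash function one chunk at a time. The following is a Node.js sketch of the technique using the standard `crypto` and `fs` modules, not the Rust implementation; the function names, the 64 KiB chunk size, and the `sha256:` prefix (borrowed from the example events in this document) are assumptions.

```typescript
import { createHash } from "crypto";
import { closeSync, openSync, readSync, statSync } from "fs";

// Hash any sequence of chunks; only one chunk is held in memory at a time.
function sha256OfChunks(chunks: Iterable<Uint8Array>): string {
  const hash = createHash("sha256");
  for (const chunk of chunks) hash.update(chunk);
  return "sha256:" + hash.digest("hex");
}

// Yield a file's contents in fixed-size chunks, reusing a single buffer.
function* readChunks(path: string, chunkSize = 64 * 1024): Generator<Uint8Array> {
  const fd = openSync(path, "r");
  try {
    const buffer = Buffer.alloc(chunkSize);
    let bytesRead: number;
    while ((bytesRead = readSync(fd, buffer, 0, chunkSize, null)) > 0) {
      yield buffer.subarray(0, bytesRead);
    }
  } finally {
    closeSync(fd);
  }
}

// Returns null for files above the size limit (100 MB default, as listed above).
function checksumFile(path: string, maxBytes = 100 * 1024 * 1024): string | null {
  if (statSync(path).size > maxBytes) return null;
  return sha256OfChunks(readChunks(path));
}
```

The synchronous reads keep the sketch short but block the Node.js event loop while hashing; a production wrapper would more likely use `fs.createReadStream`.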

4. Observability

Monitor system health and performance:

  • Prometheus metrics (events/sec, latency, dropped events)
  • Structured tracing (debug, info, warn, error)
  • Health checks for alerting

5. Graceful Shutdown

Ensure no events are lost during restarts:

  • Coordinated lifecycle with drain timeout
  • Complete in-flight events before shutdown
  • Signal handling (SIGTERM, SIGINT)

Performance Characteristics

From production benchmarks (docs/file-monitor/production.md):

Scenario                    Events/sec  CPU   Memory  Latency (p99)
Idle monitoring             0           <1%   5 MB    N/A
Light (10 evt/s)            10          2%    10 MB   2 ms
Moderate (100 evt/s)        100         8%    25 MB   5 ms
Heavy (1000 evt/s)          1000        25%   60 MB   15 ms
With checksums (100 evt/s)  100         15%   30 MB   25 ms

Key Insight: The monitor sustains 1000 events/sec (enough for bulk agent operations) with a p99 latency of 15 ms, well under 100 ms.

Configuration for AZ1.AI

let config = MonitorConfig::new("/workspace")
    .recursive(true)                         // Watch all subdirectories
    .debounce(500)                           // 500ms debounce window
    .concurrency(100, 1000)                  // Max 100 concurrent, buffer 1000
    .with_checksums(Some(50 * 1024 * 1024))  // 50MB checksum limit
    .ignore_patterns(vec![
        "*.tmp".to_string(),
        "*.swp".to_string(),
        ".git/*".to_string(),
        "node_modules/*".to_string(),
        "__pycache__/*".to_string(),
        ".DS_Store".to_string(),
        "*.log".to_string(),
    ]);

Why These Settings?

  • recursive: true - Monitor entire workspace including agent-created subdirectories
  • debounce: 500ms - Balance responsiveness with deduplication (editor auto-save ~300ms)
  • concurrency: 100/1000 - Handle agent bulk operations (create 50 files at once)
  • checksums: 50MB limit - Calculate for source files, skip for large binaries/videos
  • ignore_patterns - Skip temp files, version control, dependencies

Next Steps for Integration

Phase 1: IPC Integration (MVP)

  1. ✅ Build Rust binary (DONE)
  2. Create TypeScript wrapper service
  3. Spawn file-monitor as subprocess
  4. Parse JSON events from stdout
  5. Forward to AuditLogService
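Steps 4 and 5 are where the wrapper meets untrusted input, so it helps to validate each line before forwarding it. The sketch below assumes the binary writes one JSON object per line with the field names used in the examples in this document; `RawMonitorEvent`, `parseMonitorLine`, and `forwardLines` are hypothetical names.

```typescript
// Minimal shape of one event line; field names follow the examples in this document.
interface RawMonitorEvent {
  event_type: string;
  file_path: string;
  timestamp_utc: string;
}

// Parse one stdout line. Returns null for blank, malformed, or incomplete
// lines so that a single bad line cannot crash the wrapper.
function parseMonitorLine(line: string): RawMonitorEvent | null {
  const trimmed = line.trim();
  if (trimmed === "") return null;
  let value: unknown;
  try {
    value = JSON.parse(trimmed);
  } catch {
    return null;
  }
  if (typeof value !== "object" || value === null) return null;
  const { event_type, file_path, timestamp_utc } = value as Record<string, unknown>;
  if (
    typeof event_type !== "string" ||
    typeof file_path !== "string" ||
    typeof timestamp_utc !== "string"
  ) {
    return null;
  }
  return { event_type, file_path, timestamp_utc };
}

// Steps 4 and 5: parse each line and forward the valid events to a sink.
// Returns the number of events forwarded.
function forwardLines(lines: string[], sink: (e: RawMonitorEvent) => void): number {
  let forwarded = 0;
  for (const line of lines) {
    const event = parseMonitorLine(line);
    if (event) {
      sink(event);
      forwarded += 1;
    }
  }
  return forwarded;
}
```

In the wrapper service, the sink would be a call such as `auditLogService.logFileChange`, and rejected lines would be counted or logged instead of silently ignored.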

Phase 2: Event Enrichment

  1. Link file events to user sessions
  2. Link to agent tasks
  3. Generate diffs (before/after states)
  4. Store in FoundationDB

Phase 3: FFI Module (Performance)

  1. Create napi-rs bindings
  2. Compile as Node.js native module
  3. Direct function calls (no IPC overhead)
  4. Shared memory for high throughput

Phase 4: Cloud Sync

  1. Replicate audit logs to cloud storage
  2. Enable cross-device audit trails
  3. Compliance reporting dashboards

Summary

File-Monitor provides the foundation for comprehensive audit logging in AZ1.AI:

  • ✅ Tracks WHO (user/agent attribution)
  • ✅ Tracks WHAT (file paths, event types, checksums)
  • ✅ Tracks WHEN (timestamps)
  • ✅ Tracks WHERE (workspace paths)
  • ⏳ Needs WHY (integration with task system)
  • ⏳ Needs HOW (diff generation)

The Rust implementation provides reliability and performance that are hard to match in TypeScript alone, especially for high-frequency file operations from multi-agent systems.


Related Documents:

  • ADR-022: Audit Logging Architecture
  • ADR-023: File Change Tracking
  • ADR-018: Local Filesystem Integration
  • docs/file-monitor/README.md - Usage guide
  • docs/file-monitor/production.md - Deployment guide