File-Monitor Purpose and Integration
What is the File-Monitor?
The file-monitor is a production-grade Rust library that provides real-time file system monitoring with enterprise features like rate limiting, debouncing, checksums, and observability.
Core Purpose
Track ALL file changes in the workspace and provide detailed audit trails for:
The 5 W's + H
- WHO changed it?
  - User ID
  - Process name
  - Agent ID (for automated changes)
- WHAT changed?
  - File path
  - Event type (create, modify, delete, rename, move)
  - File size
  - SHA-256 checksum (for integrity verification)
- WHEN did it change?
  - UTC timestamp
  - ISO 8601 format
- WHERE did it happen?
  - Absolute file path
  - Workspace context
- WHY did it change?
  - Linked to user action (commit message)
  - Linked to agent task (task description)
  - Parent event tracking
- HOW did it change?
  - Before/after states (via diffs)
  - Modification type (content, metadata, permissions)
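One way to picture the six questions as a single record is the type below. It is an illustrative sketch, not the library's schema: the `AuditRecordSketch` name, its nesting, and the `unansweredQuestions` helper are assumptions, while field names such as `user_id`, `process_name`, `file_path`, `file_size`, `checksum`, `timestamp_utc`, `reason`, and `parent_event_id` are borrowed from the example events later in this document.

```typescript
// Hypothetical shape grouping the 5 W's + H; not the library's actual schema.
type ChangeKind = "create" | "modify" | "delete" | "rename" | "move";

interface AuditRecordSketch {
  who: { user_id: string | null; process_name: string | null; agent_id: string | null };
  what: { file_path: string; event_type: ChangeKind; file_size: number; checksum: string | null };
  when: { timestamp_utc: string }; // ISO 8601, UTC
  where: { absolute_path: string; workspace: string };
  why: { reason: string | null; parent_event_id: string | null };
  how: { modification_type: "content" | "metadata" | "permissions" | null; diff: string | null };
}

// Lists which of the six questions a record cannot answer yet.
function unansweredQuestions(record: AuditRecordSketch): string[] {
  const gaps: string[] = [];
  if (record.who.user_id === null && record.who.agent_id === null) gaps.push("who");
  if (record.why.reason === null) gaps.push("why");
  if (record.how.diff === null) gaps.push("how");
  return gaps;
}

// A record as it might look before task linking and diff generation exist.
const sampleRecord: AuditRecordSketch = {
  who: { user_id: null, process_name: "node", agent_id: "code-generation-agent" },
  what: { file_path: "/workspace/src/newFeature.ts", event_type: "create", file_size: 2048, checksum: null },
  when: { timestamp_utc: "2025-10-06T12:34:56Z" },
  where: { absolute_path: "/workspace/src/newFeature.ts", workspace: "/workspace" },
  why: { reason: null, parent_event_id: null },
  how: { modification_type: null, diff: null },
};

const sampleGaps = unansweredQuestions(sampleRecord);
```

For this sample the helper reports the WHY and HOW questions as open, since the record carries an agent ID but no reason and no diff.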
Integration with AZ1.AI IDE
Architecture Context
From ADR-022: Audit Logging Architecture, the file-monitor is part of a comprehensive audit system:
┌────────────────────────────────────────────────────────────────┐
│                       Application Layer                        │
│            (Theia IDE, Agent System, User Actions)             │
└────────────────────┬───────────────────────────────────────────┘
                     │
                     ▼
┌────────────────────────────────────────────────────────────────┐
│                  AuditLogService (TypeScript)                  │
│  • logFileChange()                                             │
│  • logUserAction()                                             │
│  • logEvent()                                                  │
└────────────────────┬───────────────────────────────────────────┘
                     │
         ┌───────────┼───────────┐
         │           │           │
         ▼           ▼           ▼
  ┌────────────┐ ┌────────────┐ ┌────────────┐
  │    Diff    │ │ Foundation │ │   Cloud    │
  │ Generator  │ │     DB     │ │  Logging   │
  └────────────┘ └────────────┘ └────────────┘
                     ▲
                     │
┌────────────────────┴───────────────────────────────────────────┐
│           File Change Monitors (File-Monitor Layer)            │
│                                                                │
│    ┌──────────────┐    ┌──────────────┐    ┌──────────────┐    │
│    │ file-monitor │    │ chokidar     │    │ OPFS         │    │
│    │ (Rust)       │    │ (TypeScript) │    │ Watcher      │    │
│    │              │    │              │    │ (Browser)    │    │
│    │ • inotify    │    │ • Fallback   │    │ • OPFS API   │    │
│    │ • FSEvents   │    │ • Cross-FS   │    │ • IndexedDB  │    │
│    │ • Checksums  │    │              │    │              │    │
│    └──────────────┘    └──────────────┘    └──────────────┘    │
└────────────────────────────────────────────────────────────────┘
Why Rust Instead of Pure TypeScript?
The file-monitor is written in Rust (not TypeScript/Node.js) because:
- Performance: Native OS APIs (inotify, FSEvents, ReadDirectoryChangesW) via the notify crate
- Resource Efficiency: Bounded memory usage, no garbage collection pauses
- Checksums: Streaming SHA-256 calculation without loading entire files into memory
- Reliability: Compile-time type and memory safety, with failures surfaced as explicit error values rather than unhandled exceptions
- Rate Limiting: Semaphore-based backpressure to prevent event floods
- Observability: Built-in Prometheus metrics for monitoring
Integration Strategy
Option 1: IPC (Recommended for MVP)
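// TypeScript (Node.js) spawns the Rust binary as a subprocess
import { spawn } from 'child_process';
import { createInterface } from 'readline';

const monitor = spawn('./file-monitor', ['/workspace'], {
  stdio: ['pipe', 'pipe', 'pipe']
});

// A stdout chunk can contain a partial event or several events,
// so read line by line (assumes one JSON object per line)
const lines = createInterface({ input: monitor.stdout });
lines.on('line', (line) => {
  const event = JSON.parse(line);
  auditLogService.logFileChange(event);
});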
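Because a pipe delivers arbitrary chunks of output, the parent process has to reassemble complete lines before calling `JSON.parse`. The sketch below shows that framing step in isolation, with no Node.js dependencies. It assumes the binary writes one JSON object per line, which this document does not confirm, and `LineBuffer` is a hypothetical helper name.

```typescript
// Accumulates stdout chunks and returns only complete lines.
// Assumption: the monitor emits newline-delimited JSON (one event per line).
class LineBuffer {
  private pending = "";

  push(chunk: string): string[] {
    this.pending += chunk;
    const lines = this.pending.split("\n");
    // The last element is either "" or an incomplete line; keep it for the next chunk.
    this.pending = lines.pop() ?? "";
    return lines.filter((line) => line.trim().length > 0);
  }
}

// Two chunks that split the second event in the middle of a key.
const stdoutLines = new LineBuffer();
const firstBatch = stdoutLines.push('{"event_type":"Created"}\n{"event_ty');
const secondBatch = stdoutLines.push('pe":"Modified"}\n');
const parsedEvents = [...firstBatch, ...secondBatch].map((line) => JSON.parse(line));
```

The first chunk yields one complete event and leaves the fragment buffered; the second chunk completes it, so both events parse cleanly.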
Option 2: FFI (Better Performance)
// Use napi-rs to create a Node.js native module
import { FileMonitor } from '@az1ai/file-monitor';

const monitor = new FileMonitor('/workspace', {
  recursive: true,
  debounce: 500,
  checksums: true
});

monitor.on('event', (event) => {
  auditLogService.logFileChange(event);
});
Option 3: WebAssembly (Browser-Only)
- Compile to WASM for browser environments
- Limited file system access (requires OPFS/File System Access API)
- Best for OPFS watching inside browser
Use Cases in AZ1.AI
1. Agent File Operations Tracking
Scenario: Code Generation Agent creates new files
// Agent creates file
await fs.writeFile('/workspace/src/newFeature.ts', code);

// File-monitor detects change
// Event emitted:
{
  event_type: 'Created',
  file_path: '/workspace/src/newFeature.ts',
  user_id: null,
  process_name: 'node',
  checksum: 'sha256:abc123...',
  file_size: 2048,
  timestamp_utc: '2025-10-06T12:34:56Z'
}

// AuditLogService records:
{
  event_id: 'uuid-123',
  event_type: 'FILE_CREATE',
  actor_type: 'agent',
  actor_id: 'code-generation-agent',
  resource_path: '/workspace/src/newFeature.ts',
  reason: 'Implementing user story #42: Add login feature',
  metadata: {
    agent_task_id: 'task-456',
    llm_model: 'qwen/qwq-32b',
    checksum: 'sha256:abc123...'
  }
}
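The step from the raw monitor event to the audit record could be sketched as a small mapping function. This is a guess at the shape of that step, not the actual AuditLogService code: `enrichMonitorEvent`, `AgentTaskContext`, and the `FILE_UNKNOWN` fallback are hypothetical, and only the `Created` and `Modified` mappings appear in this document.

```typescript
// Raw event as emitted by the monitor (fields taken from the example above).
interface RawMonitorEvent {
  event_type: string;
  file_path: string;
  user_id: string | null;
  process_name: string;
  checksum: string;
  file_size: number;
  timestamp_utc: string;
}

// Hypothetical task context supplied by the agent system.
interface AgentTaskContext {
  agent_id: string;
  task_id: string;
  reason: string;
}

// Only the two mappings shown in this document are listed.
const AUDIT_EVENT_TYPES: Record<string, string> = {
  Created: "FILE_CREATE",
  Modified: "FILE_MODIFY",
};

function enrichMonitorEvent(raw: RawMonitorEvent, task: AgentTaskContext | null, eventId: string) {
  return {
    event_id: eventId,
    // "FILE_UNKNOWN" is a placeholder for event types not covered here.
    event_type: AUDIT_EVENT_TYPES[raw.event_type] ?? "FILE_UNKNOWN",
    actor_type: task !== null ? "agent" : "user",
    actor_id: task !== null ? task.agent_id : (raw.user_id ?? "unknown"),
    resource_path: raw.file_path,
    reason: task !== null ? task.reason : null,
    metadata: {
      agent_task_id: task !== null ? task.task_id : null,
      checksum: raw.checksum,
    },
  };
}

const enrichedCreate = enrichMonitorEvent(
  {
    event_type: "Created",
    file_path: "/workspace/src/newFeature.ts",
    user_id: null,
    process_name: "node",
    checksum: "sha256:abc123...",
    file_size: 2048,
    timestamp_utc: "2025-10-06T12:34:56Z",
  },
  { agent_id: "code-generation-agent", task_id: "task-456", reason: "Implementing user story #42: Add login feature" },
  "uuid-123",
);
```

The monitor itself only sees the `node` process, so attribution to a specific agent has to come from context the agent system passes in alongside the event.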
2. User File Edits
Scenario: User manually edits a file in the Theia editor
// File-monitor detects modification
{
  event_type: 'Modified',
  modification_type: 'Content',
  file_path: '/workspace/src/auth.ts',
  user_id: 'user-789',
  checksum: 'sha256:def456...',
  timestamp_utc: '2025-10-06T12:35:01Z'
}

// AuditLogService generates diff and records
{
  event_type: 'FILE_MODIFY',
  actor_type: 'user',
  actor_id: 'user-789',
  before_state: { checksum: 'sha256:abc123...' },
  after_state: { checksum: 'sha256:def456...' },
  diff: '@@ -10,7 +10,7 @@\n-const secret = "old"\n+const secret = process.env.SECRET\n',
  reason: 'Fixed hardcoded secret vulnerability'
}
3. Multi-Agent Collaboration
Scenario: Code Review Agent detects a security issue in a file created by the Code Generation Agent
// Code Gen Agent creates file (event 1)
// File-monitor emits create event

// Code Review Agent reviews file (triggered by event 1)
// Finds security issue

// Code Review Agent modifies file (event 2)
// File-monitor emits modify event

// AuditLogService links events
{
  event_id: 'event-2',
  parent_event_id: 'event-1', // Links to creation
  event_type: 'FILE_MODIFY',
  actor_type: 'agent',
  actor_id: 'code-review-agent',
  reason: 'Security fix: Removed hardcoded API key',
  metadata: {
    triggered_by: 'code-generation-agent',
    security_issue: 'CWE-798: Hardcoded Credentials'
  }
}
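Once events carry a `parent_event_id`, the chain that led to a given change can be recovered by walking those links back to the root. The sketch below assumes events are available in an in-memory map keyed by `event_id`; `causalChain` and `LinkedEvent` are hypothetical names, and a real implementation would query the audit store instead.

```typescript
interface LinkedEvent {
  event_id: string;
  parent_event_id?: string;
  actor_id: string;
}

// Returns the events from the root cause down to the requested event.
function causalChain(eventId: string, byId: Map<string, LinkedEvent>): LinkedEvent[] {
  const chain: LinkedEvent[] = [];
  const seen = new Set<string>(); // guards against accidental cycles in the links
  let current = byId.get(eventId);
  while (current !== undefined && !seen.has(current.event_id)) {
    seen.add(current.event_id);
    chain.unshift(current);
    current = current.parent_event_id === undefined ? undefined : byId.get(current.parent_event_id);
  }
  return chain;
}

const linkedEvents = new Map<string, LinkedEvent>([
  ["event-1", { event_id: "event-1", actor_id: "code-generation-agent" }],
  ["event-2", { event_id: "event-2", parent_event_id: "event-1", actor_id: "code-review-agent" }],
]);

const reviewChain = causalChain("event-2", linkedEvents).map((linked) => linked.actor_id);
```

For the scenario above, the chain for `event-2` lists the Code Generation Agent first and the Code Review Agent second.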
4. Compliance & Audit Trail
Scenario: GDPR audit requires proof of data handling
// Query audit logs for all file operations on user data
const auditTrail = await auditLogService.query({
  resource_path: '/workspace/src/users/*.ts',
  start_time: '2025-01-01',
  end_time: '2025-12-31'
});

// Returns complete chain of custody:
// - When: Every timestamp
// - Who: Every user/agent that touched the files
// - What: Every change (with diffs)
// - Why: Every reason (commit messages, task descriptions)
// - How: Every checksum verifying integrity
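Date-only bounds such as `'2025-12-31'` are easy to get wrong, because a date-only ISO string parses as midnight UTC and would exclude everything later that day. How `auditLogService.query` interprets its bounds is not specified here, so the sketch below only illustrates one reasonable reading, with `end_time` treated as an inclusive calendar day. `withinDays` is a hypothetical helper.

```typescript
// Treats startDay and endDay as UTC calendar days, with endDay inclusive.
function withinDays(timestampUtc: string, startDay: string, endDay: string): boolean {
  const dayMs = 24 * 60 * 60 * 1000;
  const at = Date.parse(timestampUtc);
  const from = Date.parse(startDay); // date-only ISO strings parse as midnight UTC
  const until = Date.parse(endDay) + dayMs; // exclusive bound: start of the following day
  return at >= from && at < until;
}

const lastSecondOfYear = withinDays("2025-12-31T23:59:59Z", "2025-01-01", "2025-12-31");
const firstSecondOfNextYear = withinDays("2026-01-01T00:00:00Z", "2025-01-01", "2025-12-31");
```

With this reading, an event at 23:59:59 on 31 December is inside the audit window and one at midnight on 1 January is not.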
Key Features for AZ1.AI
1. Rate Limiting
Prevents event floods when an agent makes bulk changes:
- Semaphore-based concurrency control
- Configurable max concurrent tasks
- Backpressure handling (drop vs. wait)
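The interaction between a concurrency limit, a bounded buffer, and the drop-versus-wait choice can be sketched as follows. This is a TypeScript analogue written for illustration, not the Rust implementation: `BoundedGate`, `OverflowPolicy`, and the callback-style `release` are all assumptions.

```typescript
type OverflowPolicy = "drop" | "wait";
type GatedTask = (release: () => void) => void;

// Counting semaphore with a bounded wait queue.
class BoundedGate {
  private active = 0;
  private readonly queue: GatedTask[] = [];
  private readonly maxConcurrent: number;
  private readonly maxQueued: number;
  private readonly policy: OverflowPolicy;

  constructor(maxConcurrent: number, maxQueued: number, policy: OverflowPolicy) {
    this.maxConcurrent = maxConcurrent;
    this.maxQueued = maxQueued;
    this.policy = policy;
  }

  submit(task: GatedTask): "started" | "queued" | "dropped" {
    if (this.active < this.maxConcurrent) {
      this.start(task);
      return "started";
    }
    if (this.policy === "wait" && this.queue.length < this.maxQueued) {
      this.queue.push(task);
      return "queued";
    }
    return "dropped"; // "drop" policy, or the wait queue is full
  }

  private start(task: GatedTask): void {
    this.active += 1;
    let released = false;
    task(() => {
      if (released) return; // ignore a second release of the same permit
      released = true;
      this.active -= 1;
      const next = this.queue.shift();
      if (next !== undefined) this.start(next);
    });
  }
}

// One permit and one queue slot: the third submission has nowhere to go.
const releases: Array<() => void> = [];
const startedLabels: string[] = [];
const gate = new BoundedGate(1, 1, "wait");
const outcomes = ["a", "b", "c"].map((label) =>
  gate.submit((release) => {
    startedLabels.push(label);
    releases.push(release);
  }),
);
const finishFirst = releases.shift();
if (finishFirst !== undefined) finishFirst(); // finishing "a" lets the queued "b" start
```

Here "a" starts immediately, "b" waits in the queue, and "c" is dropped because both the permit and the queue slot are taken. Releasing "a" hands its permit to "b".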
2. Debouncing
Reduces duplicate events when an editor auto-saves:
- Time-window deduplication (500ms default)
- 70-90% event reduction
- Configurable per-file or globally
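Time-window deduplication can be illustrated with a pure function over timestamped changes. The sketch below uses a trailing-edge rule (only the last change in a burst survives), which is one common way to debounce; whether the Rust library uses this exact rule is not stated in this document, and `debounceChanges` and `TimedChange` are hypothetical names.

```typescript
interface TimedChange {
  path: string;
  atMs: number;
}

// Trailing-edge debounce: within a burst of changes to one path, where each
// change follows the previous one by less than windowMs, only the last is kept.
// Expects changes sorted by atMs.
function debounceChanges(changes: TimedChange[], windowMs: number): TimedChange[] {
  const latestByPath = new Map<string, TimedChange>();
  const emitted: TimedChange[] = [];
  for (const change of changes) {
    const pending = latestByPath.get(change.path);
    if (pending !== undefined && change.atMs - pending.atMs >= windowMs) {
      emitted.push(pending); // quiet period elapsed, so the previous burst is final
    }
    latestByPath.set(change.path, change);
  }
  for (const pending of latestByPath.values()) emitted.push(pending);
  return emitted.sort((a, b) => a.atMs - b.atMs);
}

// Three rapid auto-saves of a.ts, one change to b.ts, then a later edit of a.ts.
const debounced = debounceChanges(
  [
    { path: "a.ts", atMs: 0 },
    { path: "a.ts", atMs: 100 },
    { path: "a.ts", atMs: 200 },
    { path: "b.ts", atMs: 250 },
    { path: "a.ts", atMs: 900 },
  ],
  500,
);
```

In this run five raw changes collapse to three: the burst on `a.ts` is represented by its final save at 200 ms, while the `b.ts` change and the later `a.ts` edit pass through.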
3. Checksums
Verify file integrity and detect tampering:
- Streaming SHA-256 (constant memory)
- Configurable size limit (100MB default)
- Used for deduplication and audit trails
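The constant-memory property comes from feeding the hash one chunk at a time instead of buffering the whole file. The sketch below shows the idea with Node.js's built-in `crypto` module. It is not the Rust implementation: `sha256OfChunks` is a hypothetical helper, the `sha256:` prefix follows the example events in this document, and a real monitor would more likely check the file size from metadata before reading anything.

```typescript
import { createHash } from "node:crypto";

// Hashes chunks incrementally; returns null once the size limit is exceeded.
function sha256OfChunks(chunks: Iterable<Uint8Array>, maxBytes: number): string | null {
  const hash = createHash("sha256");
  let totalBytes = 0;
  for (const chunk of chunks) {
    totalBytes += chunk.byteLength;
    if (totalBytes > maxBytes) return null; // over the limit: skip the checksum
    hash.update(chunk); // only the current chunk is held in memory
  }
  return `sha256:${hash.digest("hex")}`;
}

const encoder = new TextEncoder();
const abcChunks = [encoder.encode("a"), encoder.encode("bc")];
const abcDigest = sha256OfChunks(abcChunks, 100 * 1024 * 1024);
const tooLarge = sha256OfChunks(abcChunks, 2);
```

Splitting the input across chunks does not change the result: the digest equals the SHA-256 of the concatenated bytes.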
4. Observability
Monitor system health and performance:
- Prometheus metrics (events/sec, latency, dropped events)
- Structured tracing (debug, info, warn, error)
- Health checks for alerting
5. Graceful Shutdown
Ensure no events are lost during restarts:
- Coordinated lifecycle with drain timeout
- Complete in-flight events before shutdown
- Signal handling (SIGTERM, SIGINT)
Performance Characteristics
From production benchmarks (docs/file-monitor/production.md):
| Scenario | Events/sec | CPU | Memory | Latency (p99) |
|---|---|---|---|---|
| Idle monitoring | 0 | <1% | 5 MB | N/A |
| Light (10 evt/s) | 10 | 2% | 10 MB | 2ms |
| Moderate (100 evt/s) | 100 | 8% | 25 MB | 5ms |
| Heavy (1000 evt/s) | 1000 | 25% | 60 MB | 15ms |
| With checksums (100 evt/s) | 100 | 15% | 30 MB | 25ms |
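Key Insight: The monitor sustains 1000 events/sec (enough for bulk agent operations) at a p99 latency of 15ms, and p99 latency stays well under 100ms in every measured scenario, including 25ms with checksums enabled.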
Configuration for AZ1.AI
Recommended Settings
let config = MonitorConfig::new("/workspace")
    .recursive(true)                         // Watch all subdirectories
    .debounce(500)                           // 500ms debounce window
    .concurrency(100, 1000)                  // Max 100 concurrent, buffer 1000
    .with_checksums(Some(50 * 1024 * 1024))  // 50MB checksum limit
    .ignore_patterns(vec![
        "*.tmp".to_string(),
        "*.swp".to_string(),
        ".git/*".to_string(),
        "node_modules/*".to_string(),
        "__pycache__/*".to_string(),
        ".DS_Store".to_string(),
        "*.log".to_string(),
    ]);
Why These Settings?
- recursive: true - Monitor entire workspace including agent-created subdirectories
- debounce: 500ms - Balance responsiveness with deduplication (editor auto-save ~300ms)
- concurrency: 100/1000 - Handle agent bulk operations (create 50 files at once)
- checksums: 50MB limit - Calculate for source files, skip for large binaries/videos
- ignore_patterns - Skip temp files, version control, dependencies
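To make the effect of the ignore list concrete, the sketch below applies the same patterns to a few workspace-relative paths. The matching rules are assumptions, since this document does not say how the Rust library interprets patterns: here `*` matches any run of characters, patterns containing `/` are matched against the whole relative path, and all other patterns are matched against the file name only. `globToRegExp` and `isIgnored` are hypothetical helpers.

```typescript
// Converts a simple glob (only "*" is special) into an anchored regular expression.
function globToRegExp(pattern: string): RegExp {
  const escaped = pattern.replace(/[.+^${}()|[\]\\?]/g, "\\$&");
  return new RegExp(`^${escaped.replace(/\*/g, ".*")}$`);
}

// Assumed semantics: patterns with "/" match the relative path, others match the file name.
function isIgnored(relativePath: string, patterns: string[]): boolean {
  const baseName = relativePath.split("/").pop() ?? relativePath;
  return patterns.some((pattern) => {
    const target = pattern.includes("/") ? relativePath : baseName;
    return globToRegExp(pattern).test(target);
  });
}

const ignorePatterns = ["*.tmp", "*.swp", ".git/*", "node_modules/*", "__pycache__/*", ".DS_Store", "*.log"];

const ignoreResults = {
  source: isIgnored("src/app.ts", ignorePatterns),
  dependency: isIgnored("node_modules/react/index.js", ignorePatterns),
  gitInternals: isIgnored(".git/config", ignorePatterns),
  nestedLog: isIgnored("src/debug.log", ignorePatterns),
};
```

Under these assumed rules a pattern such as `node_modules/*` only covers the top-level directory, so a nested `packages/x/node_modules` would need its own pattern. It is worth checking the library's actual matching behavior before relying on the list.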
Next Steps for Integration
Phase 1: IPC Integration (MVP)
- ✅ Build Rust binary (DONE)
- Create TypeScript wrapper service
- Spawn file-monitor as subprocess
- Parse JSON events from stdout
- Forward to AuditLogService
Phase 2: Event Enrichment
- Link file events to user sessions
- Link to agent tasks
- Generate diffs (before/after states)
- Store in FoundationDB
Phase 3: FFI Module (Performance)
- Create napi-rs bindings
- Compile as Node.js native module
- Direct function calls (no IPC overhead)
- Shared memory for high throughput
Phase 4: Cloud Sync
- Replicate audit logs to cloud storage
- Enable cross-device audit trails
- Compliance reporting dashboards
Summary
File-Monitor provides the foundation for comprehensive audit logging in AZ1.AI:
- ✅ Tracks WHO (user/agent attribution)
- ✅ Tracks WHAT (file paths, event types, checksums)
- ✅ Tracks WHEN (timestamps)
- ✅ Tracks WHERE (workspace paths)
- ⏳ Needs WHY (integration with task system)
- ⏳ Needs HOW (diff generation)
The Rust implementation provides production-grade reliability and performance that TypeScript alone cannot match, especially for high-frequency file operations from multi-agent systems.
Related Documents:
- ADR-022: Audit Logging Architecture
- ADR-023: File Change Tracking
- ADR-018: Local Filesystem Integration
- docs/file-monitor/README.md - Usage guide
- docs/file-monitor/production.md - Deployment guide