ADR-022: Audit Logging Architecture
Status: Accepted
Date: 2025-10-06
Deciders: Development Team, Security Team, Compliance Team
Related: ADR-021 (User Management), ADR-004 (FoundationDB), ADR-018 (Filesystem)
Context
The AZ1.AI LLM IDE requires comprehensive audit logging to track:
- File Changes: Who changed what, when, why, and how
- User Actions: Logins, LLM queries, agent executions
- System Events: Errors, performance metrics, security events
- Compliance: GDPR, SOC 2, audit trails
Key Requirements from User
"We need user management and logging in the cloud with foundationdb and local file monitoring of changes updates, creations, deletes movements, and modifications so we know what is changing when it changed why it changed, how it changed and who or what changed it."
Current State
- No audit logging system
- No file change tracking
- No user action logging
- No compliance features
Requirements
- Complete Audit Trail: Every action logged with context
- File Change Tracking: Detect all file operations with diffs
- User Attribution: Link every change to user or agent
- Temporal Tracking: When did it happen (timestamp)
- Reason Tracking: Why it happened (commit message, task description)
- Change Details: How it changed (before/after states, diffs)
- Immutability: Audit logs cannot be modified or deleted
- Performance: Minimal overhead (< 10ms per log entry)
- Compliance: GDPR, SOC 2, audit requirements
Decision
We will implement a comprehensive audit logging system using FoundationDB with structured event logging:
Architecture
┌────────────────────────────────────────────────────────────────┐
│                       Application Layer                        │
│                                                                │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │  File Operations  │  User Actions  │  System Events      │  │
│  └──────────────────────────────────────────────────────────┘  │
│                               │                                │
│                               ▼                                │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │                     AuditLogService                      │  │
│  │  ┌────────────────────────────────────────────────────┐  │  │
│  │  │ logFileChange() │ logUserAction() │ logEvent()     │  │  │
│  │  └────────────────────────────────────────────────────┘  │  │
│  └──────────────────────────────────────────────────────────┘  │
│                               │                                │
│              ┌────────────────┼────────────────┐               │
│              │                │                │               │
│              ▼                ▼                ▼               │
│         ┌──────────┐  ┌──────────────┐  ┌────────────┐         │
│         │   Diff   │  │ FoundationDB │  │   Cloud    │         │
│         │Generator │  │ (Audit Logs) │  │  Logging   │         │
│         └──────────┘  └──────────────┘  └────────────┘         │
└────────────────────────────────────────────────────────────────┘
                                │
                                ▼
┌────────────────────────────────────────────────────────────────┐
│                      File Change Monitors                      │
│                                                                │
│    ┌──────────────┐    ┌──────────────┐    ┌──────────────┐    │
│    │   chokidar   │    │     OPFS     │    │    Cloud     │    │
│    │  (Local FS)  │    │   Watcher    │    │   Storage    │    │
│    └──────────────┘    └──────────────┘    └──────────────┘    │
└────────────────────────────────────────────────────────────────┘
Audit Log Schema
Core Audit Event:
interface AuditEvent {
// Identity
event_id: string; // UUID
event_type: AuditEventType; // 'file:create', 'file:modify', 'user:login', etc.
entity_type: EntityType; // 'file', 'user', 'session', 'agent', etc.
entity_id: string; // ID of the entity being acted upon
// Attribution (WHO)
actor_type: 'user' | 'agent' | 'system';
actor_id: string; // user_id or agent_id
actor_name: string; // Display name
session_id?: string; // Session context
// Temporal (WHEN)
timestamp: number; // Unix timestamp (ms)
timestamp_iso: string; // ISO 8601 format
// Location (WHERE)
resource_path?: string; // File path, API endpoint, etc.
ip_address?: string; // User's IP address
user_agent?: string; // Browser/client info
// Context (WHY)
reason?: string; // Commit message, task description, user note
parent_event_id?: string; // Link to triggering event
trace_id?: string; // Distributed tracing ID
// Change Details (WHAT & HOW)
before_state?: any; // State before change (JSON)
after_state?: any; // State after change (JSON)
diff?: string; // Line-prefixed diff ('+' added, '-' removed, ' ' unchanged)
metadata: Record<string, any>; // Additional context
// Immutability
checksum: string; // SHA-256 of event data
previous_checksum?: string; // Chain events together
}
enum AuditEventType {
// File events
FILE_CREATE = 'file:create',
FILE_MODIFY = 'file:modify',
FILE_DELETE = 'file:delete',
FILE_MOVE = 'file:move',
FILE_RENAME = 'file:rename',
FILE_PERMISSION = 'file:permission',
// User events
USER_LOGIN = 'user:login',
USER_LOGOUT = 'user:logout',
USER_REGISTER = 'user:register',
USER_UPDATE = 'user:update',
USER_DELETE = 'user:delete',
// LLM events
llm_QUERY = 'llm:query',
llm_RESPONSE = 'llm:response',
llm_ERROR = 'llm:error',
// Agent events
AGENT_START = 'agent:start',
AGENT_COMPLETE = 'agent:complete',
AGENT_ERROR = 'agent:error',
AGENT_COLLABORATE = 'agent:collaborate',
// Session events
SESSION_CREATE = 'session:create',
SESSION_SWITCH = 'session:switch',
SESSION_DELETE = 'session:delete',
// System events
SYSTEM_ERROR = 'system:error',
SYSTEM_SECURITY = 'system:security',
SYSTEM_PERFORMANCE = 'system:performance',
// Access control
ACCESS_GRANTED = 'access:granted',
ACCESS_DENIED = 'access:denied',
}
enum EntityType {
FILE = 'file',
USER = 'user',
SESSION = 'session',
AGENT = 'agent',
llm = 'llm',
WORKSPACE = 'workspace',
API_KEY = 'api_key',
SYSTEM = 'system',
}
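The service code in this ADR builds event types by string interpolation followed by a cast (for example `user:${params.action}` cast to `AuditEventType`), which compiles even when the resulting string is not a member of the enum. A minimal sketch of a runtime check follows. `KNOWN_EVENT_TYPES` and `toEventType` are hypothetical names introduced for illustration, and the list is only a subset of the enum above.

```typescript
// Subset of the AuditEventType values above, repeated so the sketch is self-contained.
const KNOWN_EVENT_TYPES = [
  'file:create',
  'file:modify',
  'file:delete',
  'user:login',
  'user:logout',
  'agent:start',
] as const;

type KnownEventType = (typeof KNOWN_EVENT_TYPES)[number];

// Hypothetical helper: builds "<prefix>:<action>" and rejects unknown combinations
// instead of silently casting them.
function toEventType(prefix: string, action: string): KnownEventType {
  const candidate = `${prefix}:${action}`;
  const known = KNOWN_EVENT_TYPES as readonly string[];
  if (known.indexOf(candidate) === -1) {
    throw new Error(`Unknown audit event type: ${candidate}`);
  }
  return candidate as KnownEventType;
}
```

With a check like this, a caller passing an unlisted action fails at the logging call rather than writing an event whose type no query filter matches.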
Optimized Schemas for Specific Use Cases:
// File-specific audit log (optimized for file history queries)
interface FileAuditEvent extends AuditEvent {
file_path: string;
file_size_bytes?: number;
file_mime_type?: string;
diff_lines_added?: number;
diff_lines_removed?: number;
diff_hunks?: number;
}
// User action audit log (optimized for user activity queries)
interface UserAuditEvent extends AuditEvent {
user_id: string;
action: string;
resource_accessed?: string;
success: boolean;
error_message?: string;
}
// LLM usage audit log (optimized for cost tracking)
interface llmUsageEvent extends AuditEvent {
llm_provider: string;
model_id: string;
prompt_tokens: number;
completion_tokens: number;
total_tokens: number;
cost_usd: number;
latency_ms: number;
}
Implementation
1. Audit Log Service
// src/backend/services/audit-log-service.ts
import { createHash } from 'crypto';
import { FDBService } from './fdb-service';
import { diffLines } from 'diff'; // jsdiff exports diffLines; there is no export named `diff`
export class AuditLogService {
private fdb: FDBService;
// In-memory chain head: resets on process restart and is not shared across service instances
private lastChecksum: string | null = null;
constructor(fdb: FDBService) {
this.fdb = fdb;
}
async logFileChange(params: {
eventType: 'create' | 'modify' | 'delete' | 'move' | 'rename';
filePath: string;
actorId: string;
actorType: 'user' | 'agent';
sessionId?: string;
beforeContent?: string;
afterContent?: string;
reason?: string;
metadata?: Record<string, any>;
}): Promise<string> {
const event_id = this.generateUUID();
// Generate diff if modification
let diff: string | undefined;
let diffStats: { added: number; removed: number } | undefined;
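// Compare against undefined so that an empty file ('') still produces a diff
if (params.eventType === 'modify' && params.beforeContent !== undefined && params.afterContent !== undefined) {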
const diffResult = diffLines(params.beforeContent, params.afterContent);
diff = this.formatDiff(diffResult);
diffStats = this.calculateDiffStats(diffResult);
}
const event: FileAuditEvent = {
event_id,
event_type: `file:${params.eventType}` as AuditEventType,
entity_type: EntityType.FILE,
entity_id: params.filePath,
actor_type: params.actorType,
actor_id: params.actorId,
actor_name: await this.getActorName(params.actorId, params.actorType),
session_id: params.sessionId,
timestamp: Date.now(),
timestamp_iso: new Date().toISOString(),
resource_path: params.filePath,
file_path: params.filePath,
reason: params.reason,
before_state: params.beforeContent !== undefined ? { content: params.beforeContent } : undefined,
after_state: params.afterContent !== undefined ? { content: params.afterContent } : undefined,
diff,
diff_lines_added: diffStats?.added,
diff_lines_removed: diffStats?.removed,
metadata: params.metadata || {},
checksum: '',
previous_checksum: this.lastChecksum || undefined,
};
// Calculate checksum
event.checksum = this.calculateChecksum(event);
this.lastChecksum = event.checksum;
// Save to FoundationDB
await this.saveEvent(event);
return event_id;
}
async logUserAction(params: {
action: string;
userId: string;
resource?: string;
success: boolean;
errorMessage?: string;
ipAddress?: string;
userAgent?: string;
metadata?: Record<string, any>;
}): Promise<string> {
const event_id = this.generateUUID();
const event: UserAuditEvent = {
event_id,
event_type: `user:${params.action}` as AuditEventType,
entity_type: EntityType.USER,
entity_id: params.userId,
actor_type: 'user',
actor_id: params.userId,
actor_name: await this.getActorName(params.userId, 'user'),
user_id: params.userId,
action: params.action,
resource_accessed: params.resource,
success: params.success,
error_message: params.errorMessage,
timestamp: Date.now(),
timestamp_iso: new Date().toISOString(),
ip_address: params.ipAddress,
user_agent: params.userAgent,
metadata: params.metadata || {},
checksum: '',
previous_checksum: this.lastChecksum || undefined,
};
event.checksum = this.calculateChecksum(event);
this.lastChecksum = event.checksum;
await this.saveEvent(event);
return event_id;
}
async logllmUsage(params: {
provider: string;
modelId: string;
userId: string;
sessionId?: string;
promptTokens: number;
completionTokens: number;
costUsd: number;
latencyMs: number;
success: boolean;
errorMessage?: string;
}): Promise<string> {
const event_id = this.generateUUID();
const event: llmUsageEvent = {
event_id,
event_type: params.success ? AuditEventType.llm_QUERY : AuditEventType.llm_ERROR,
entity_type: EntityType.llm,
entity_id: params.modelId,
actor_type: 'user',
actor_id: params.userId,
actor_name: await this.getActorName(params.userId, 'user'),
session_id: params.sessionId,
llm_provider: params.provider,
model_id: params.modelId,
prompt_tokens: params.promptTokens,
completion_tokens: params.completionTokens,
total_tokens: params.promptTokens + params.completionTokens,
cost_usd: params.costUsd,
latency_ms: params.latencyMs,
timestamp: Date.now(),
timestamp_iso: new Date().toISOString(),
metadata: {
success: params.success,
error_message: params.errorMessage,
},
checksum: '',
previous_checksum: this.lastChecksum || undefined,
};
event.checksum = this.calculateChecksum(event);
this.lastChecksum = event.checksum;
await this.saveEvent(event);
return event_id;
}
async logAgentAction(params: {
action: 'start' | 'complete' | 'error' | 'collaborate';
agentId: string;
userId: string;
sessionId?: string;
taskDescription?: string;
result?: any;
errorMessage?: string;
collaborationTarget?: string;
}): Promise<string> {
const event_id = this.generateUUID();
const event: AuditEvent = {
event_id,
event_type: `agent:${params.action}` as AuditEventType,
entity_type: EntityType.AGENT,
entity_id: params.agentId,
actor_type: 'agent',
actor_id: params.agentId,
actor_name: params.agentId,
session_id: params.sessionId,
timestamp: Date.now(),
timestamp_iso: new Date().toISOString(),
metadata: {
user_id: params.userId,
task_description: params.taskDescription,
result: params.result,
error_message: params.errorMessage,
collaboration_target: params.collaborationTarget,
},
checksum: '',
previous_checksum: this.lastChecksum || undefined,
};
event.checksum = this.calculateChecksum(event);
this.lastChecksum = event.checksum;
await this.saveEvent(event);
return event_id;
}
async queryEvents(filters: {
eventTypes?: AuditEventType[];
entityTypes?: EntityType[];
actorId?: string;
startTime?: number;
endTime?: number;
entityId?: string;
limit?: number;
}): Promise<AuditEvent[]> {
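// Full scan of the primary keyspace, filtered in memory. This does not use the
// secondary indexes written by saveEvent and will not scale to the 10M+ event
// target; a production version should range-read the matching audit:by_* index.
const allEvents = await this.fdb.scan('audit:event:');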
let filtered = allEvents.filter((event: AuditEvent) => {
if (filters.eventTypes && !filters.eventTypes.includes(event.event_type)) {
return false;
}
if (filters.entityTypes && !filters.entityTypes.includes(event.entity_type)) {
return false;
}
if (filters.actorId && event.actor_id !== filters.actorId) {
return false;
}
if (filters.entityId && event.entity_id !== filters.entityId) {
return false;
}
if (filters.startTime !== undefined && event.timestamp < filters.startTime) {
return false;
}
if (filters.endTime !== undefined && event.timestamp > filters.endTime) {
return false;
}
return true;
});
// Sort by timestamp descending
filtered.sort((a, b) => b.timestamp - a.timestamp);
// Limit results
if (filters.limit) {
filtered = filtered.slice(0, filters.limit);
}
return filtered;
}
async getFileHistory(filePath: string, limit = 100): Promise<FileAuditEvent[]> {
const events = await this.queryEvents({
entityId: filePath,
entityTypes: [EntityType.FILE],
limit,
});
return events as FileAuditEvent[];
}
async getUserActivity(userId: string, limit = 100): Promise<AuditEvent[]> {
return this.queryEvents({
actorId: userId,
limit,
});
}
async exportAuditLogs(filters: {
startTime: number;
endTime: number;
format: 'json' | 'csv';
}): Promise<string> {
const events = await this.queryEvents({
startTime: filters.startTime,
endTime: filters.endTime,
});
if (filters.format === 'json') {
return JSON.stringify(events, null, 2);
} else {
return this.eventsToCSV(events);
}
}
private async saveEvent(event: AuditEvent): Promise<void> {
// Save to primary index (by event_id)
await this.fdb.set(`audit:event:${event.event_id}`, event);
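// Save to secondary indexes for fast querying.
// NOTE: these writes are issued one by one; to keep the indexes consistent with
// the primary record they should be committed in a single FoundationDB transaction.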
await this.fdb.set(
`audit:by_actor:${event.actor_id}:${event.timestamp}:${event.event_id}`,
event.event_id
);
await this.fdb.set(
`audit:by_entity:${event.entity_type}:${event.entity_id}:${event.timestamp}:${event.event_id}`,
event.event_id
);
await this.fdb.set(
`audit:by_time:${event.timestamp}:${event.event_id}`,
event.event_id
);
await this.fdb.set(
`audit:by_type:${event.event_type}:${event.timestamp}:${event.event_id}`,
event.event_id
);
// Also send to Cloud Logging for long-term storage
await this.sendToCloudLogging(event);
}
private async sendToCloudLogging(event: AuditEvent): Promise<void> {
// Integration with GCP Cloud Logging
// console.log('[AUDIT]', event);
// In production: use @google-cloud/logging
}
private calculateChecksum(event: AuditEvent): string {
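// Deterministic string over the identifying fields. Fields not listed here
// (diff, reason, metadata, actor_type) are not covered, so edits to them go undetected.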
const data = JSON.stringify({
event_id: event.event_id,
event_type: event.event_type,
actor_id: event.actor_id,
timestamp: event.timestamp,
entity_id: event.entity_id,
before_state: event.before_state,
after_state: event.after_state,
previous_checksum: event.previous_checksum,
});
return createHash('sha256').update(data).digest('hex');
}
private formatDiff(diffResult: any[]): string {
let diff = '';
for (const part of diffResult) {
const prefix = part.added ? '+' : part.removed ? '-' : ' ';
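const lines = part.value.split('\n');
// A chunk ending in '\n' splits into a trailing empty string; drop only that
// element so blank lines inside the chunk still appear in the diff
if (lines[lines.length - 1] === '') lines.pop();
for (const line of lines) {
diff += `${prefix}${line}\n`;
}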
}
return diff;
}
private calculateDiffStats(diffResult: any[]): { added: number; removed: number } {
let added = 0;
let removed = 0;
for (const part of diffResult) {
const lines = part.value.split('\n');
if (lines[lines.length - 1] === '') lines.pop(); // drop the split artifact, keep real blank lines
if (part.added) {
added += lines.length;
} else if (part.removed) {
removed += lines.length;
}
}
return { added, removed };
}
private eventsToCSV(events: AuditEvent[]): string {
const headers = [
'event_id',
'timestamp_iso',
'event_type',
'actor_id',
'actor_name',
'entity_type',
'entity_id',
'resource_path',
'reason',
].join(',');
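// Quote every field and double embedded quotes (RFC 4180) so that commas,
// quotes, and newlines in values such as `reason` cannot break a row
const escapeField = (value: string): string => `"${String(value).replace(/"/g, '""')}"`;
const rows = events.map((event) =>
[
event.event_id,
event.timestamp_iso,
event.event_type,
event.actor_id,
event.actor_name,
event.entity_type,
event.entity_id,
event.resource_path || '',
event.reason || '',
].map(escapeField).join(',')
);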
return [headers, ...rows].join('\n');
}
private async getActorName(actorId: string, actorType: 'user' | 'agent'): Promise<string> {
if (actorType === 'user') {
const user = await this.fdb.get(`user:${actorId}`);
return user?.name || 'Unknown User';
} else {
return actorId; // Agent ID is the name
}
}
private generateUUID(): string {
// randomUUID() returns an RFC 4122 version 4 UUID; randomBytes(16).toString('hex') is only a 32-character hex string
return require('crypto').randomUUID();
}
}
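`calculateChecksum` relies on `JSON.stringify` producing the same string for the same event. The top-level keys are written in a fixed order, but `before_state` and `after_state` are arbitrary objects whose key order depends on how they were built. A minimal sketch of a canonical serializer that sorts keys recursively follows. `canonicalize` and `canonicalJson` are hypothetical helpers, not part of the service above, and the sketch assumes plain JSON data (no `Date`, `Map`, or class instances).

```typescript
// Recursively rebuilds a JSON-compatible value with object keys in sorted order.
function canonicalize(value: unknown): unknown {
  if (Array.isArray(value)) {
    return value.map(canonicalize);
  }
  if (value !== null && typeof value === 'object') {
    const source = value as Record<string, unknown>;
    const sorted: Record<string, unknown> = {};
    for (const key of Object.keys(source).sort()) {
      sorted[key] = canonicalize(source[key]);
    }
    return sorted;
  }
  return value; // primitives pass through unchanged
}

// Serializes with stable key order, so equal data always hashes to the same checksum.
function canonicalJson(value: unknown): string {
  return JSON.stringify(canonicalize(value));
}
```

Feeding `canonicalJson(...)` into the SHA-256 call instead of `JSON.stringify(...)` would make the checksum independent of property insertion order.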
2. File Change Monitor
// src/backend/services/file-change-monitor.ts
import chokidar from 'chokidar';
import { promises as fs } from 'fs';
import { AuditLogService } from './audit-log-service';
export class FileChangeMonitor {
private watcher: chokidar.FSWatcher | null = null;
private fileSnapshots = new Map<string, string>(); // path -> content
private auditLog: AuditLogService;
constructor(auditLog: AuditLogService) {
this.auditLog = auditLog;
}
async startMonitoring(workspacePath: string, userId: string, sessionId?: string): Promise<void> {
this.watcher = chokidar.watch(workspacePath, {
persistent: true,
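// false emits 'add' for files that already exist at startup. This seeds the
// snapshot map, but it also logs every pre-existing file as 'create'.
ignoreInitial: false,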
ignored: /(^|[\/\\])\../, // Ignore dotfiles
});
this.watcher
.on('add', async (path) => {
const content = await fs.readFile(path, 'utf-8');
this.fileSnapshots.set(path, content);
await this.auditLog.logFileChange({
eventType: 'create',
filePath: path,
actorId: userId,
actorType: 'user',
sessionId,
afterContent: content,
reason: 'File created',
});
})
.on('change', async (path) => {
const beforeContent = this.fileSnapshots.get(path);
const afterContent = await fs.readFile(path, 'utf-8');
this.fileSnapshots.set(path, afterContent);
await this.auditLog.logFileChange({
eventType: 'modify',
filePath: path,
actorId: userId,
actorType: 'user',
sessionId,
beforeContent,
afterContent,
reason: 'File modified',
});
})
.on('unlink', async (path) => {
const beforeContent = this.fileSnapshots.get(path);
this.fileSnapshots.delete(path);
await this.auditLog.logFileChange({
eventType: 'delete',
filePath: path,
actorId: userId,
actorType: 'user',
sessionId,
beforeContent,
reason: 'File deleted',
});
});
}
async stopMonitoring(): Promise<void> {
if (this.watcher) {
await this.watcher.close(); // close() returns a Promise in chokidar v3 and later
this.watcher = null;
}
this.fileSnapshots.clear();
}
}
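The monitor above handles `add`, `change`, and `unlink`, but the schema also defines `file:move` and `file:rename`. chokidar reports a move as an `unlink` of the old path and an `add` of the new one, so moves have to be inferred. The following is a heuristic sketch that pairs the two by content hash within a short time window. `RawFsEvent`, `MoveEvent`, `hashContent`, and `detectMoves` are hypothetical names, and the sketch assumes the events arrive in time order.

```typescript
import { createHash } from 'crypto';

interface RawFsEvent {
  kind: 'add' | 'unlink';
  path: string;
  contentHash: string; // for 'unlink', the hash of the last snapshot of the file
  at: number;          // Unix timestamp (ms)
}

interface MoveEvent {
  from: string;
  to: string;
  at: number;
}

// Hashes file content so that an unlink and an add can be compared cheaply.
function hashContent(content: string): string {
  return createHash('sha256').update(content).digest('hex');
}

// Pairs each 'add' with an earlier unmatched 'unlink' that has the same content
// hash and happened at most windowMs before it. Unpaired events are left for the
// normal create/delete handling.
function detectMoves(events: RawFsEvent[], windowMs: number): MoveEvent[] {
  const moves: MoveEvent[] = [];
  const unmatchedUnlinks: RawFsEvent[] = [];
  for (const ev of events) {
    if (ev.kind === 'unlink') {
      unmatchedUnlinks.push(ev);
      continue;
    }
    const idx = unmatchedUnlinks.findIndex(
      (u) => u.contentHash === ev.contentHash && ev.at >= u.at && ev.at - u.at <= windowMs
    );
    if (idx !== -1) {
      const [unlinked] = unmatchedUnlinks.splice(idx, 1);
      moves.push({ from: unlinked.path, to: ev.path, at: ev.at });
    }
  }
  return moves;
}
```

This is a heuristic: two identical files deleted and created close together would be reported as a move, and a file edited while being moved would not be.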
3. Integration with Existing Services
// Example: Integrate with the LLM service
class llmService {
constructor(private auditLog: AuditLogService) {}
async chatCompletion(
userId: string,
model: string,
messages: Message[],
sessionId?: string
): Promise<string> {
const startTime = Date.now();
try {
// Call the LLM
const response = await this.callllmAPI(model, messages);
const latency = Date.now() - startTime;
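// Log usage. A failed audit write is reported here so that it neither discards
// a successful response nor falls into the catch block below, which would
// record the call as failed.
await this.auditLog.logllmUsage({
provider: this.getProvider(model),
modelId: model,
userId,
sessionId,
promptTokens: this.countTokens(messages),
completionTokens: this.countTokens(response),
costUsd: this.calculateCost(model, messages, response),
latencyMs: latency,
success: true,
}).catch((auditError: unknown) => {
console.error('[AUDIT] Failed to record LLM usage', auditError);
});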
return response;
} catch (error: any) {
// Log error
await this.auditLog.logllmUsage({
provider: this.getProvider(model),
modelId: model,
userId,
sessionId,
promptTokens: this.countTokens(messages),
completionTokens: 0,
costUsd: 0,
latencyMs: Date.now() - startTime,
success: false,
errorMessage: error.message,
});
throw error;
}
}
}
Rationale
Why FoundationDB for Audit Logs?
Performance:
- ✅ Fast writes (< 10ms)
- ✅ Efficient range queries (time-series)
- ✅ Secondary indexes (actor, entity, type)
Immutability:
- ✅ Append-only by convention (AuditLogService exposes no update or delete path)
- ✅ Checksum chain (tamper detection)
- ✅ Previous checksum linking
Scalability:
- ✅ Horizontal scaling
- ✅ Handle millions of events
- ✅ Efficient compaction
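The range-query point above depends on keys sorting in timestamp order. A minimal sketch of how a time-window query could be turned into a key range over the `audit:by_time:` index follows. `timeKey`, `timeRange`, and `rangeScan` are hypothetical helpers, the array filter stands in for an ordered range read, and the zero-padding is an assumption added here (the keys written by `saveEvent` embed the raw timestamp).

```typescript
// Pad timestamps to a fixed width so lexicographic key order matches numeric order.
const TS_WIDTH = 15;

function padTimestamp(timestamp: number): string {
  return String(timestamp).padStart(TS_WIDTH, '0');
}

// Key layout assumed for the sketch: audit:by_time:<padded timestamp>:<event_id>
function timeKey(timestamp: number, eventId: string): string {
  return `audit:by_time:${padTimestamp(timestamp)}:${eventId}`;
}

// Half-open key range [begin, end) covering startTime <= timestamp <= endTime.
function timeRange(startTime: number, endTime: number): { begin: string; end: string } {
  return {
    begin: `audit:by_time:${padTimestamp(startTime)}:`,
    end: `audit:by_time:${padTimestamp(endTime + 1)}:`,
  };
}

// Stand-in for a range read against an ordered key-value store.
function rangeScan(sortedKeys: string[], begin: string, end: string): string[] {
  return sortedKeys.filter((key) => key >= begin && key < end);
}
```

Only keys inside the window are touched, which is what makes the "recent events" latency target plausible without scanning the whole log.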
Why Checksum Chaining?
Tamper Detection:
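- ✅ Any modification to a hashed field breaks the chain
- ✅ Events deleted from the middle of the chain are detectable (truncating the tail is not, unless the latest checksum is also recorded elsewhere)
- ✅ Collision-resistant hashing (SHA-256)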
Compliance:
- ✅ SOC 2 requirement (immutable logs)
- ✅ Audit trail integrity
- ✅ Legal defensibility
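The tamper-detection claims above can be checked with a verifier along these lines. This is a minimal sketch, assuming the events are supplied in write order and that checksums were computed over the same fields, in the same order, as `calculateChecksum` in the service. `ChainedEvent`, `computeChecksum`, and `verifyChain` are illustrative names.

```typescript
import { createHash } from 'crypto';

// Only the fields that calculateChecksum hashes, plus the two chain fields.
interface ChainedEvent {
  event_id: string;
  event_type: string;
  actor_id: string;
  timestamp: number;
  entity_id: string;
  before_state?: unknown;
  after_state?: unknown;
  checksum: string;
  previous_checksum?: string;
}

// Mirrors the field order used by AuditLogService.calculateChecksum.
function computeChecksum(e: ChainedEvent): string {
  const data = JSON.stringify({
    event_id: e.event_id,
    event_type: e.event_type,
    actor_id: e.actor_id,
    timestamp: e.timestamp,
    entity_id: e.entity_id,
    before_state: e.before_state,
    after_state: e.after_state,
    previous_checksum: e.previous_checksum,
  });
  return createHash('sha256').update(data).digest('hex');
}

// Returns the index of the first event that fails verification, or -1 if the
// chain is intact. An event fails if its stored checksum does not match its
// content, or if it does not link to the checksum of the event before it.
function verifyChain(events: ChainedEvent[]): number {
  for (let i = 0; i < events.length; i++) {
    const current = events[i];
    if (computeChecksum(current) !== current.checksum) return i;
    if (i > 0 && current.previous_checksum !== events[i - 1].checksum) return i;
  }
  return -1;
}
```

A verifier like this detects edits and mid-chain deletions. It cannot tell whether the most recent events were removed, and it offers no protection against a writer who can recompute every checksum after the edit, so the latest checksum would need to be anchored somewhere the writer cannot change.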
Why Structured Events?
Queryability:
- ✅ Fast filtering by type, actor, time
- ✅ Complex queries (who changed what when)
- ✅ Analytics and reporting
Flexibility:
- ✅ JSON metadata for extensibility
- ✅ Type-specific schemas
- ✅ Future-proof design
Alternatives Considered
Alternative 1: Elasticsearch
Pros:
- Full-text search
- Good for log analysis
- Kibana visualization
Cons:
- ❌ More complex to operate
- ❌ Higher resource usage
- ❌ Not immutable by design
Rejected: FDB is simpler and faster
Alternative 2: PostgreSQL
Pros:
- SQL queries
- Well-known
- Good tooling
Cons:
- ❌ Slower writes
- ❌ Less scalable
- ❌ No built-in immutability
Rejected: FDB is better for append-only logs
Alternative 3: Cloud Logging Only
Pros:
- Fully managed
- Integration with GCP
- No infrastructure
Cons:
- ❌ Vendor lock-in
- ❌ Higher cost at scale
- ❌ Less control over querying
Deferred: Use as backup, FDB as primary
Consequences
Positive
- ✅ Complete Audit Trail: Every action logged
- ✅ Tamper-Evident: Checksum chain makes tampering detectable
- ✅ Fast Queries: Optimized indexes for common queries
- ✅ Compliance: GDPR, SOC 2 ready
- ✅ Attribution: Clear who/what/when/why/how
- ✅ File History: Complete change tracking with diffs
Negative
- ❌ Storage Growth: Audit logs grow unbounded
- ❌ Performance Overhead: 5-10ms per operation
- ❌ Complexity: More code to maintain
- ❌ Query Complexity: Time-series queries can be slow
Mitigation
Storage Growth:
- Archive logs older than 90 days to Cloud Storage
- Compress archived logs
- Retention policy (delete after 7 years)
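The storage mitigations above amount to sorting each event into one of three tiers by age. A minimal sketch, assuming the 90-day and 7-year thresholds from the list, with 7 years approximated as 7 × 365 days. `RetentionTier` and `retentionTier` are hypothetical names.

```typescript
type RetentionTier = 'hot' | 'archive' | 'expired';

const DAY_MS = 24 * 60 * 60 * 1000;
const ARCHIVE_AFTER_MS = 90 * DAY_MS;      // older than 90 days: move to Cloud Storage
const DELETE_AFTER_MS = 7 * 365 * DAY_MS;  // older than ~7 years: eligible for deletion

// Classifies an event by age. `now` is passed in rather than read from the clock
// so the function is deterministic and easy to test.
function retentionTier(eventTimestamp: number, now: number): RetentionTier {
  const ageMs = now - eventTimestamp;
  if (ageMs > DELETE_AFTER_MS) return 'expired';
  if (ageMs > ARCHIVE_AFTER_MS) return 'archive';
  return 'hot';
}
```

A periodic job could apply this to the time index and act on everything that is no longer `hot`. Deleting `expired` events removes the start of the checksum chain, so the job would need to retain the checksum of the last deleted event as the new chain root.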
Performance Overhead:
- Async logging (don't block operations)
- Batch inserts for high-volume events
- Cache recent events in memory
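The first two mitigations above (asynchronous logging and batch inserts) can be combined in a small buffer that accepts events without blocking the caller and hands them to storage in groups. This is a sketch under stated assumptions: `AuditBatchBuffer` is a hypothetical class, and `writeBatch` stands in for whatever batched write the storage layer provides.

```typescript
class AuditBatchBuffer<T> {
  private pending: T[] = [];

  constructor(
    private readonly maxBatchSize: number,
    private readonly writeBatch: (batch: T[]) => Promise<void>
  ) {}

  // Queues an event and returns immediately; a full batch triggers a flush
  // in the background so the calling operation is not blocked on storage.
  enqueue(event: T): void {
    this.pending.push(event);
    if (this.pending.length >= this.maxBatchSize) {
      this.flush().catch(() => {
        // The failed batch was put back in the queue; the next flush retries it.
      });
    }
  }

  // Writes everything queued so far as one batch. On failure the batch is
  // re-queued ahead of newer events so ordering is preserved.
  async flush(): Promise<void> {
    if (this.pending.length === 0) return;
    const batch = this.pending;
    this.pending = [];
    try {
      await this.writeBatch(batch);
    } catch (err) {
      this.pending = batch.concat(this.pending);
      throw err;
    }
  }
}
```

The trade-off is durability: events still in the buffer are lost if the process crashes, which weakens the "every action logged" requirement. A timer-driven `flush()` and a flush on shutdown would narrow that window but not close it.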
Complexity:
- Helper methods for common log types
- Integration in base classes
- Automated testing
Query Complexity:
- Pre-compute common queries
- Use materialized views
- Optimize indexes
Implementation Plan
Phase 1: Core Audit Service ✅
- AuditLogService implementation
- Event schema design
- Checksum calculation
- Basic event types
Phase 2: File Change Monitoring 🔲
- chokidar integration
- Diff generation
- File snapshot system
- OPFS watcher
Phase 3: Service Integration 🔲
- LLM service integration
- User service integration
- Agent system integration
- Session service integration
Phase 4: Querying & UI 🔲
- Query API endpoints
- Audit log viewer widget
- File history viewer
- User activity dashboard
Phase 5: Compliance 🔲
- Retention policy
- Archival to Cloud Storage
- GDPR export
- SOC 2 reporting
Success Metrics
Performance:
- < 10ms log write latency
- < 100ms query latency (recent events)
- 10M+ events supported
Coverage:
- 100% of file operations logged
- 100% of user actions logged
- 100% of LLM queries logged
Compliance:
- Immutability verification (checksum chain)
- GDPR-compliant data export
- SOC 2 audit trail
Related Decisions
- ADR-021: User Management - User attribution
- ADR-018: Filesystem - File change tracking
- ADR-004: FoundationDB - Storage backend
Status: ✅ Accepted
Next Review: 2025-11-06 (1 month)
Last Updated: 2025-10-06