ADR-003-v4: Multi-Tenant Architecture - Part 2 (Technical)

Document Specification Block

Document: ADR-003-v4-multi-tenant-architecture-part2-technical
Version: 1.0.0
Purpose: Constrain AI implementation with exact multi-tenant architecture specifications
Audience: AI agents, developers implementing the system
Date Created: 2025-08-30
Date Modified: 2025-08-30
Status: DRAFT

1. Technical Requirements

1.1 Constraints

MUST use FoundationDB key prefixing for tenant isolation
MUST implement tenant validation in all data operations
MUST scale without a hard-coded limit on the number of tenants (see 7.1 for the concurrency target)
MUST prevent cross-tenant data access at database level
MUST maintain consistent tenant context throughout request lifecycle

1.2 Dependencies

FoundationDB 7.1+ cluster
JWT authentication system
Actix-web HTTP framework
serde for serialization

2. Component Architecture

2.1 System Components

2.2 Data Flow

3. Implementation Specifications

3.1 API Endpoints

/api/v1/{resource}:
  all_methods:
    headers:
      Authorization: "Bearer {jwt_token}"
    context:
      tenant_id: extracted_from_jwt
    validation:
      - tenant_id present in JWT
      - tenant_id matches resource access
      - user authorized for tenant
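
The three validation rules above can be sketched as one function that applies them in order. This is a minimal, dependency-free illustration under stated assumptions, not the project's middleware: `Claims`, `RequestError`, and `validate_request` are hypothetical names, tenant and user IDs are plain strings standing in for UUIDs, and the membership set stands in for the user-tenant association lookup.

```rust
use std::collections::HashSet;

/// Hypothetical, already-verified JWT claims (signature checking is out of scope here).
#[derive(Debug, Clone, PartialEq)]
pub struct Claims {
    pub tenant_id: Option<String>,
    pub user_id: String,
}

/// Hypothetical error type, one variant per validation rule.
#[derive(Debug, PartialEq)]
pub enum RequestError {
    MissingTenant,
    TenantMismatch,
    NotAMember,
}

/// Runs the three checks in order and returns the tenant ID to use for the request.
pub fn validate_request<'a>(
    claims: &'a Claims,
    resource_tenant: &str,
    memberships: &HashSet<(String, String)>, // (tenant_id, user_id) pairs
) -> Result<&'a str, RequestError> {
    // 1. tenant_id present in JWT
    let tenant_id = claims
        .tenant_id
        .as_deref()
        .filter(|t| !t.is_empty())
        .ok_or(RequestError::MissingTenant)?;

    // 2. tenant_id matches the tenant that owns the requested resource
    if tenant_id != resource_tenant {
        return Err(RequestError::TenantMismatch);
    }

    // 3. user authorized for tenant
    let pair = (tenant_id.to_string(), claims.user_id.clone());
    if !memberships.contains(&pair) {
        return Err(RequestError::NotAMember);
    }

    Ok(tenant_id)
}
```

Returning the validated tenant ID keeps later code from reading the tenant from any other source, such as a path or query parameter.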

3.2 Data Models

use uuid::Uuid;
use serde::{Deserialize, Serialize};
use chrono::{DateTime, Utc};

#[derive(Serialize, Deserialize, Clone, Debug)]
pub struct TenantContext {
    pub tenant_id: Uuid,
    pub user_id: Uuid,
    pub permissions: Vec<String>,
}

impl TenantContext {
    pub fn new(tenant_id: Uuid, user_id: Uuid) -> Self {
        Self {
            tenant_id,
            user_id,
            permissions: vec![],
        }
    }
    
    pub fn validate(&self) -> Result<(), TenantError> {
        if self.tenant_id.is_nil() {
            return Err(TenantError::InvalidTenantId);
        }
        Ok(())
    }
}

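// Minimal form of the error type; 4.3 Error Handling lists the full enum.
// Keep a single definition per module.
#[derive(thiserror::Error, Debug)]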
pub enum TenantError {
    #[error("Invalid tenant ID")]
    InvalidTenantId,
    #[error("Tenant not found: {0}")]
    NotFound(Uuid),
    #[error("Access denied for tenant: {0}")]
    AccessDenied(Uuid),
}

3.3 Database Schema

FoundationDB Key Patterns:
/{tenant_id}/users/{user_id} → UserModel
/{tenant_id}/projects/{project_id} → ProjectModel
/{tenant_id}/tasks/{task_id} → TaskModel
/{tenant_id}/agents/{agent_id} → AgentModel
/{tenant_id}/audit_events/{timestamp}:{uuid} → AuditModel

Global Keys (No Tenant Prefix):
/global/users/{user_id} → UserModel (cross-tenant identity)
/global/tenants/{tenant_id} → TenantModel
/global/cost_models/{provider}/{resource} → CostModel
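
The key patterns above rely on ordered keys: every key of one tenant and resource type shares a prefix, so a range read bounded by that prefix cannot reach another tenant. The sketch below illustrates this with a `BTreeMap` standing in for the database. The function names are hypothetical, short strings such as "t1" stand in for UUIDs, and the end key is computed by incrementing the last byte of the prefix, which is one common way to bound a prefix scan and an assumption here rather than the method this document prescribes.

```rust
use std::collections::BTreeMap;

/// Builds a tenant-scoped key following the `/{tenant_id}/{resource}/{id}` pattern.
pub fn tenant_key(tenant_id: &str, resource: &str, id: &str) -> Vec<u8> {
    format!("/{}/{}/{}", tenant_id, resource, id).into_bytes()
}

/// Prefix shared by every key of one resource type within one tenant.
pub fn tenant_prefix(tenant_id: &str, resource: &str) -> Vec<u8> {
    format!("/{}/{}/", tenant_id, resource).into_bytes()
}

/// Smallest key that is greater than every key starting with `prefix`:
/// drop trailing 0xFF bytes, then increment the last remaining byte.
/// Returns `None` when no such key exists (empty or all-0xFF prefix).
pub fn prefix_end(prefix: &[u8]) -> Option<Vec<u8>> {
    let mut end = prefix.to_vec();
    while let Some(&last) = end.last() {
        if last == 0xFF {
            end.pop();
        } else {
            *end.last_mut()? += 1;
            return Some(end);
        }
    }
    None
}

/// Scans an ordered map (a stand-in for the database) within one prefix.
pub fn scan_prefix<'a>(
    store: &'a BTreeMap<Vec<u8>, String>,
    prefix: &[u8],
) -> Vec<&'a String> {
    match prefix_end(prefix) {
        Some(end) => store.range(prefix.to_vec()..end).map(|(_, v)| v).collect(),
        None => Vec::new(),
    }
}
```

Because the prefix ends with `/`, a scan over `tasks` also excludes a sibling resource type such as `tasks_archive` inside the same tenant.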

4. Core Implementation

4.1 Service Implementation

use actix_web::{web, HttpResponse, Result};
use foundationdb::Database;
use uuid::Uuid;

pub struct MultiTenantService {
    db: Database,
}

impl MultiTenantService {
    pub fn new(db: Database) -> Self {
        Self { db }
    }
    
    // Core tenant isolation method
    pub async fn execute_with_tenant<T, F, Fut>(
        &self,
        tenant_context: &TenantContext,
        operation: F,
    ) -> Result<T, TenantError>
    where
        F: FnOnce(&TenantContext, &Database) -> Fut,
        Fut: std::future::Future<Output = Result<T, TenantError>>,
    {
        // Validate tenant context
        tenant_context.validate()?;
        
        // Verify tenant exists and user has access
        self.validate_tenant_access(tenant_context).await?;
        
        // Execute operation with isolated context
        operation(tenant_context, &self.db).await
    }
    
    async fn validate_tenant_access(
        &self,
        context: &TenantContext,
    ) -> Result<(), TenantError> {
        let txn = self.db.create_trx()?;
        
        // Check tenant exists
        let tenant_key = format!("/global/tenants/{}", context.tenant_id);
        let tenant_exists = txn.get(tenant_key.as_bytes(), false).await?.is_some();
        
        if !tenant_exists {
            return Err(TenantError::NotFound(context.tenant_id));
        }
        
        // Check user-tenant association
        let assoc_key = format!(
            "/{}/user_associations/{}", 
            context.tenant_id, 
            context.user_id
        );
        let has_access = txn.get(assoc_key.as_bytes(), false).await?.is_some();
        
        if !has_access {
            return Err(TenantError::AccessDenied(context.tenant_id));
        }
        
        Ok(())
    }
}

4.2 Repository Pattern

use foundationdb::Database;
use serde::{Deserialize, Serialize};
use uuid::Uuid;

pub struct TenantAwareRepository<T> {
    db: Database,
    _phantom: std::marker::PhantomData<T>,
}

impl<T> TenantAwareRepository<T>
where
    T: Serialize + for<'de> Deserialize<'de>,
{
    pub fn new(db: Database) -> Self {
        Self {
            db,
            _phantom: std::marker::PhantomData,
        }
    }
    
    // Core method: All keys MUST be tenant-prefixed
    fn tenant_key(&self, tenant_id: &Uuid, resource_type: &str, id: &Uuid) -> String {
        format!("/{}/{}/{}", tenant_id, resource_type, id)
    }
    
    pub async fn create(
        &self,
        tenant_id: &Uuid,
        resource_type: &str,
        id: &Uuid,
        data: &T,
    ) -> Result<(), TenantError> {
        let txn = self.db.create_trx()?;
        let key = self.tenant_key(tenant_id, resource_type, id);
        // Propagate serialization failures instead of panicking
        let value = serde_json::to_vec(data)?;
        
        // FoundationDB keys and values are byte slices
        txn.set(key.as_bytes(), &value);
        txn.commit()
            .await
            .map_err(|e| TenantError::DatabaseError(e.into()))?;
        Ok(())
    }
    
    pub async fn get(
        &self,
        tenant_id: &Uuid,
        resource_type: &str,
        id: &Uuid,
    ) -> Result<Option<T>, TenantError> {
        let txn = self.db.create_trx()?;
        let key = self.tenant_key(tenant_id, resource_type, id);
        
        match txn.get(key.as_bytes(), false).await? {
            // A stored value that fails to deserialize becomes an error, not a panic
            Some(bytes) => Ok(Some(serde_json::from_slice(&bytes)?)),
            None => Ok(None),
        }
    }
    
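    // Range query within tenant boundary
    pub async fn get_range(
        &self,
        tenant_id: &Uuid,
        resource_type: &str,
    ) -> Result<Vec<T>, TenantError> {
        let txn = self.db.create_trx()?;
        // Every key of this resource type starts with the prefix. '~' sorts after
        // all characters of a hyphenated UUID, so [prefix, prefix~) covers them all
        let prefix = format!("/{}/{}/", tenant_id, resource_type);
        let end = format!("{}~", prefix);
        
        // Assumes the foundationdb crate's RangeOption API. A single get_range
        // call returns one batch; large result sets need repeated calls
        let opt = foundationdb::RangeOption::from((prefix.as_bytes(), end.as_bytes()));
        let range = txn.get_range(&opt, 1, false).await?;
        
        let mut results = Vec::new();
        for kv in range.iter() {
            results.push(serde_json::from_slice(kv.value())?);
        }
        
        Ok(results)
    }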
}

4.3 Error Handling

use actix_web::HttpResponse;
use uuid::Uuid;

#[derive(thiserror::Error, Debug)]
pub enum TenantError {
    #[error("Invalid tenant ID")]
    InvalidTenantId,
    
    #[error("Tenant not found: {0}")]
    NotFound(Uuid),
    
    #[error("Access denied for tenant: {0}")]
    AccessDenied(Uuid),
    
    #[error("Cross-tenant access detected: attempted {attempted} from {current}")]
    CrossTenantAccess { attempted: Uuid, current: Uuid },
    
    #[error("Database error: {0}")]
    DatabaseError(#[from] foundationdb::Error),
    
    #[error("Serialization error: {0}")]
    SerializationError(#[from] serde_json::Error),
}

impl From<TenantError> for actix_web::HttpResponse {
    fn from(err: TenantError) -> Self {
        match err {
            TenantError::NotFound(_) => HttpResponse::NotFound().json("Tenant not found"),
            TenantError::AccessDenied(_) => HttpResponse::Forbidden().json("Access denied"),
            TenantError::CrossTenantAccess { .. } => HttpResponse::Forbidden().json("Invalid access"),
            _ => HttpResponse::InternalServerError().json("Internal error"),
        }
    }
}
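
The `CrossTenantAccess` variant above implies a check that compares the tenant embedded in a key with the tenant of the current request. A minimal sketch of such a check follows, assuming keys shaped like `/{tenant_id}/{resource}/{id}`. `KeyGuardError`, `tenant_of_key`, and `ensure_same_tenant` are hypothetical names, and IDs are plain strings rather than UUIDs so the example needs no external crates.

```rust
/// Simplified stand-in for the CrossTenantAccess error; IDs are strings here.
#[derive(Debug, PartialEq)]
pub enum KeyGuardError {
    MalformedKey,
    CrossTenantAccess { attempted: String, current: String },
}

/// Extracts the tenant segment from a key shaped like `/{tenant_id}/{resource}/{id}`.
pub fn tenant_of_key(key: &str) -> Option<&str> {
    let rest = key.strip_prefix('/')?;
    let (tenant, remainder) = rest.split_once('/')?;
    if tenant.is_empty() || remainder.is_empty() {
        None
    } else {
        Some(tenant)
    }
}

/// Rejects any key whose tenant segment differs from the caller's tenant.
pub fn ensure_same_tenant(current: &str, key: &str) -> Result<(), KeyGuardError> {
    let attempted = tenant_of_key(key).ok_or(KeyGuardError::MalformedKey)?;
    if attempted != current {
        return Err(KeyGuardError::CrossTenantAccess {
            attempted: attempted.to_string(),
            current: current.to_string(),
        });
    }
    Ok(())
}
```

Keys under `/global/` would be reported as cross-tenant by this guard, so global lookups would need a separate, explicitly reviewed code path.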

5. Testing Requirements

5.1 Unit Tests

#[cfg(test)]
mod tests {
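    use super::*;
    use serde::{Deserialize, Serialize};
    use uuid::Uuid;
    
    // Minimal payload type used by the isolation tests
    #[derive(Serialize, Deserialize, Clone, Debug, PartialEq)]
    struct TestModel {
        value: String,
    }
    
    #[tokio::test]
    async fn test_tenant_isolation() {
        let repo = setup_test_repo().await;
        let tenant_a = Uuid::new_v4();
        let tenant_b = Uuid::new_v4();
        let resource_id = Uuid::new_v4();
        
        let data_a = TestModel { value: "tenant-a-data".to_string() };
        let data_b = TestModel { value: "tenant-b-data".to_string() };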
        
        // Store data for both tenants with same resource ID
        repo.create(&tenant_a, "test", &resource_id, &data_a).await.unwrap();
        repo.create(&tenant_b, "test", &resource_id, &data_b).await.unwrap();
        
        // Verify isolation
        let result_a = repo.get(&tenant_a, "test", &resource_id).await.unwrap().unwrap();
        let result_b = repo.get(&tenant_b, "test", &resource_id).await.unwrap().unwrap();
        
        assert_eq!(result_a.value, "tenant-a-data");
        assert_eq!(result_b.value, "tenant-b-data");
        
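        // Verify no cross-tenant access: a resource that exists only under
        // tenant A must not be readable through tenant B's prefix
        let only_a = Uuid::new_v4();
        repo.create(&tenant_a, "test", &only_a, &data_a).await.unwrap();
        assert!(repo.get(&tenant_b, "test", &only_a).await.unwrap().is_none());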
    }
    
    #[tokio::test]
    async fn test_range_query_isolation() {
        let repo = setup_test_repo().await;
        let tenant_a = Uuid::new_v4();
        let tenant_b = Uuid::new_v4();
        
        // Create multiple resources for each tenant
        for i in 0..5 {
            let id = Uuid::new_v4();
            let data_a = TestModel { value: format!("tenant-a-{}", i) };
            let data_b = TestModel { value: format!("tenant-b-{}", i) };
            
            repo.create(&tenant_a, "test", &id, &data_a).await.unwrap();
            repo.create(&tenant_b, "test", &id, &data_b).await.unwrap();
        }
        
        // Verify range queries are isolated
        let results_a = repo.get_range(&tenant_a, "test").await.unwrap();
        let results_b = repo.get_range(&tenant_b, "test").await.unwrap();
        
        assert_eq!(results_a.len(), 5);
        assert_eq!(results_b.len(), 5);
        
        // Verify no data leakage
        for item in &results_a {
            assert!(item.value.starts_with("tenant-a"));
        }
        for item in &results_b {
            assert!(item.value.starts_with("tenant-b"));
        }
    }
    
    #[tokio::test]
    async fn test_tenant_context_validation() {
        // Valid context
        let valid_context = TenantContext::new(Uuid::new_v4(), Uuid::new_v4());
        assert!(valid_context.validate().is_ok());
        
        // Invalid context
        let invalid_context = TenantContext::new(Uuid::nil(), Uuid::new_v4());
        assert!(matches!(invalid_context.validate(), Err(TenantError::InvalidTenantId)));
    }
}

5.2 Integration Tests

#[tokio::test]
async fn test_api_tenant_isolation() {
    let app = test_app().await;
    let tenant_a_jwt = create_test_jwt("tenant-a", "user-1").await;
    let tenant_b_jwt = create_test_jwt("tenant-b", "user-2").await;
    
    // Create resource for tenant A
    let resp_a = app.post("/api/v1/projects")
        .bearer_auth(&tenant_a_jwt)
        .json(&json!({
            "name": "Project A"
        }))
        .send()
        .await
        .unwrap();
    
    assert_eq!(resp_a.status(), 201);
    let project_a: ProjectModel = resp_a.json().await.unwrap();
    
    // Try to access from tenant B (should fail)
    let resp_b = app.get(&format!("/api/v1/projects/{}", project_a.id))
        .bearer_auth(&tenant_b_jwt)
        .send()
        .await
        .unwrap();
    
    assert_eq!(resp_b.status(), 404); // Not found due to tenant isolation
    
    // Verify tenant A can still access
    let resp_a_get = app.get(&format!("/api/v1/projects/{}", project_a.id))
        .bearer_auth(&tenant_a_jwt)
        .send()
        .await
        .unwrap();
    
    assert_eq!(resp_a_get.status(), 200);
}

6. Security Controls

6.1 Authentication Middleware

use actix_web::dev::ServiceRequest;
use actix_web::{web, HttpMessage};
use actix_web_httpauth::extractors::bearer::BearerAuth;
use actix_web_httpauth::middleware::HttpAuthentication;

pub async fn jwt_validator(
    req: ServiceRequest,
    credentials: BearerAuth,
) -> Result<ServiceRequest, actix_web::Error> {
    let token = credentials.token();
    
    // Validate JWT and extract claims
    let claims = jwt::validate_token(token)
        .map_err(|_| AuthError::InvalidToken)?;
    
    // Create tenant context
    let tenant_context = TenantContext {
        tenant_id: claims.tenant_id,
        user_id: claims.user_id,
        permissions: claims.permissions,
    };
    
    // Add to request extensions
    req.extensions_mut().insert(tenant_context);
    
    Ok(req)
}

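// Apply to all API routes. ServiceConfig has no wrap() method, so the
// middleware is attached to a scope and handlers are registered on that scope
pub fn configure_auth(cfg: &mut web::ServiceConfig) {
    let auth = HttpAuthentication::bearer(jwt_validator);
    cfg.service(web::scope("/api/v1").wrap(auth));
}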

6.2 Authorization Guards

use actix_web::{web, Error, HttpMessage, HttpRequest, HttpResponse};

pub struct TenantGuard;

impl TenantGuard {
    pub fn extract_context(req: &HttpRequest) -> Result<TenantContext, Error> {
        req.extensions()
            .get::<TenantContext>()
            .cloned()
            .ok_or_else(|| AuthError::MissingContext.into())
    }
    
    pub fn require_permission(
        context: &TenantContext,
        permission: &str,
    ) -> Result<(), Error> {
        if !context.permissions.iter().any(|p| p == permission) {
            return Err(AuthError::InsufficientPermissions.into());
        }
        Ok(())
    }
}

// Usage in handlers
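// ProjectService is assumed to be registered as application data
pub async fn create_project(
    req: HttpRequest,
    project_service: web::Data<ProjectService>,
    data: web::Json<CreateProjectRequest>,
) -> Result<HttpResponse, Error> {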
    let context = TenantGuard::extract_context(&req)?;
    TenantGuard::require_permission(&context, "create_project")?;
    
    // Implementation with guaranteed tenant isolation
    let project = project_service
        .create_with_tenant(&context, data.into_inner())
        .await?;
    
    Ok(HttpResponse::Created().json(project))
}

7. Performance Specifications

7.1 Targets

API Response: p99 < 100ms for tenant-scoped operations
Database Query: p99 < 50ms for single-tenant range scans
Concurrent Tenants: Support 10,000+ active tenants
Memory Overhead: < 1MB per active tenant
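
The p99 targets above can be checked against recorded latencies with a percentile calculation. The following is a minimal sketch using the nearest-rank definition. `percentile` and `meets_p99_target` are hypothetical helper names, and a production system would more likely read percentiles from its metrics backend than compute them in process.

```rust
use std::time::Duration;

/// Nearest-rank percentile: the smallest sample such that at least `pct` percent
/// of all samples are less than or equal to it. Returns `None` for empty input
/// or a percentile outside 1..=100.
pub fn percentile(samples: &[Duration], pct: u32) -> Option<Duration> {
    if samples.is_empty() || pct == 0 || pct > 100 {
        return None;
    }
    let mut sorted = samples.to_vec();
    sorted.sort();
    // 1-based rank = ceil(pct * n / 100), computed with integer arithmetic
    let rank = (pct as usize * sorted.len() + 99) / 100;
    Some(sorted[rank - 1])
}

/// True when the 99th percentile of `samples` is strictly below `target`.
pub fn meets_p99_target(samples: &[Duration], target: Duration) -> bool {
    matches!(percentile(samples, 99), Some(p99) if p99 < target)
}
```

For example, `meets_p99_target(&api_latencies, Duration::from_millis(100))` would express the API response target, where `api_latencies` is an assumed slice of measured request durations.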

7.2 Optimization Requirements

use std::collections::HashMap;
use std::time::Duration;

use foundationdb::Database;
use tokio::sync::RwLock;
use uuid::Uuid;

// Tenant-aware caching
pub struct TenantCache<T> {
    cache: RwLock<HashMap<(Uuid, String), T>>,
    ttl: Duration,
}

impl<T> TenantCache<T>
where
    T: Clone,
{
    pub async fn get(&self, tenant_id: &Uuid, key: &str) -> Option<T> {
        let cache = self.cache.read().await;
        cache.get(&(*tenant_id, key.to_string())).cloned()
    }
    
    pub async fn set(&self, tenant_id: &Uuid, key: &str, value: T) {
        let mut cache = self.cache.write().await;
        cache.insert((*tenant_id, key.to_string()), value);
    }
    
    // Evict entire tenant from cache
    pub async fn evict_tenant(&self, tenant_id: &Uuid) {
        let mut cache = self.cache.write().await;
        cache.retain(|(tid, _), _| tid != tenant_id);
    }
}

// Connection pooling per tenant
pub struct TenantConnectionPool {
    pools: RwLock<HashMap<Uuid, Database>>,
}

impl TenantConnectionPool {
    pub async fn get_connection(&self, tenant_id: &Uuid) -> Database {
        let pools = self.pools.read().await;
        if let Some(db) = pools.get(tenant_id) {
            return db.clone();
        }
        
        drop(pools);
        
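        // Take the write lock and check again: another task may have inserted
        // a connection between releasing the read lock and acquiring this one
        let mut pools = self.pools.write().await;
        if let Some(db) = pools.get(tenant_id) {
            return db.clone();
        }
        
        // Create new connection for tenant
        let db = Database::default();
        pools.insert(*tenant_id, db.clone());
        db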
    }
}
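
The `TenantCache` struct above declares a `ttl` field, and one way to enforce it is to store an insertion time next to each value and treat older entries as misses. The sketch below shows that idea under simplifying assumptions: it uses the standard library's blocking `RwLock` instead of tokio's, a `u128` in place of `Uuid`, and the hypothetical name `TtlTenantCache`, so it illustrates the technique rather than the cache this document specifies.

```rust
use std::collections::HashMap;
use std::sync::RwLock;
use std::time::{Duration, Instant};

/// Tenant-scoped cache whose entries expire after `ttl`.
pub struct TtlTenantCache<T> {
    entries: RwLock<HashMap<(u128, String), (Instant, T)>>,
    ttl: Duration,
}

impl<T: Clone> TtlTenantCache<T> {
    pub fn new(ttl: Duration) -> Self {
        Self { entries: RwLock::new(HashMap::new()), ttl }
    }

    /// Stores a value under the (tenant, key) pair with the current time.
    pub fn set(&self, tenant_id: u128, key: &str, value: T) {
        // unwrap: a poisoned lock means another thread panicked while holding it
        let mut entries = self.entries.write().unwrap();
        entries.insert((tenant_id, key.to_string()), (Instant::now(), value));
    }

    /// Returns the value only if it belongs to `tenant_id` and is younger than `ttl`.
    pub fn get(&self, tenant_id: u128, key: &str) -> Option<T> {
        let entries = self.entries.read().unwrap();
        let (stored_at, value) = entries.get(&(tenant_id, key.to_string()))?;
        if stored_at.elapsed() < self.ttl {
            Some(value.clone())
        } else {
            None
        }
    }

    /// Drops every entry of one tenant, for example when the tenant is deleted.
    pub fn evict_tenant(&self, tenant_id: u128) {
        let mut entries = self.entries.write().unwrap();
        entries.retain(|(tid, _), _| *tid != tenant_id);
    }
}
```

Expired entries stay in the map until they are overwritten or evicted, so a long-running service would also want a periodic sweep or a size bound.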

8. Deployment Configuration

8.1 Container Specification

FROM rust:1.75-alpine AS builder
WORKDIR /app

# Copy manifests
COPY Cargo.toml Cargo.lock ./

# Build dependencies (cached layer)
RUN mkdir src && echo "fn main() {}" > src/main.rs
RUN cargo build --release
RUN rm -rf src

# Copy source and build
COPY src ./src
RUN touch src/main.rs
RUN cargo build --release

FROM alpine:3.18
RUN apk add --no-cache ca-certificates
COPY --from=builder /app/target/release/coditect-api /usr/local/bin/

# Multi-tenant configuration
ENV TENANT_ISOLATION_LEVEL=strict
ENV MAX_TENANTS_PER_INSTANCE=10000
ENV TENANT_CACHE_SIZE=1000

EXPOSE 8080
CMD ["coditect-api"]

8.2 Cloud Run Configuration

apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: coditect-multi-tenant-api
  annotations:
    run.googleapis.com/execution-environment: gen2
spec:
  template:
    metadata:
      annotations:
        run.googleapis.com/memory: "2Gi"
        run.googleapis.com/cpu: "2"
        autoscaling.knative.dev/maxScale: "100"
        autoscaling.knative.dev/minScale: "1"
    spec:
      containerConcurrency: 1000
      containers:
      - image: gcr.io/serene-voltage-464305-n2/coditect-api:latest
        env:
        - name: FOUNDATIONDB_CLUSTER_FILE
          value: /config/fdb.cluster
        - name: TENANT_ISOLATION_STRICT
          value: "true"
        resources:
          limits:
            memory: "2Gi"
            cpu: "2000m"

9. Monitoring & Observability

9.1 Metrics

use std::future::Future;

use lazy_static::lazy_static;
use prometheus::{CounterVec, Gauge, Histogram, HistogramOpts, Opts};
use uuid::Uuid;

// Each metric must also be registered with a prometheus Registry to be exported
lazy_static! {
    // Labeled counter: with_label_values requires a *Vec metric type
    static ref TENANT_OPERATIONS: CounterVec = CounterVec::new(
        Opts::new("coditect_tenant_operations_total", "Total tenant operations"),
        &["tenant_id", "operation"]
    ).unwrap();
    
    static ref TENANT_RESPONSE_TIME: Histogram = Histogram::with_opts(HistogramOpts::new(
        "coditect_tenant_response_duration_seconds", 
        "Tenant operation response time"
    )).unwrap();
    
    static ref ACTIVE_TENANTS: Gauge = Gauge::new(
        "coditect_active_tenants", 
        "Number of active tenants"
    ).unwrap();
    
    static ref TENANT_DATA_SIZE: Histogram = Histogram::with_opts(HistogramOpts::new(
        "coditect_tenant_data_size_bytes", 
        "Data size per tenant"
    )).unwrap();
}

// Usage in service methods
impl MultiTenantService {
    pub async fn operation_with_metrics<T>(
        &self,
        tenant_id: &Uuid,
        operation_name: &str,
        operation: impl Future<Output = Result<T, TenantError>>,
    ) -> Result<T, TenantError> {
        let timer = TENANT_RESPONSE_TIME.start_timer();
        
        let result = operation.await;
        
        timer.observe_duration();
        let tenant_label = tenant_id.to_string();
        TENANT_OPERATIONS
            .with_label_values(&[tenant_label.as_str(), operation_name])
            .inc();
        
        // ACTIVE_TENANTS is a gauge: set it from the number of distinct active
        // tenants (ACTIVE_TENANTS.set(..)) rather than incrementing it per call
        
        result
    }
}
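
A gauge such as `ACTIVE_TENANTS` is meant to report a current value, so one way to feed it is to track which tenants were seen within a time window and set the gauge to that count. The sketch below shows only the tracking part, using the standard library. `ActiveTenantTracker` and the window length are hypothetical, and `u128` stands in for `Uuid`.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Tracks when each tenant was last seen so a gauge can report distinct active tenants.
pub struct ActiveTenantTracker {
    last_seen: HashMap<u128, Instant>,
    window: Duration,
}

impl ActiveTenantTracker {
    pub fn new(window: Duration) -> Self {
        Self { last_seen: HashMap::new(), window }
    }

    /// Records activity for a tenant at time `now`.
    pub fn touch(&mut self, tenant_id: u128, now: Instant) {
        self.last_seen.insert(tenant_id, now);
    }

    /// Drops tenants idle for longer than the window and returns the number of
    /// tenants that remain, which is the value a gauge would be set to.
    pub fn active_count(&mut self, now: Instant) -> usize {
        let window = self.window;
        self.last_seen
            .retain(|_, seen| now.saturating_duration_since(*seen) <= window);
        self.last_seen.len()
    }
}
```

Passing `now` explicitly keeps the tracker deterministic in tests; a service would call `touch` per request and periodically publish `active_count(Instant::now())`.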

9.2 Logging

use tracing::{error, info, instrument, warn};

// The span carries the tenant ID only; the request payload is skipped so
// tenant-sensitive data stays out of the logs
#[instrument(skip(self, context, req), fields(tenant_id = %context.tenant_id))]
pub async fn create_resource(
    &self,
    context: &TenantContext,
    req: CreateRequest,
) -> Result<ResourceModel, TenantError> {
    info!("Creating resource for tenant");
    
    // Validate tenant context (validate_tenant_access takes the full context)
    self.validate_tenant_access(context).await
        .map_err(|e| {
            warn!("Tenant access validation failed: {}", e);
            e
        })?;
    
    // Create resource (Uuid is Copy, so no clone is needed)
    let resource = ResourceModel::new(context.tenant_id, req);
    
    // Store with tenant isolation
    self.repository
        .create(&context.tenant_id, "resources", &resource.id, &resource)
        .await
        .map_err(|e| {
            error!("Failed to store resource: {}", e);
            TenantError::from(e)
        })?;
    
    info!("Resource created successfully: {}", resource.id);
    Ok(resource)
}

10. Constraints for AI Implementation

10.1 MUST Requirements

MUST prefix all FoundationDB keys with /{tenant_id}/
MUST validate tenant context in every service method
MUST implement range query boundaries to prevent cross-tenant access
MUST use TenantContext struct for all operations
MUST include tenant_id in all log entries and metrics
MUST implement comprehensive isolation tests
MUST handle tenant creation, modification, and deletion atomically

10.2 MUST NOT Requirements

MUST NOT allow direct database access without tenant prefixing
MUST NOT cache data across tenant boundaries
MUST NOT use global/shared database connections without tenant validation
MUST NOT expose tenant IDs to unauthorized users
MUST NOT implement tenant-aware functionality without proper testing
MUST NOT bypass tenant validation in any code path
MUST NOT store tenant-sensitive data in logs or metrics

10.3 Test Coverage Requirements

Unit tests for all tenant isolation scenarios
Integration tests for API-level tenant separation
Load tests with multiple concurrent tenants
Security tests for cross-tenant access attempts
Range query tests to verify boundary enforcement
Cache isolation tests
Error handling tests for tenant validation failures

10.4 Key Implementation Patterns

// ALWAYS use this pattern for tenant operations
async fn tenant_operation<T>(
    &self,
    context: &TenantContext,
    operation: impl FnOnce(&Uuid) -> Result<T, TenantError>,
) -> Result<T, TenantError> {
    context.validate()?;
    self.validate_tenant_access(context).await?;
    operation(&context.tenant_id)
}

// ALWAYS prefix keys like this
fn make_tenant_key(tenant_id: &Uuid, resource_type: &str, id: &Uuid) -> String {
    format!("/{}/{}/{}", tenant_id, resource_type, id)
}

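// ALWAYS validate tenant boundaries in ranges
async fn tenant_range_query<T>(
    &self,
    tenant_id: &Uuid,
    resource_type: &str,
) -> Result<Vec<T>, Error> {
    // The end key extends the full prefix, including its trailing '/', so a
    // sibling resource type such as "tasks_archive" stays outside a "tasks" scan
    let start_key = format!("/{}/{}/", tenant_id, resource_type);
    let end_key = format!("{}~", start_key);
    // Query implementation with strict boundaries
    todo!("read the range [start_key, end_key) and deserialize each value")
}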
