
ADR-184: Customer Backup Architecture

Status

Proposed - 2026-01-18

Context

Problem Statement

CODITECT customers need to back up and restore their context database and Claude configuration files. The current backup solution (backup-context-db.sh) requires:

  • Direct GCS access via gcloud CLI
  • GCP project with billing enabled
  • Technical expertise to configure

This creates barriers for customers who:

  • Don't have GCP accounts
  • Lack cloud infrastructure expertise
  • Need enterprise compliance (audit trails, retention policies)
  • Want to sync context across multiple machines

Current State

Developer Machine
        │
        ▼
backup-context-db.sh
        │
        ▼  (requires gcloud auth)
gs://{PROJECT_ID}-context-backups/

Requirements

Requirement  Priority  Description
R1           Must      Customers can back up without GCP accounts
R2           Must      Multi-tenant isolation (customers can't see each other's data)
R3           Must      Integration with existing license system
R4           Should    Tiered retention by subscription plan
R5           Should    Encryption at rest
R6           Should    Audit logging
R7           Could     Automatic scheduled backups
R8           Could     Cross-machine restore

Decision

Solution Overview

Implement a Cloud-Mediated Backup System where:

  1. Customers authenticate via their CODITECT license key
  2. Backups are uploaded through api.coditect.ai
  3. Storage is managed by CODITECT in tenant-isolated GCS paths
  4. Retention is enforced based on subscription tier

Architecture

┌─────────────────────────────────────────────────────────────────┐
│                        CUSTOMER MACHINE                         │
│                                                                 │
│  ~/.coditect/                                                   │
│  ├── context-storage/                                           │
│  │   ├── context.db              ─┐                             │
│  │   ├── unified_messages.jsonl   │                             │
│  │   ├── unified_hashes.json      ├── Backup payload            │
│  │   └── unified_stats.json      ─┘                             │
│  └── licensing/                                                 │
│      └── license.json            ── API authentication          │
│                                                                 │
│  ~/.claude/                                                     │
│  ├── settings.json               ─┐                             │
│  ├── settings.local.json          ├── Claude config backup      │
│  └── statusline-config.json      ─┘                             │
│                                                                 │
│  ┌──────────────────────────────────────────────────────┐       │
│  │ $ /backup --cloud                                    │       │
│  │                                                      │       │
│  │ 1. Read license key from license.json                │       │
│  │ 2. Compress files (gzip)                             │       │
│  │ 3. POST to api.coditect.ai/api/v1/backup/upload      │       │
│  │ 4. Receive backup_id and confirmation                │       │
│  └──────────────────────────────────────────────────────┘       │
└────────────────────────────────┬────────────────────────────────┘
                                 │
                                 │ HTTPS + License Key Auth
                                 ▼
┌─────────────────────────────────────────────────────────────────┐
│                         api.coditect.ai                         │
│                                                                 │
│  ┌──────────────────────────────────────────────────────┐       │
│  │ Authentication                                       │       │
│  │   • Validate license key                             │       │
│  │   • Extract tenant_id, user_id                       │       │
│  │   • Check license tier for quota                     │       │
│  └──────────────────────────────────────────────────────┘       │
│                                │                                │
│                                ▼                                │
│  ┌──────────────────────────────────────────────────────┐       │
│  │ BackupService                                        │       │
│  │                                                      │       │
│  │ create_backup(files):                                │       │
│  │   1. Validate file sizes against tier quota          │       │
│  │   2. Generate storage path: /{tenant}/{user}/{ts}/   │       │
│  │   3. Encrypt files with tenant key                   │       │
│  │   4. Upload to GCS                                   │       │
│  │   5. Create Backup record in PostgreSQL              │       │
│  │   6. Return backup_id                                │       │
│  │                                                      │       │
│  │ list_backups():                                      │       │
│  │   • Auto-filtered by tenant (django-multitenant)     │       │
│  │                                                      │       │
│  │ get_download_url(backup_id):                         │       │
│  │   • Generate signed URL (1 hour expiry)              │       │
│  │   • Log access for audit                             │       │
│  └──────────────────────────────────────────────────────┘       │
│                                │                                │
│                                ▼                                │
│  ┌──────────────────────────────────────────────────────┐       │
│  │ PostgreSQL (Multi-Tenant)                            │       │
│  │                                                      │       │
│  │ backup_backup:                                       │       │
│  │   id, tenant_id, user_id, created_at, size_bytes,    │       │
│  │   storage_path, manifest, status, retention_until    │       │
│  └──────────────────────────────────────────────────────┘       │
└────────────────────────────────┬────────────────────────────────┘
                                 │
                                 ▼
┌─────────────────────────────────────────────────────────────────┐
│                      Google Cloud Storage                       │
│                                                                 │
│  gs://coditect-customer-backups/                                │
│  │                                                              │
│  ├── {tenant_uuid}/                      ◄── Tenant isolation   │
│  │   ├── {user_uuid}/                    ◄── User isolation     │
│  │   │   ├── 2026-01-18T00-30-00/                               │
│  │   │   │   ├── manifest.json                                  │
│  │   │   │   ├── context.db.gz.enc                              │
│  │   │   │   ├── unified_messages.jsonl.gz.enc                  │
│  │   │   │   ├── unified_hashes.json.gz.enc                     │
│  │   │   │   ├── unified_stats.json.gz.enc                      │
│  │   │   │   └── claude-config/                                 │
│  │   │   │       ├── settings.json.gz.enc                       │
│  │   │   │       ├── settings.local.json.gz.enc                 │
│  │   │   │       └── statusline-config.json.gz.enc              │
│  │   │   └── 2026-01-17T12-00-00/                               │
│  │   │       └── ...                                            │
│  │   └── {another_user_uuid}/                                   │
│  │       └── ...                                                │
│  └── {another_tenant_uuid}/                                     │
│      └── ...                                                    │
│                                                                 │
│  Lifecycle Policy:                                              │
│    • Free tier: Delete after 7 days                             │
│    • Pro tier: Delete after 30 days                             │
│    • Enterprise: Custom retention                               │
└─────────────────────────────────────────────────────────────────┘
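
The bucket layout in the diagram can be sketched as a small path builder. This is only an illustration under stated assumptions: the function names are hypothetical, and formatting the timestamp in UTC is an assumption (the diagram shows the 2026-01-18T00-30-00 form but does not name a timezone).

```python
from datetime import datetime, timezone
from uuid import UUID


def backup_prefix(tenant_id: UUID, user_id: UUID, created_at: datetime) -> str:
    """Return the per-backup object prefix: {tenant_uuid}/{user_uuid}/{timestamp}/."""
    if created_at.tzinfo is None:
        raise ValueError("created_at must be timezone-aware")
    # Hyphens replace colons in the time part, matching the directory
    # names shown in the diagram. UTC is an assumption.
    ts = created_at.astimezone(timezone.utc).strftime("%Y-%m-%dT%H-%M-%S")
    return f"{tenant_id}/{user_id}/{ts}/"


def object_name(prefix: str, relative_name: str) -> str:
    """Append the .gz.enc suffix used for compressed, encrypted payload files."""
    return f"{prefix}{relative_name}.gz.enc"
```

Because the tenant UUID is always the first path segment, a per-tenant prefix check is enough to reject any object name that belongs to another tenant.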

API Endpoints

Method  Endpoint                      Description
POST    /api/v1/backup/upload         Upload new backup (multipart)
GET     /api/v1/backup/               List user's backups
GET     /api/v1/backup/{id}           Get backup details
GET     /api/v1/backup/{id}/download  Get signed download URL
DELETE  /api/v1/backup/{id}           Delete backup
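
As a rough illustration of how a client might address these endpoints, the sketch below only builds request objects and sends nothing. The header name X-License-Key and the helper build_request are assumptions; the ADR does not specify how the license key is transmitted.

```python
import urllib.request
from typing import Optional

API_BASE = "https://api.coditect.ai/api/v1/backup"
LICENSE_HEADER = "X-License-Key"  # hypothetical header name, not defined by the ADR


def build_request(
    method: str,
    license_key: str,
    backup_id: Optional[str] = None,
    download: bool = False,
) -> urllib.request.Request:
    """Build (but do not send) a request for one of the backup endpoints."""
    if backup_id is None:
        url = f"{API_BASE}/"                 # GET: list the user's backups
    elif download:
        url = f"{API_BASE}/{backup_id}/download"  # GET: signed download URL
    else:
        url = f"{API_BASE}/{backup_id}"      # GET: details, DELETE: remove
    return urllib.request.Request(
        url, method=method, headers={LICENSE_HEADER: license_key}
    )
```

For example, build_request("GET", key, "bk_abc123", download=True) targets the signed-URL endpoint for backup bk_abc123.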

Client Usage

# Prerequisites: License activated
/license-activate ABC123-XYZ789

# Create backup (auto-detects cloud mode when licensed)
/backup
# Output: Backup created: bk_abc123 (1.2GB compressed)

# List backups
/backup --list
# Output:
#   ID         Created           Size   Retention
#   bk_abc123  2026-01-18 00:30  1.2GB  until 2026-02-17
#   bk_xyz789  2026-01-17 12:00  1.1GB  until 2026-02-16

# Restore latest
/backup --restore latest

# Restore specific backup
/backup --restore bk_xyz789

# Check status
/backup --status
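
The --restore argument accepts either the keyword latest or a backup ID. A minimal sketch of that resolution step follows, assuming the list endpoint returns records with id and created_at fields; the record shape and the function name resolve_backup are hypothetical.

```python
from typing import Dict, List


def resolve_backup(selector: str, backups: List[Dict[str, str]]) -> Dict[str, str]:
    """Map 'latest' or an explicit backup ID to one backup record."""
    if not backups:
        raise LookupError("no backups available")
    if selector == "latest":
        # ISO 8601 timestamps in a single timezone sort chronologically
        # when compared as strings.
        return max(backups, key=lambda b: b["created_at"])
    for backup in backups:
        if backup["id"] == selector:
            return backup
    raise LookupError(f"unknown backup id: {selector}")
```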

Pricing Tiers

Tier        Retention  Max Backups  Max Total Size  Auto-Backup
Free        7 days     3            1 GB            No
Pro         30 days    10           10 GB           Optional
Team        90 days    30           50 GB           Daily
Enterprise  Custom     Unlimited    Unlimited       Configurable
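
The tier table maps naturally onto a quota check and a retention calculation. The sketch below is one way to express it, with several assumptions: 1 GB is taken as 10^9 bytes, a backup that would exceed a limit is rejected rather than rotating out the oldest one, and all names (Tier, TIERS, check_quota, retention_until, QuotaExceeded) are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional

GB = 10**9  # assumption: decimal gigabytes


@dataclass(frozen=True)
class Tier:
    retention_days: Optional[int]   # None means custom retention
    max_backups: Optional[int]      # None means unlimited
    max_total_bytes: Optional[int]  # None means unlimited


TIERS = {
    "free": Tier(7, 3, 1 * GB),
    "pro": Tier(30, 10, 10 * GB),
    "team": Tier(90, 30, 50 * GB),
    "enterprise": Tier(None, None, None),
}


class QuotaExceeded(Exception):
    """Raised when a new backup would exceed the tier's limits."""


def check_quota(tier_name: str, existing_sizes: List[int], new_size: int) -> None:
    """Reject a new backup that would exceed the count or total-size limit."""
    tier = TIERS[tier_name]
    if tier.max_backups is not None and len(existing_sizes) + 1 > tier.max_backups:
        raise QuotaExceeded("maximum number of backups reached")
    if (tier.max_total_bytes is not None
            and sum(existing_sizes) + new_size > tier.max_total_bytes):
        raise QuotaExceeded("maximum total size exceeded")


def retention_until(tier_name: str, created_at: datetime) -> Optional[datetime]:
    """Return the expiry time, or None when retention is custom."""
    days = TIERS[tier_name].retention_days
    return None if days is None else created_at + timedelta(days=days)
```

With these values, a Pro backup created on 2026-01-18 expires on 2026-02-17, which matches the retention dates in the Client Usage listing.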

Security Model

  1. Authentication: License key validated on every request
  2. Authorization: django-multitenant ensures tenant isolation
  3. Encryption:
    • In transit: HTTPS/TLS 1.3
    • At rest: AES-256 with tenant-specific keys
  4. Signed URLs: Download URLs expire after 1 hour
  5. Audit Logging: All backup operations logged with user/IP
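
The ADR does not say how tenant-specific keys are obtained. One possible approach, shown here purely as an assumption, is to derive a 256-bit key per tenant from a service master key with HKDF-SHA256 (RFC 5869); a managed key service would be an equally valid choice. The function name and the info label are hypothetical.

```python
import hashlib
import hmac


def derive_tenant_key(master_key: bytes, tenant_id: str) -> bytes:
    """Derive a 32-byte (AES-256 sized) key for one tenant via HKDF-SHA256."""
    hash_len = hashlib.sha256().digest_size  # 32 bytes
    # HKDF-Extract: with no salt supplied, RFC 5869 uses hash_len zero bytes.
    prk = hmac.new(b"\x00" * hash_len, master_key, hashlib.sha256).digest()
    # HKDF-Expand: a single block T(1) = HMAC(PRK, info || 0x01) already
    # yields the 32 bytes needed, so no further blocks are computed.
    info = b"coditect-backup:" + tenant_id.encode("utf-8")  # hypothetical label
    return hmac.new(prk, info + b"\x01", hashlib.sha256).digest()
```

Derivation is deterministic, so the service can recompute a tenant's key on restore without storing it, at the cost of making the master key a single point of compromise.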

Manifest Format

{
  "version": "1.0",
  "created_at": "2026-01-18T00:30:00Z",
  "client_version": "2.8.0",
  "machine_id": "abc123...",
  "files": [
    {
      "name": "context.db",
      "size_original": 8900000000,
      "size_compressed": 1800000000,
      "checksum_sha256": "abc123...",
      "encrypted": true
    },
    {
      "name": "unified_messages.jsonl",
      "size_original": 1000000000,
      "size_compressed": 210000000,
      "checksum_sha256": "def456...",
      "encrypted": true
    }
  ],
  "total_size_original": 9900000000,
  "total_size_compressed": 2010000000
}
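
A manifest of this shape could be assembled roughly as follows. This is a sketch, not the actual client code: build_manifest is a hypothetical name, the checksum is assumed to cover the original (uncompressed) bytes, and the encrypted flag is passed in because this sketch performs no encryption itself.

```python
import gzip
import hashlib
from datetime import datetime, timezone
from typing import Dict


def build_manifest(
    files: Dict[str, bytes],
    client_version: str,
    machine_id: str,
    created_at: datetime,
    encrypted: bool = True,
) -> dict:
    """Build a version 1.0 manifest for in-memory file contents."""
    entries = []
    for name, data in files.items():
        compressed = gzip.compress(data)
        entries.append({
            "name": name,
            "size_original": len(data),
            "size_compressed": len(compressed),
            # Assumption: the checksum covers the original bytes.
            "checksum_sha256": hashlib.sha256(data).hexdigest(),
            "encrypted": encrypted,
        })
    return {
        "version": "1.0",
        "created_at": created_at.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "client_version": client_version,
        "machine_id": machine_id,
        "files": entries,
        "total_size_original": sum(e["size_original"] for e in entries),
        "total_size_compressed": sum(e["size_compressed"] for e in entries),
    }
```

The two totals are computed from the per-file entries, so they stay consistent with the file list in the same way as the example above (8.9 GB + 1.0 GB = 9.9 GB original).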

Consequences

Positive

  • No GCP Required: Customers back up with just a license key
  • Automatic Isolation: Multi-tenant architecture prevents data leaks
  • Compliance Ready: Audit logs, retention policies, encryption
  • Revenue Opportunity: Tiered storage drives upgrades
  • Simplified UX: /backup just works after license activation

Negative

  • Operational Overhead: CODITECT manages customer backup storage
  • Cost: GCS storage costs scale with customer usage
  • Latency: Cloud-mediated backup is slower than direct GCS upload (extra hop through the API)

Mitigations

Risk                Mitigation
Storage costs       Aggressive compression, retention policies
Large uploads fail  Chunked upload with resume
API availability    Fall back to direct GCS upload if available
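
The "chunked upload with resume" mitigation comes down to splitting the payload into byte ranges and restarting from the last offset the server acknowledged. The sketch below shows only that range calculation; the function name and the idea that the server reports a resume offset are assumptions, since the ADR does not define the upload protocol.

```python
from typing import Iterator, Tuple


def chunk_ranges(
    total_size: int, chunk_size: int, resume_from: int = 0
) -> Iterator[Tuple[int, int]]:
    """Yield (start, end) byte offsets, end exclusive, beginning at resume_from."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= resume_from <= total_size:
        raise ValueError("resume_from must lie within the payload")
    start = resume_from
    while start < total_size:
        end = min(start + chunk_size, total_size)
        yield start, end
        start = end
```

After an interrupted transfer, calling chunk_ranges again with resume_from set to the acknowledged offset yields only the ranges that still need to be sent.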

Alternatives Considered

1. Customer-Provided GCS Bucket

Rejected: Too complex for most customers; enterprise-only feature at best.

2. S3-Compatible Storage (MinIO, etc.)

Rejected: Adds infrastructure complexity; GCS sufficient for now.

3. Peer-to-Peer Sync

Rejected: Security concerns; doesn't solve backup problem.

Implementation Plan

See: FEATURE-CUSTOMER-BACKUP.md

Phase 1: Backend (Week 1)

  • A.11.1-A.11.7: Models, services, views, storage

Phase 2: Infrastructure (Week 1)

  • C.6.1-C.6.4: GCS bucket, IAM, lifecycle

Phase 3: Client (Week 2)

  • A.11.8-A.11.10: Script updates, auto-detection

Phase 4: Testing & Docs (Week 2)

  • E.6.1-E.6.4: Unit, integration, isolation tests
  • F.5.1-F.5.3: Guides, API docs

Decision Date: 2026-01-18
Review Date: 2026-02-01
Author: Claude Opus 4.5 + Hal Casteel