ADR-024: FastAPI Web Framework Selection

Status: Accepted
Date: 2025-12-28
Deciders: Hal Casteel
Categories: Architecture, Backend, API

Context

The CODITECT Document Management System requires a robust API layer to expose document management, semantic search, and analytics capabilities. The API must satisfy the requirements and constraints below.

Requirements

High-Performance Vector Search: Sub-100ms p95 latency for semantic search queries
Async Database Operations: Non-blocking PostgreSQL (pgvector) and Redis operations
Real-Time Embedding Generation: Concurrent calls to OpenAI/Anthropic embedding APIs
Multi-Tenant Isolation: Secure tenant separation with proper authentication
API Documentation: Auto-generated OpenAPI/Swagger documentation
Type Safety: Strong typing for request/response validation
Production Readiness: Battle-tested for enterprise deployment

Constraints

Must integrate with existing Python ML/AI ecosystem (embeddings, vector operations)
Must support async patterns for I/O-bound operations
Must provide automatic request validation
Must generate API documentation without manual maintenance
Must have proven production track record at scale

Decision

We will use FastAPI as the primary web framework for the CODITECT Document Management System API because it provides the optimal combination of async performance, type safety, automatic documentation, and Python ecosystem integration required for our semantic search and document management use cases.

Alternatives Considered

1. Django REST Framework (DRF)

Pros:

Mature, battle-tested framework
Excellent admin interface
Strong ORM integration
Large community and ecosystem

Cons:

Sync-first architecture (async support added later, not native)
Heavier footprint for API-only services
ORM doesn't natively support pgvector
Slower request processing compared to ASGI frameworks

Rejection Reason: Our use case requires async I/O for vector search, embedding APIs, and Redis operations. Django's sync-first architecture would require additional complexity for proper async handling.

2. Flask

Pros:

Lightweight and flexible
Simple to get started
Large extension ecosystem

Cons:

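Limited async support: async views (Flask 2.0+, via the flask[async] extra) still run under WSGI, with each request occupying a worker
No built-in request validation
No automatic API documentation
Manual type checking required

Rejection Reason: Async support is limited by Flask's WSGI foundation, and it lacks built-in validation and automatic OpenAPI generation, which are critical for our requirements.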

3. Starlette (FastAPI's Foundation)

Pros:

Maximum control and flexibility
Minimal overhead
Pure ASGI implementation

Cons:

No built-in request/response validation
No automatic OpenAPI generation
Requires more boilerplate code

Rejection Reason: Would require reimplementing features that FastAPI provides out of the box (Pydantic validation, OpenAPI generation).

4. gRPC (Python)

Pros:

Binary protocol (efficient)
Strong typing via Protocol Buffers
Excellent for microservices communication

Cons:

No browser client support without gateway
More complex tooling
Protocol Buffer management overhead

Rejection Reason: Our API needs to be consumable by web browsers and third-party integrations, which requires REST/HTTP. gRPC may be considered for internal service-to-service communication in future.

5. Go Frameworks (Gin, Echo, Fiber)

Pros:

Superior raw performance
Lower memory footprint
Excellent concurrency model

Cons:

Separate language from ML/AI code
No native pgvector integration
Would require polyglot architecture

Rejection Reason: Our embedding and vector operations are Python-native. Introducing Go would create language boundaries and complicate deployment.

Rationale

Why FastAPI?

1. Native Async Support

FastAPI is built on Starlette (ASGI) with first-class async support:

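from fastapi import FastAPI

app = FastAPI()

# embedding_service and search_service are async application services defined elsewhere
@app.get("/documents/search")
async def search_documents(query: str, top_k: int = 10):
    # Each await yields to the event loop, so other requests proceed during I/O
    embedding = await embedding_service.embed_text(query)
    results = await search_service.vector_search(embedding, top_k)
    return results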

This is critical for:

pgvector similarity searches
Redis cache operations
OpenAI/Anthropic embedding API calls
Concurrent request handling
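As a rough illustration of why concurrent I/O matters for embedding calls, the sketch below fans out several simulated requests with asyncio.gather. The fake_embed function is a hypothetical stand-in for a remote embedding API, not a real SDK call, and the 50 ms delay is an assumed latency.

```python
import asyncio
import time


async def fake_embed(text: str) -> list[float]:
    """Hypothetical stand-in for a remote embedding call."""
    await asyncio.sleep(0.05)  # simulated network latency (assumed value)
    return [float(len(text))]


async def embed_many(texts: list[str]) -> list[list[float]]:
    # gather runs all coroutines concurrently and preserves input order
    return await asyncio.gather(*(fake_embed(t) for t in texts))


def timed_run(texts: list[str]) -> tuple[list[list[float]], float]:
    """Embed all texts and report the wall-clock time taken."""
    start = time.perf_counter()
    vectors = asyncio.run(embed_many(texts))
    return vectors, time.perf_counter() - start
```

Ten texts awaited one after another would take about 0.5 s under these assumptions. With gather, the total stays close to the latency of a single call.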

2. Automatic Request Validation (Pydantic)

Type-safe request/response handling with automatic validation:

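from uuid import UUID

from pydantic import BaseModel, Field

class SearchRequest(BaseModel):
    query: str = Field(..., min_length=1, max_length=10000)  # reject empty or oversized queries
    top_k: int = Field(default=10, ge=1, le=100)
    tenant_id: UUID
    min_score: float = Field(default=0.0, ge=0.0, le=1.0)

# SearchResponse is a Pydantic model defined alongside SearchRequest (not shown)
@app.post("/documents/search")
async def search(request: SearchRequest) -> SearchResponse:
    # FastAPI has already validated the body; invalid input receives a 422 response
    return await search_service.search(request)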
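To make the validation behavior concrete, the following sketch exercises the same SearchRequest constraints outside of FastAPI. The invalid_fields helper is hypothetical and exists only for illustration. Inside a route, FastAPI performs this check itself and returns a 422 response.

```python
from uuid import UUID

from pydantic import BaseModel, Field, ValidationError


class SearchRequest(BaseModel):
    query: str = Field(..., min_length=1, max_length=10000)
    top_k: int = Field(default=10, ge=1, le=100)
    tenant_id: UUID
    min_score: float = Field(default=0.0, ge=0.0, le=1.0)


def invalid_fields(payload: dict) -> list[str]:
    """Return the names of fields that fail validation (empty if valid)."""
    try:
        SearchRequest(**payload)
    except ValidationError as exc:
        # Each error carries a "loc" tuple whose first item is the field name
        return sorted(str(err["loc"][0]) for err in exc.errors())
    return []
```

A payload with an empty query, an out-of-range top_k, and a malformed tenant_id is rejected on all three fields at once, so clients see every problem in a single response.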

3. Automatic OpenAPI Documentation

Zero-configuration API documentation:

Swagger UI at /docs
ReDoc at /redoc
OpenAPI JSON at /openapi.json

This eliminates documentation drift and enables:

Client SDK generation
API testing tools integration
Developer self-service

4. Performance Benchmarks

FastAPI consistently ranks among the fastest Python frameworks:

Framework	Requests/sec	Latency (p95)
FastAPI	15,000+	<10ms
Django REST	2,000-5,000	30-50ms
Flask	3,000-6,000	20-40ms

Benchmarks for simple JSON endpoints. Source: TechEmpower Framework Benchmarks
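Published throughput figures do not by themselves show whether the sub-100ms p95 target is met for our own endpoints. One way to check it against measured latencies is sketched below using only the standard library. The function names and the sample data in the usage note are illustrative assumptions.

```python
import statistics


def p95(latencies_ms: list[float]) -> float:
    """95th percentile of the samples.

    quantiles(n=100) returns 99 cut points; index 94 is the 95th percentile.
    """
    return statistics.quantiles(latencies_ms, n=100, method="inclusive")[94]


def meets_target(latencies_ms: list[float], target_ms: float = 100.0) -> bool:
    """True when the p95 latency is strictly below the target."""
    return p95(latencies_ms) < target_ms
```

For example, a sample in which 10% of requests take 250 ms fails the target even though the median is far below 100 ms.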

5. Production Track Record

FastAPI is used in production by:

Microsoft - Various internal services
Netflix - Dispatch, its open-source crisis management framework
Uber - REST prediction server for Ludwig, its open-source ML toolbox
Stripe - Internal tooling

This adoption indicates that the framework is suitable for enterprise-scale workloads.

6. Python Ecosystem Integration

Seamless integration with our stack:

SQLAlchemy 2.0 - Async ORM support
asyncpg - Native async PostgreSQL driver
pgvector - Vector similarity operations
Redis - Async client (redis.asyncio in redis-py, which absorbed the former aioredis package)
Pydantic V2 - Data validation and settings
OpenAI/Anthropic SDKs - Async embedding generation

Consequences

Positive

Performance: Async I/O enables high concurrency for database and API operations
Developer Experience: Type hints + auto-docs reduce development time
Type Safety: Pydantic validation catches errors at request time
Documentation: API docs always in sync with code
Testing: Easy to test with TestClient and pytest-asyncio
Ecosystem: Strong integration with Python ML/AI libraries

Negative

Learning Curve: Developers unfamiliar with async patterns need training
Debugging Complexity: Async stack traces can be harder to follow
Dependency on Pydantic: Tight coupling to Pydantic for validation

Mitigations

Async Training: Document async patterns in developer guides
Structured Logging: Implement correlation IDs for request tracing
Pydantic Abstraction: Use interface layer to minimize coupling
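The correlation-ID mitigation can be sketched with the standard library alone. The example below assumes a context variable that holds the ID and a logging filter that stamps it on each record. The names correlation_id, CorrelationIdFilter, and handle_request are hypothetical, and a real service would set the ID in middleware.

```python
import asyncio
import contextvars
import logging
import uuid

# Each asyncio task works on its own copy of the context, so IDs do not leak
correlation_id = contextvars.ContextVar("correlation_id", default="-")


class CorrelationIdFilter(logging.Filter):
    """Attach the current correlation ID to every log record."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.correlation_id = correlation_id.get()
        return True


logger = logging.getLogger("api")
logger.setLevel(logging.INFO)
handler = logging.StreamHandler()
handler.addFilter(CorrelationIdFilter())
handler.setFormatter(logging.Formatter("%(correlation_id)s %(levelname)s %(message)s"))
logger.addHandler(handler)


async def handle_request() -> tuple[str, str]:
    """Simulated request: bind an ID, yield to other tasks, then read it back."""
    cid = uuid.uuid4().hex
    correlation_id.set(cid)
    await asyncio.sleep(0)  # let the other simulated request run in between
    logger.info("request handled")
    return cid, correlation_id.get()


async def demo() -> list[tuple[str, str]]:
    # gather wraps each coroutine in its own task, and so in its own context
    return await asyncio.gather(handle_request(), handle_request())
```

Because every log line carries the ID of the request that produced it, interleaved async output can be regrouped per request when debugging.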

Implementation Guidelines

Project Structure

src/backend/
├── api/
│   ├── __init__.py
│   ├── main.py              # FastAPI app instance
│   ├── dependencies.py      # Dependency injection
│   ├── middleware.py        # Custom middleware
│   └── routes/
│       ├── documents.py     # Document CRUD
│       ├── search.py        # Semantic search
│       ├── analytics.py     # Metrics/analytics
│       └── auth.py          # Authentication
├── schemas/                  # Pydantic models
│   ├── documents.py
│   ├── search.py
│   └── common.py
├── services/                 # Business logic
└── models/                   # SQLAlchemy models

Standard Patterns

Dependency Injection

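from collections.abc import AsyncGenerator

from fastapi import Depends
from sqlalchemy.ext.asyncio import AsyncSession

# async_session, oauth2_scheme, Tenant, and the service objects are defined elsewhere
async def get_db() -> AsyncGenerator[AsyncSession, None]:
    # One session per request, closed when the request finishes
    async with async_session() as session:
        yield session

async def get_current_tenant(
    token: str = Depends(oauth2_scheme),
    db: AsyncSession = Depends(get_db),
) -> Tenant:
    return await auth_service.validate_token(token, db)

# get_db is resolved once per request, so both dependencies share the same session
@app.get("/documents")
async def list_documents(
    tenant: Tenant = Depends(get_current_tenant),
    db: AsyncSession = Depends(get_db),
):
    return await document_service.list(tenant.id, db)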

Error Handling

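from uuid import UUID

from fastapi import HTTPException, Request
from fastapi.encoders import jsonable_encoder
from fastapi.exceptions import RequestValidationError
from fastapi.responses import JSONResponse

class DocumentNotFoundError(HTTPException):
    def __init__(self, doc_id: UUID):
        super().__init__(
            status_code=404,
            detail=f"Document {doc_id} not found",
        )

# FastAPI raises RequestValidationError (not pydantic.ValidationError) for invalid requests
@app.exception_handler(RequestValidationError)
async def validation_exception_handler(request: Request, exc: RequestValidationError):
    return JSONResponse(
        status_code=422,
        # jsonable_encoder converts non-JSON-native values inside the error details
        content={"detail": jsonable_encoder(exc.errors())},
    )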

Compliance

Security Requirements

JWT authentication via python-jose
CORS configuration for allowed origins
Rate limiting via middleware
Request size limits
Security headers (HSTS, CSP, etc.)
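Rate limiting in middleware could be built on a token bucket, sketched below under stated assumptions. The TokenBucket class is hypothetical, keeps its state in process memory, and would need a shared store (Redis, for instance) to work across multiple workers.

```python
import time


class TokenBucket:
    """Allow `rate` requests per second with bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: int, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock  # injectable so tests can control time
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self) -> bool:
        """Consume one token if available; return False when the caller should be throttled."""
        now = self.clock()
        # Refill in proportion to elapsed time, capped at capacity
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A middleware would typically keep one bucket per tenant or API key and respond with HTTP 429 whenever allow() returns False.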

Performance Requirements

Connection pooling for PostgreSQL
Redis connection pooling
Response compression (gzip)
Cache headers for static responses

Monitoring Requirements

Prometheus metrics endpoint (/metrics)
Health check endpoint (/health)
OpenTelemetry tracing integration

Revision History

Version	Date	Author	Changes
1.0.0	2025-12-28	Hal Casteel	Initial ADR creation
