ADR-005-v4: Authentication & Authorization Architecture (Part 1: Narrative)
Document: ADR-005-v4-authentication-authorization-part1-narrative
Version: 3.0.0
Purpose: Define comprehensive authentication and authorization strategy for human understanding
Audience: Business stakeholders, developers, security teams
Date Created: 2025-08-30
Date Modified: 2025-09-03
QA Reviewed: Pending
Status: UPDATED_FOR_STATEFULSETS
Supersedes: v2.0.0
Changes: Replaced ephemeral containers with GKE StatefulSets
Table of Contents
- 1. Document Information
- 2. Purpose of this ADR
- 3. User Story Context
- 4. Executive Summary
- 5. Visual Overview
- 6. Background & Problem
- 7. Decision
- 8. Implementation Blueprint
- 9. Testing Strategy
- 10. Security Considerations
- 11. Performance Characteristics
- 12. Operational Considerations
- 13. Migration Strategy
- 14. Consequences
- 15. References & Standards
- 16. Review & Approval
- 17. Appendix
- 18. QA Review Block
1. Document Information 🔴 REQUIRED
| Field | Value |
|---|---|
| ADR Number | ADR-005 |
| Title | Authentication & Authorization Architecture |
| Status | Draft |
| Date Created | 2025-08-30 |
| Last Modified | 2025-09-03 |
| Version | 3.0.0 |
| Decision Makers | CTO, Security Officer, Lead Architect |
| Stakeholders | All CODITECT teams, customers, compliance |
2. Purpose of this ADR 🔴 REQUIRED
This ADR serves dual purposes:
- For Humans 👥: Understand how CODITECT secures access for users, AI agents, and services across multiple tenants
- For AI Agents 🤖: Implement JWT-based authentication with RBAC and comprehensive audit trails
3. User Story Context 🔴 REQUIRED
As a platform user,
I want secure login with proper permissions,
So that I can access only my organization's resources and persistent workspaces safely.
As an organization admin,
I want to manage user roles and permissions,
So that I can control who accesses what within my tenant.
As an AI agent,
I want authenticated API access,
So that I can perform authorized operations on behalf of users.
📋 Acceptance Criteria:
- JWT-based authentication for all actors
- Multi-tenant isolation enforced
- Role-based access control (RBAC)
- OAuth2/SSO integration support
- Complete audit trail of all auth events
- Token refresh without re-authentication
- API rate limiting per user/tenant
4. Executive Summary 🔴 REQUIRED
🏢 For Business Stakeholders
Think of CODITECT's authentication system as a sophisticated office building's security system. Every person gets a personalized key card (a JWT) that records:
- Who they are (identity)
- Which company they work for (tenant)
- Which floors they can access (roles)
- Which rooms they can enter (permissions)
The system automatically tracks every door opened and ensures employees can never enter another company's offices, whether by accident or on purpose.
Business Value:
- Zero data breaches from cross-tenant access
- 90% reduction in permission management overhead
- SOC 2 compliance out-of-the-box
- Enterprise SSO enabling 5x faster onboarding
Key Decision: Implement JWT-based multi-tenant authentication with RBAC and comprehensive auditing.
💻 For Technical Readers
Technical Summary: Stateless JWT authentication with tenant-scoped claims, Argon2id password hashing, OAuth2 SSO integration, and FoundationDB-backed session management with automatic audit logging. The design integrates with GKE StatefulSet workspaces for persistent development environments.
5. Visual Overview 🔴 REQUIRED
5.1 Authentication Flow
5.2 JWT Token Structure
5.3 Multi-Tenant Access Control
6. Background & Problem 🔴 REQUIRED
6.1 Business Context
Why this matters:
- Data Security: One breach can destroy customer trust forever
- Compliance: GDPR, SOC 2, HIPAA require strict access controls
- Enterprise Sales: 80% of enterprise deals require SSO
- Operational Efficiency: Manual permission management doesn't scale
User impact:
- Password fatigue from multiple systems
- Slow onboarding to new projects
- Accidental access to wrong tenant's data
- Frustration with permission requests
Cost of inaction:
- $4.45M average data breach cost
- 6-month sales cycles without SSO
- 20% of IT time on access management
- Failed compliance audits
6.2 Technical Context
Current state in industry:
- Simple session-based auth
- Single-tenant assumptions
- Manual permission management
- No AI agent considerations
- Poor audit trails
Limitations:
- Sessions don't scale horizontally
- Permission checks hit database
- No standard for AI auth
- Audit logs incomplete
- SSO requires custom work
Technical debt:
- Hardcoded permissions
- Mixed auth patterns
- Incomplete tenant isolation
- Missing rate limiting
- No token rotation
6.3 Constraints
| Type | Constraint | Impact |
|---|---|---|
| ⏰ Time | 3-month implementation | Phased rollout required |
| 💰 Budget | Use existing infrastructure | Leverage FoundationDB |
| 👥 Resources | Current security team | Must be maintainable |
| 🔧 Technical | Stateless architecture | JWT-based solution |
| 📜 Compliance | SOC 2, GDPR requirements | Audit everything |
7. Decision 🔴 REQUIRED
7.1 Y-Statement Format
In the context of securing multi-tenant access for humans and AI agents,
facing complex permission requirements and compliance needs,
we decided for JWT-based authentication with embedded tenant claims
and neglected session-based auth and external permission services,
to achieve stateless scalability, complete audit trails, and SSO support,
accepting larger token sizes and token rotation complexity,
because horizontal scalability and compliance are critical for growth.
7.2 What We're Doing
Implementing comprehensive authentication and authorization:
- JWT Token System
  - Stateless authentication
  - Rich claims with tenant context
  - Short-lived with refresh tokens
  - Signed with RS256
- Multi-Actor Support
  - Human users with email/password
  - AI agents with API keys
  - Service accounts for automation
  - OAuth2/SAML for enterprise SSO
- RBAC Implementation
  - Tenant-scoped roles
  - Fine-grained permissions
  - Dynamic permission evaluation
  - Role inheritance
- Audit System
  - Every auth event logged
  - Immutable audit trail
  - Compliance reporting
  - Anomaly detection
7.3 Why This Approach
JWT tokens provide:
- Stateless horizontal scaling
- Complete context in each request
- Standard integration patterns
- Mobile/API friendliness
Multi-tenant claims ensure:
- Absolute tenant isolation
- Fast permission checks
- Audit trail completeness
- Simplified debugging
This approach balances:
- Security vs usability
- Performance vs features
- Compliance vs complexity
- Present needs vs future growth
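As a rough illustration of how tenant claims can enforce isolation, the sketch below rejects any request whose target resource belongs to a different tenant than the one named in the token. The `Claims` and `Resource` types and the `authorize_tenant` helper are hypothetical names chosen for this example; the actual tenant validation layer is described in Part 2.

```python
from dataclasses import dataclass


class CrossTenantAccessError(PermissionError):
    """Raised when a token's tenant does not match the resource's tenant."""


@dataclass(frozen=True)
class Claims:
    # Hypothetical subset of the JWT payload used for the isolation check.
    sub: str
    tenant_id: str


@dataclass(frozen=True)
class Resource:
    resource_id: str
    tenant_id: str


def authorize_tenant(claims: Claims, resource: Resource) -> None:
    """Allow access only when the token and the resource share a tenant."""
    if claims.tenant_id != resource.tenant_id:
        raise CrossTenantAccessError(
            f"subject {claims.sub} may not access resource {resource.resource_id}"
        )
```

Because the tenant comes from the signed token rather than from a request parameter, a check of this shape needs no database lookup.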
7.4 Alternatives Considered 🟡 OPTIONAL
Option A: Session-Based Auth
| Aspect | Details |
|---|---|
| Description | Traditional server-side sessions |
| ✅ Pros | • Simple implementation • Instant revocation • Smaller requests |
| ❌ Cons | • Doesn't scale horizontally • Database hit per request • Complex for APIs |
| Rejection Reason | Incompatible with stateless architecture |
Option B: External Auth Service
| Aspect | Details |
|---|---|
| Description | Dedicated auth microservice (like Auth0) |
| ✅ Pros | • Feature-rich • Maintained by experts • Quick setup |
| ❌ Cons | • Vendor lock-in • Additional latency • Cost at scale • Less control |
| Rejection Reason | Need full control for multi-tenant customization |
8. Implementation Blueprint 🔴 REQUIRED
8.1 Architecture Overview
8.2 Core Components
Authentication Service: Handles login, token generation, and validation. See Part 2 for implementation.
Token Service: Manages JWT creation, refresh, and revocation. See Part 2 for details.
RBAC Engine: Evaluates permissions based on roles and tenant context. Complete implementation in Part 2.
8.3 Configuration
All authentication configuration is managed through:
- Environment variables in GKE StatefulSets
- ConfigMaps for non-sensitive settings
- Kubernetes Secrets for sensitive data
- FoundationDB for dynamic configuration
StatefulSet pods inherit authentication configuration at startup, ensuring consistent security across workspace restarts. See Part 2 for schemas.
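A minimal sketch of reading such configuration once at pod startup might look like the following. The environment variable names (`AUTH_JWT_ISSUER`, `AUTH_ACCESS_TOKEN_TTL_SECONDS`, `AUTH_JWT_PUBLIC_KEY_PATH`) are assumptions made for illustration only; the real schemas are in Part 2. The defaults reuse values that appear elsewhere in this ADR (the 15-minute token lifetime and the `coditect.com` issuer).

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class AuthConfig:
    issuer: str
    access_token_ttl_seconds: int
    public_key_path: str


def load_auth_config(env: Mapping[str, str] = os.environ) -> AuthConfig:
    """Read auth settings once at startup and fail fast if a required one is missing."""
    try:
        # Hypothetical variable: path to a key file mounted from a Kubernetes Secret.
        key_path = env["AUTH_JWT_PUBLIC_KEY_PATH"]
    except KeyError as exc:
        raise RuntimeError("AUTH_JWT_PUBLIC_KEY_PATH must be set") from exc
    return AuthConfig(
        issuer=env.get("AUTH_JWT_ISSUER", "coditect.com"),
        access_token_ttl_seconds=int(env.get("AUTH_ACCESS_TOKEN_TTL_SECONDS", "900")),
        public_key_path=key_path,
    )
```

Failing at startup rather than at first login keeps a misconfigured workspace pod from serving traffic with partial security settings.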
8.4 API Endpoints
| Endpoint | Method | Purpose |
|---|---|---|
| /auth/login | POST | User authentication |
| /auth/refresh | POST | Token refresh |
| /auth/logout | POST | Token revocation |
| /auth/sso/{provider} | GET | SSO initiation |
| /auth/api-key | POST | Generate API key |
8.5 Logging Requirements
All auth events must be logged with:
- Actor identification
- Action performed
- Resource accessed
- Result (success/failure)
- Client information
Detailed logging patterns are in Part 2.
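To make the required fields concrete, here is one possible shape for an audit record, serialized as a single JSON line. The `AuthAuditEvent` type, the `audit_record` helper, and the exact field names are illustrative assumptions, not the format defined in LOGGING-STANDARD-v4.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class AuthAuditEvent:
    timestamp: str
    actor_id: str     # actor identification
    actor_type: str   # e.g. "human"; other values are assumptions
    tenant_id: str
    action: str       # action performed
    resource: str     # resource accessed
    result: str       # "success" or "failure"
    client_ip: str    # client information
    user_agent: str


def audit_record(actor_id: str, actor_type: str, tenant_id: str, action: str,
                 resource: str, success: bool, client_ip: str, user_agent: str,
                 now: Optional[datetime] = None) -> str:
    """Build one JSON audit line containing every required field."""
    when = now or datetime.now(timezone.utc)
    event = AuthAuditEvent(
        timestamp=when.isoformat(),
        actor_id=actor_id,
        actor_type=actor_type,
        tenant_id=tenant_id,
        action=action,
        resource=resource,
        result="success" if success else "failure",
        client_ip=client_ip,
        user_agent=user_agent,
    )
    return json.dumps(asdict(event), sort_keys=True)
```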
8.6 Error Handling
Authentication errors must:
- Never leak sensitive information
- Use consistent error codes
- Include correlation IDs
- Trigger appropriate alerts
See Part 2 for implementation.
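One way to satisfy the first three rules is to keep the client-facing message fixed per error code and send the sensitive detail only to the server log, tied together by a correlation ID. The error codes, messages, and the `to_error_response` helper below are hypothetical; they sketch the pattern rather than the Part 2 implementation.

```python
import logging
import uuid
from enum import Enum
from typing import Optional

logger = logging.getLogger("auth")


class AuthErrorCode(str, Enum):
    # Hypothetical codes; the real catalogue is defined elsewhere.
    INVALID_CREDENTIALS = "AUTH_INVALID_CREDENTIALS"
    TOKEN_EXPIRED = "AUTH_TOKEN_EXPIRED"
    FORBIDDEN = "AUTH_FORBIDDEN"


PUBLIC_MESSAGES = {
    AuthErrorCode.INVALID_CREDENTIALS: "Invalid email or password.",
    AuthErrorCode.TOKEN_EXPIRED: "Session expired. Please sign in again.",
    AuthErrorCode.FORBIDDEN: "You do not have access to this resource.",
}


def to_error_response(code: AuthErrorCode, internal_detail: str,
                      correlation_id: Optional[str] = None) -> dict:
    """Return a client-safe error body; the detail is written to the log only."""
    correlation_id = correlation_id or str(uuid.uuid4())
    logger.warning("auth error %s correlation_id=%s detail=%s",
                   code.value, correlation_id, internal_detail)
    return {
        "error": code.value,
        "message": PUBLIC_MESSAGES[code],
        "correlation_id": correlation_id,
    }
```

Returning the same message for an unknown user and a wrong password avoids revealing which accounts exist, while support staff can still find the cause through the correlation ID.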
9. Testing Strategy 🔴 REQUIRED
9.1 Test Scenarios
- Authentication Tests
  - Valid login flow
  - Invalid credentials
  - Account lockout
  - Password reset
  - SSO integration
- Authorization Tests
  - Role-based access
  - Permission inheritance
  - Cross-tenant isolation
  - API rate limiting
  - Token expiration
- Security Tests
  - Token tampering
  - Replay attacks
  - Privilege escalation
  - Session hijacking
  - Brute force protection
9.2 Performance Tests
| Test | Target | Method |
|---|---|---|
| Login throughput | 1000/sec | Load test |
| Token validation | <5ms | Benchmark |
| Permission check | <10ms | Profile |
| SSO round-trip | <2sec | E2E test |
9.3 Test Coverage Requirements
| Component | Unit | Integration | E2E |
|---|---|---|---|
| Auth Service | ≥90% | ≥80% | ≥70% |
| Token Service | ≥95% | ≥85% | ≥70% |
| RBAC Engine | ≥95% | ≥90% | ≥80% |
| Audit Logger | ≥85% | ≥80% | ≥60% |
10. Security Considerations 🔴 REQUIRED
10.1 Token Security
Protection measures:
- RS256 signing (asymmetric)
- Short expiration (15 minutes)
- Secure refresh tokens
- Token binding to IP
- Automatic rotation
Storage requirements:
- Never store in localStorage
- HttpOnly, Secure cookies
- SameSite protection
- Encrypted at rest
10.2 Password Security
Requirements:
- Argon2id hashing
- Minimum complexity rules
- Breach database checking
- Regular rotation reminders
- No password reuse
10.3 Threat Model
| Threat | Likelihood | Impact | Mitigation |
|---|---|---|---|
| Token theft | Medium | High | Short expiration, binding |
| Brute force | High | Medium | Rate limiting, captcha |
| Privilege escalation | Low | Critical | Audit, permission validation |
| Cross-tenant access | Low | Critical | Tenant validation layer |
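The rate-limiting mitigation for brute force could, for instance, be a sliding-window counter of failed logins keyed by tenant and user. The class name and the thresholds (5 failures in 300 seconds) are assumptions for this sketch; this ADR does not fix those numbers.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class LoginAttemptLimiter:
    """Locks a (tenant, user) pair after too many recent failed logins."""

    def __init__(self, max_failures: int = 5, window_seconds: float = 300.0):
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        self._failures = defaultdict(deque)  # (tenant_id, user_id) -> failure timestamps

    def record_failure(self, tenant_id: str, user_id: str,
                       now: Optional[float] = None) -> None:
        stamp = time.time() if now is None else now
        self._failures[(tenant_id, user_id)].append(stamp)

    def is_locked(self, tenant_id: str, user_id: str,
                  now: Optional[float] = None) -> bool:
        current = time.time() if now is None else now
        attempts = self._failures[(tenant_id, user_id)]
        # Drop failures that have aged out of the window.
        while attempts and attempts[0] <= current - self.window_seconds:
            attempts.popleft()
        return len(attempts) >= self.max_failures
```

This in-memory version only works within one process; with a stateless auth service the counters would need to live in a shared store.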
11. Performance Characteristics 🔴 REQUIRED
11.1 Expected Metrics
| Operation | Target | Actual | Notes |
|---|---|---|---|
| Login | <200ms | TBD | Including password check |
| Token generation | <50ms | TBD | JWT creation |
| Token validation | <5ms | TBD | Signature verification |
| Permission check | <10ms | TBD | RBAC evaluation |
| Audit write | <20ms | TBD | Async to FDB |
11.2 Scalability
Horizontal scaling:
- Stateless auth service
- JWT validation at edge
- Distributed cache for revocation
- Read replicas for permissions
Bottlenecks:
- Password hashing (CPU)
- SSO provider latency
- Audit write throughput
- Token size in headers
12. Operational Considerations 🔴 REQUIRED
12.1 Monitoring
| Metric | Alert Threshold | Action |
|---|---|---|
| Failed logins | >100/min | Check for attack |
| Token errors | >1% | Review JWT config |
| SSO failures | >5% | Check provider |
| Audit lag | >5sec | Scale writers |
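The table above maps directly onto a small rule list. In the sketch below the thresholds and actions come from the table, while the metric key names (`failed_logins_per_min` and so on) and the `triggered_actions` helper are invented for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class AlertRule:
    metric: str       # hypothetical metric key
    threshold: float  # alert fires when the value is strictly greater
    action: str


RULES = [
    AlertRule("failed_logins_per_min", 100, "Check for attack"),
    AlertRule("token_error_rate_pct", 1, "Review JWT config"),
    AlertRule("sso_failure_rate_pct", 5, "Check provider"),
    AlertRule("audit_lag_seconds", 5, "Scale writers"),
]


def triggered_actions(metrics: Dict[str, float]) -> List[str]:
    """Return the action for every rule whose threshold is exceeded."""
    return [rule.action for rule in RULES
            if metrics.get(rule.metric, 0) > rule.threshold]
```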
12.2 Maintenance
Regular tasks:
- Rotate signing keys quarterly
- Review unused permissions
- Archive old audit logs
- Update OAuth2 providers
- Check password breaches
12.3 Emergency Procedures
Token compromise:
- Revoke all tokens for user
- Force password reset
- Review audit logs
- Notify security team
System compromise:
- Rotate all signing keys
- Invalidate all sessions
- Force global re-auth
- Full security audit
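Because JWTs are stateless, "revoke all tokens for user" is commonly handled by recording a cutoff time per user and rejecting any token issued at or before it. The sketch below assumes that approach and uses an in-memory dictionary as a stand-in for the distributed revocation cache; `RevocationRegistry` is a hypothetical name.

```python
import time
from typing import Dict, Optional


class RevocationRegistry:
    """In-memory stand-in for a shared revocation cache."""

    def __init__(self) -> None:
        self._revoked_before: Dict[str, float] = {}

    def revoke_all_for_user(self, user_id: str, now: Optional[float] = None) -> None:
        """Invalidate every token the user holds that was issued up to now."""
        self._revoked_before[user_id] = time.time() if now is None else now

    def is_revoked(self, claims: dict) -> bool:
        """A token is revoked if its iat is at or before the user's cutoff."""
        cutoff = self._revoked_before.get(claims["sub"])
        return cutoff is not None and claims["iat"] <= cutoff
```

Tokens issued after the forced password reset carry a later `iat` and pass the check, so the user can sign in again without clearing the registry entry.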
13. Migration Strategy 🔴 REQUIRED
13.1 Phase 1: Foundation (Month 1)
- Deploy auth service
- Implement JWT generation
- Setup token validation
- Create audit pipeline
13.2 Phase 2: Migration (Month 2)
- Migrate existing users
- Convert permissions
- Enable SSO providers
- Update all services
13.3 Phase 3: Deprecation (Month 3)
- Remove old auth code
- Archive legacy data
- Update documentation
- Security audit
13.4 Rollback Plan
If issues arise:
- Keep old auth running
- Dual auth period
- Gradual migration
- Quick rollback switch
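A dual-auth period with a rollback switch can be modeled as a single mode flag consulted on every request. The `AuthMode` values and the `authenticate` wrapper below are assumptions for illustration; each authenticator is any callable that returns a user ID or `None`.

```python
from enum import Enum
from typing import Callable, Optional

# An authenticator maps a credential string to a user ID, or None on failure.
Authenticator = Callable[[str], Optional[str]]


class AuthMode(Enum):
    LEGACY_ONLY = "legacy_only"  # rollback position
    DUAL = "dual"                # migration period: try new, fall back to old
    NEW_ONLY = "new_only"        # after deprecation


def authenticate(credential: str, mode: AuthMode,
                 new_auth: Authenticator,
                 legacy_auth: Authenticator) -> Optional[str]:
    """Route a credential to the new and/or legacy authenticator by mode."""
    if mode is AuthMode.LEGACY_ONLY:
        return legacy_auth(credential)
    user_id = new_auth(credential)
    if user_id is None and mode is AuthMode.DUAL:
        user_id = legacy_auth(credential)
    return user_id
```

Flipping the mode to `LEGACY_ONLY` acts as the quick rollback switch; moving it to `NEW_ONLY` corresponds to Phase 3.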
14. Consequences 🔴 REQUIRED
14.1 Positive Outcomes
✅ Security improvements:
- Zero cross-tenant breaches
- Complete audit trail
- Faster incident response
- Reduced attack surface
✅ Business benefits:
- Enterprise SSO capability
- SOC 2 compliance ready
- 5x faster onboarding
- Reduced support tickets
✅ Technical advantages:
- Horizontal scalability
- Stateless architecture
- Standard JWT integration
- API-first design
14.2 Negative Impacts
⚠️ Increased complexity:
- Token rotation logic
- Key management overhead
- Debugging JWT issues
- Larger request headers
⚠️ Operational overhead:
- Key rotation process
- Token size monitoring
- Audit log management
- SSO provider updates
⚠️ Migration effort:
- User data conversion
- Service updates required
- Client library changes
- Testing overhead
15. References & Standards 🔴 REQUIRED
15.1 Related ADRs
- ADR-003-v4: Multi-tenant architecture
- ADR-008-v4: Monitoring auth events
- ADR-011-v4: Audit requirements
- LOGGING-STANDARD-v4: Logging patterns
15.2 External Standards
- OAuth 2.0: Authorization framework
- OpenID Connect: Identity layer
- JWT RFC 7519: Token standard
- OWASP Auth: Security guidelines
15.3 Best Practices
- NIST 800-63B: Authentication guidelines
- SANS Password Policy: Password standards
- Zero Trust Architecture: Security model
16. Review & Approval 🔴 REQUIRED
Approval Signatures
| Role | Name | Date | Signature |
|---|---|---|---|
| CTO | _______ | _______ | ___________ |
| Security Officer | _______ | _______ | ___________ |
| Lead Architect | _______ | _______ | ___________ |
| Compliance Manager | _______ | _______ | ___________ |
Review History
| Version | Date | Reviewer | Status | Comments |
|---|---|---|---|---|
| 1.0.0 | 2025-08-30 | Initial | DRAFT | Original version |
| 2.0.0 | 2025-09-01 | SESSION4 | DRAFT | Complete rewrite to v4.2 |
Approval Workflow
17. Appendix
17.1 Glossary
| Term | Definition |
|---|---|
| JWT | JSON Web Token - Self-contained token with claims |
| RBAC | Role-Based Access Control - Permission model |
| SSO | Single Sign-On - One login for multiple systems |
| OAuth2 | Authorization framework for delegated access |
| SAML | Security Assertion Markup Language for SSO |
| MFA | Multi-Factor Authentication - Multiple proofs of identity |
| Argon2 | Password hashing algorithm; winner of the Password Hashing Competition (PHC) |
| Claims | Statements about an entity in JWT |
| Tenant | Isolated customer organization |
| IdP | Identity Provider for SSO |
17.2 Permission Model Example
Organization: ACME Corp (tenant_id: 123)
├── Admin Role
│   ├── user:*
│   ├── project:*
│   └── billing:*
├── Developer Role
│   ├── project:read
│   ├── project:write
│   └── code:*
└── Viewer Role
    ├── project:read
    └── dashboard:read
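The role tree above can be evaluated with a few lines of matching logic. This sketch assumes permissions are `resource:action` strings and that `*` in the action position grants every action on that resource; the `ROLE_PERMISSIONS` mapping mirrors the ACME example, and `is_allowed` is a hypothetical helper, not the Part 2 RBAC engine.

```python
from typing import Iterable

# Mirrors the ACME Corp example; lowercase role keys are an assumption.
ROLE_PERMISSIONS = {
    "admin": ["user:*", "project:*", "billing:*"],
    "developer": ["project:read", "project:write", "code:*"],
    "viewer": ["project:read", "dashboard:read"],
}


def _matches(granted: str, required: str) -> bool:
    """True if a granted permission covers the required one."""
    granted_resource, _, granted_action = granted.partition(":")
    required_resource, _, required_action = required.partition(":")
    return (granted_resource == required_resource
            and granted_action in ("*", required_action))


def is_allowed(roles: Iterable[str], required: str) -> bool:
    """Check whether any of the caller's roles grants the required permission."""
    return any(
        _matches(granted, required)
        for role in roles
        for granted in ROLE_PERMISSIONS.get(role, [])
    )
```

Unknown roles grant nothing, so a typo in a role name fails closed.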
17.3 Token Example
{
  "header": {
    "alg": "RS256",
    "typ": "JWT"
  },
  "payload": {
    "sub": "user-123",
    "user_id": "550e8400-e29b-41d4-a716-446655440000",
    "tenant_id": "660e8400-e29b-41d4-a716-446655440001",
    "email": "user@example.com",
    "roles": ["developer"],
    "permissions": ["project:read", "project:write"],
    "actor_type": "human",
    "exp": 1693523700,
    "iat": 1693522800,
    "iss": "coditect.com"
  }
}
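After signature verification, a service would typically also check the shape of the payload. The sketch below assumes which claims are mandatory and enforces the 15-minute lifetime from section 10.1; `REQUIRED_CLAIMS` and `validate_claims` are illustrative names.

```python
from typing import List

# Assumed set of mandatory claims, taken from the fields in the token example.
REQUIRED_CLAIMS = ("sub", "tenant_id", "roles", "actor_type", "exp", "iat", "iss")
MAX_TTL_SECONDS = 15 * 60  # short expiration per section 10.1


def validate_claims(payload: dict, expected_issuer: str = "coditect.com") -> List[str]:
    """Return a list of problems; an empty list means the payload is acceptable."""
    problems = [f"missing claim: {name}" for name in REQUIRED_CLAIMS
                if name not in payload]
    if problems:
        return problems
    if payload["iss"] != expected_issuer:
        problems.append("unexpected issuer")
    if payload["exp"] - payload["iat"] > MAX_TTL_SECONDS:
        problems.append(f"lifetime exceeds {MAX_TTL_SECONDS} seconds")
    return problems
```

Returning every problem at once, rather than raising on the first, makes rejected tokens easier to diagnose from the audit log.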
18. QA Review Block
Status: AWAITING INDEPENDENT QA REVIEW
This section will be completed by an independent QA reviewer (not the author) according to ADR-QA-REVIEW-GUIDE-v4.2.
Document ready for review as of: 2025-09-01
Version ready for review: 2.0.0
Next: See Part 2: Technical Implementation for complete implementation details.