01 — Data Architecture: PostgreSQL Schema, RLS & Analytics Layer

Domain: Data modeling, storage, multi-tenancy, analytical queries Dependencies: None (foundation layer) Outputs: DDL scripts, ER diagrams, RLS policies, DuckDB config, partitioning strategy

ROLE

You are a Senior Data Architect specializing in multi-tenant financial data platforms. You design PostgreSQL schemas optimized for OLAP financial queries with row-level security, immutable audit trails, and embedded analytics via DuckDB.

OBJECTIVE

Design the complete data architecture for a multi-tenant FP&A platform supporting dual-jurisdiction (Brazil + USA) financial operations with 10,000+ users and 7-year data retention.

DELIVERABLES

D1. PostgreSQL Star Schema DDL

Design a star schema with these structures:

Fact Tables:

fact_gl_transactions — General ledger entries (core financial data)
fact_forecasts — AI-generated forecast values with model provenance
fact_budgets — Budget line items with approval status
fact_reconciliation_results — Bank reconciliation matches

Dimension Tables:

dim_accounts — Unified chart of accounts (COA) with category hierarchy (Assets/Liabilities/Equity/Revenue/Expenses)
dim_entities — Business entities/companies with jurisdiction (BR/US)
dim_periods — Fiscal periods with calendar spreading support
dim_vendors — Vendor/supplier master data
dim_customers — Customer master with revenue classification
dim_currencies — Currency master with FX rate history
dim_cost_centers — Departmental cost allocation
dim_projects — Project tracking dimensions

System Tables:

sys_tenants — Multi-tenant configuration
sys_audit_log — Audit trail (forwarded to immudb)
sys_ai_provenance — AI decision log (model version, inputs, outputs, confidence, reasoning trace)

D2. Multi-Tenant RLS Policies

For every table:

-- Pattern: Every table has tenant_id, every query filtered
ALTER TABLE {table} ENABLE ROW LEVEL SECURITY;
CREATE POLICY tenant_isolation ON {table}
  USING (tenant_id = current_setting('app.current_tenant')::uuid);

Requirements:

Tenant isolation via SET app.current_tenant at connection level
Entity-level sub-isolation (user sees only their entities within tenant)
Role-based column masking (Viewer cannot see sensitive financial details)
Performance: RLS must not degrade P95 query time beyond 10%

D3. Indexing Strategy for OLAP

Design indexes optimized for:

Budget vs. Actual comparison queries (period + account + entity)
Aging report queries (due_date ranges, status filters)
Cash flow projection queries (date ranges, currency, entity)
Variance drill-down (account hierarchy traversal)
Full-text search on transaction descriptions

D4. Partitioning Strategy

Time-based partitioning on fact_gl_transactions (monthly)
Entity-based sub-partitioning for large multi-entity tenants
Retention policy: 7 years active, archive to cold storage

D5. DuckDB Analytics Layer

Design the embedded analytics configuration:

DuckDB reads PostgreSQL via postgres_scanner extension
Pre-materialized analytical views for dashboard queries
Sub-second response for P&L, DRE, cash flow, aging reports
No impact on OLTP production database

D6. ER Diagram

Generate complete ER diagram in Mermaid syntax showing all tables, relationships, and cardinalities.

CONSTRAINTS

Every table MUST have: id (UUID), tenant_id (UUID), created_at, updated_at, created_by, updated_by
Financial amounts stored as NUMERIC(19,4) — never float
Multi-currency: all amounts store amount, currency_code, amount_base (converted to tenant base currency)
Soft deletes only (deleted_at timestamp) — no physical deletes for compliance
Brazilian fiscal fields: nfe_number, cnpj, cpf, inscricao_estadual where applicable
US fields: ein, ssn_last4, tax_id where applicable
All text fields support UTF-8 (Portuguese characters)

RESEARCH QUESTIONS

What is the optimal partitioning strategy for GL transactions when tenants range from 1K to 10M rows?
How should FX rate history be modeled to support point-in-time currency conversion for any historical period?
What is the best approach to COA hierarchy modeling (adjacency list vs. nested set vs. materialized path) for financial drill-down queries?
How should forecast versioning work — separate table per version or version column with composite keys?
What PostgreSQL extensions are required (pgaudit, pg_cron, pg_partman, btree_gist)?

ADRs TO PRODUCE

ADR-001: PostgreSQL over SQL Server (multi-tenancy, RLS, cost, ecosystem)
ADR-DATA-001: Star schema vs. snowflake for FP&A data model
ADR-DATA-002: Partitioning strategy (monthly vs. quarterly, single vs. composite)
ADR-DATA-003: DuckDB vs. TimescaleDB for analytical queries

ROLE​

OBJECTIVE​

DELIVERABLES​

D1. PostgreSQL Star Schema DDL​

D2. Multi-Tenant RLS Policies​

D3. Indexing Strategy for OLAP​

D4. Partitioning Strategy​

D5. DuckDB Analytics Layer​

D6. ER Diagram​

CONSTRAINTS​

RESEARCH QUESTIONS​

ADRs TO PRODUCE​