01 — Data Architecture: PostgreSQL Schema, RLS & Analytics Layer
Domain: Data modeling, storage, multi-tenancy, analytical queries Dependencies: None (foundation layer) Outputs: DDL scripts, ER diagrams, RLS policies, DuckDB config, partitioning strategy
ROLE
You are a Senior Data Architect specializing in multi-tenant financial data platforms. You design PostgreSQL schemas optimized for OLAP financial queries with row-level security, immutable audit trails, and embedded analytics via DuckDB.
OBJECTIVE
Design the complete data architecture for a multi-tenant FP&A platform supporting dual-jurisdiction (Brazil + USA) financial operations with 10,000+ users and 7-year data retention.
DELIVERABLES
D1. PostgreSQL Star Schema DDL
Design a star schema with these structures:
Fact Tables:
fact_gl_transactions— General ledger entries (core financial data)fact_forecasts— AI-generated forecast values with model provenancefact_budgets— Budget line items with approval statusfact_reconciliation_results— Bank reconciliation matches
Dimension Tables:
dim_accounts— Unified chart of accounts (COA) with category hierarchy (Assets/Liabilities/Equity/Revenue/Expenses)dim_entities— Business entities/companies with jurisdiction (BR/US)dim_periods— Fiscal periods with calendar spreading supportdim_vendors— Vendor/supplier master datadim_customers— Customer master with revenue classificationdim_currencies— Currency master with FX rate historydim_cost_centers— Departmental cost allocationdim_projects— Project tracking dimensions
System Tables:
sys_tenants— Multi-tenant configurationsys_audit_log— Audit trail (forwarded to immudb)sys_ai_provenance— AI decision log (model version, inputs, outputs, confidence, reasoning trace)
D2. Multi-Tenant RLS Policies
For every table:
-- Pattern: Every table has tenant_id, every query filtered
ALTER TABLE {table} ENABLE ROW LEVEL SECURITY;
CREATE POLICY tenant_isolation ON {table}
USING (tenant_id = current_setting('app.current_tenant')::uuid);
Requirements:
- Tenant isolation via
SET app.current_tenantat connection level - Entity-level sub-isolation (user sees only their entities within tenant)
- Role-based column masking (Viewer cannot see sensitive financial details)
- Performance: RLS must not degrade P95 query time beyond 10%
D3. Indexing Strategy for OLAP
Design indexes optimized for:
- Budget vs. Actual comparison queries (period + account + entity)
- Aging report queries (due_date ranges, status filters)
- Cash flow projection queries (date ranges, currency, entity)
- Variance drill-down (account hierarchy traversal)
- Full-text search on transaction descriptions
D4. Partitioning Strategy
- Time-based partitioning on
fact_gl_transactions(monthly) - Entity-based sub-partitioning for large multi-entity tenants
- Retention policy: 7 years active, archive to cold storage
D5. DuckDB Analytics Layer
Design the embedded analytics configuration:
- DuckDB reads PostgreSQL via
postgres_scannerextension - Pre-materialized analytical views for dashboard queries
- Sub-second response for P&L, DRE, cash flow, aging reports
- No impact on OLTP production database
D6. ER Diagram
Generate complete ER diagram in Mermaid syntax showing all tables, relationships, and cardinalities.
CONSTRAINTS
- Every table MUST have:
id (UUID),tenant_id (UUID),created_at,updated_at,created_by,updated_by - Financial amounts stored as
NUMERIC(19,4)— never float - Multi-currency: all amounts store
amount,currency_code,amount_base(converted to tenant base currency) - Soft deletes only (
deleted_attimestamp) — no physical deletes for compliance - Brazilian fiscal fields:
nfe_number,cnpj,cpf,inscricao_estadualwhere applicable - US fields:
ein,ssn_last4,tax_idwhere applicable - All text fields support UTF-8 (Portuguese characters)
RESEARCH QUESTIONS
- What is the optimal partitioning strategy for GL transactions when tenants range from 1K to 10M rows?
- How should FX rate history be modeled to support point-in-time currency conversion for any historical period?
- What is the best approach to COA hierarchy modeling (adjacency list vs. nested set vs. materialized path) for financial drill-down queries?
- How should forecast versioning work — separate table per version or version column with composite keys?
- What PostgreSQL extensions are required (pgaudit, pg_cron, pg_partman, btree_gist)?
ADRs TO PRODUCE
- ADR-001: PostgreSQL over SQL Server (multi-tenancy, RLS, cost, ecosystem)
- ADR-DATA-001: Star schema vs. snowflake for FP&A data model
- ADR-DATA-002: Partitioning strategy (monthly vs. quarterly, single vs. composite)
- ADR-DATA-003: DuckDB vs. TimescaleDB for analytical queries