
Research Prompts: Tier 1 (Highest Priority)

Overview

These prompts are designed for deep research with reasoning models (e.g., Claude Opus, DeepSeek-R1, OpenAI o1). Each prompt is self-contained, focused, and optimized for a 15-30 minute research session.

Execution Strategy: Run each prompt independently. Synthesize outputs into CODITECT technical specifications.


PROMPT 1: LangGraph Agent Architecture for Financial Workflows

Context

CODITECT needs a robust agent orchestration framework for deterministic financial workflows that require compliance checkpoints and audit trails.

Research Objective

Design a production-ready LangGraph architecture for FP&A variance analysis with built-in compliance gates.

Prompt

You are a Principal AI Engineer specializing in agentic systems for regulated financial environments. 

**TASK**: Design a complete LangGraph workflow architecture for automated Budget-vs-Actual (BvA) variance analysis.

**REQUIREMENTS**:

1. **Workflow Nodes**:
- Data Retrieval: Fetch actuals from PostgreSQL GL tables
- Budget Comparison: Compare against approved budget by dimension (department/product/region)
- Variance Detection: Identify variances exceeding configurable thresholds (% and $)
- Root Cause Analysis: Use LLM to explain variance with transaction-level drill-down
- Narrative Generation: Produce CFO-ready variance memo
- Alert Dispatch: Send notifications via Apprise (Slack/email)

2. **Compliance Checkpoints** (mandatory):
- Human approval required for variance explanations affecting regulated entities
- Audit log entry for every state transition
- Model version tracking for all LLM calls
- Confidence score threshold (reject if <0.85)

3. **Technical Constraints**:
- Use LangGraph's StateGraph with TypedDict state schema
- Implement conditional edges for threshold-based routing
- Support parallel node execution for multi-entity analysis
- Include retry logic with exponential backoff
- Token budget awareness (stop at 95% of allocation)

4. **Deliverables**:
- Complete Python implementation with type hints
- State schema definition with validation
- Mermaid diagram of workflow graph
- OpenAPI 3.0 spec for external API endpoints
- Unit test scaffolding with pytest fixtures

**OUTPUT FORMAT**: Production-ready code with inline documentation. Include ADR (Architecture Decision Record) explaining key design choices.

**CONSTRAINTS**:
- Only use Apache 2.0 or MIT licensed libraries
- Must support local LLM deployment (Ollama/vLLM) as alternative to API
- All AI outputs must be logged to immudb for audit compliance
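As a concrete starting point, the state schema, the <0.85 confidence gate, and the retry requirement above can be sketched in plain Python. Field names and the 5% / $10k materiality thresholds are illustrative assumptions; in the real implementation the router function would be wired into LangGraph's conditional edges rather than called directly:

```python
from typing import Literal, TypedDict
import random
import time

class BvAState(TypedDict):
    """Workflow state passed between nodes (hypothetical field names)."""
    entity_id: str
    variance_pct: float
    variance_usd: float
    confidence: float
    approved: bool

def route_after_detection(state: BvAState) -> Literal["root_cause", "dispatch", "human_review"]:
    """Conditional-edge router: maps the current state to the next node's name."""
    if state["confidence"] < 0.85:
        return "human_review"  # compliance gate: low-confidence output is rejected
    if abs(state["variance_pct"]) >= 5.0 or abs(state["variance_usd"]) >= 10_000:
        return "root_cause"    # material variance: drill down to transactions
    return "dispatch"          # immaterial variance: notify and move on

def with_backoff(fn, retries: int = 3, base: float = 0.5):
    """Retry helper with exponential backoff and a small jitter."""
    for attempt in range(retries):
        try:
            return fn()
        except Exception:
            if attempt == retries - 1:
                raise
            time.sleep(base * 2 ** attempt + random.random() * 0.1)
```

The same router signature works as the second argument to LangGraph's `add_conditional_edges`, keeping the compliance gate testable in isolation from the graph.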

Expected Output

  • 500-800 lines of production Python code
  • Mermaid workflow diagram
  • OpenAPI specification
  • ADR document

CODITECT Application

This becomes the core workflow engine template for all CODITECT agent orchestrations.


PROMPT 2: OpenFGA Authorization Model for Multi-Tenant Financial Data

Context

CODITECT requires a relationship-based authorization system that supports complex financial hierarchies (tenant → entity → department → cost center).

Research Objective

Design an OpenFGA authorization model for multi-tenant FP&A with separation of duties enforcement.

Prompt

You are a Security Architect specializing in policy-as-code authorization for financial systems requiring SOX compliance.

**TASK**: Design a complete OpenFGA authorization model for multi-tenant FP&A data access control.

**REQUIREMENTS**:

1. **Authorization Model (DSL)**:
- Organization hierarchy: tenant → entity → department → cost_center
- Role definitions: CFO, Controller, FP&A Analyst, Viewer, AI Service Account
- Separation of duties: Analyst cannot approve, Approver cannot edit
- Just-in-Time access for external auditors (time-bounded)
- Cross-entity permissions for consolidation users

2. **Specific Authorization Rules**:
- User can VIEW budget if: member of owning entity OR has cross-entity role
- User can EDIT budget if: member AND role in (FPA_ANALYST, CONTROLLER) AND budget.status != 'approved'
- User can APPROVE budget if: role = CFO AND budget.amount > $50k AND user.entity = budget.entity
- AI agent can READ GL if: service_account AND scope includes 'gl:read'
- AI agent can WRITE forecast if: service_account AND scope includes 'forecast:write' AND NOT 'gl:delete'

3. **Compliance Requirements**:
- All permission checks logged with timestamp, user, resource, action, result
- Quarterly permission review automation (detect stale access)
- Emergency access workflow with time-limited escalation
- Audit export in SOC 2 format

4. **Deliverables**:
- OpenFGA authorization model DSL
- Relationship tuple examples for test scenarios
- FastAPI middleware implementation (Python)
- Permission check API with batch support
- Integration tests covering all authorization rules
- Migration script from .NET Identity/Claims-based auth

**OUTPUT FORMAT**: Complete implementation files with comprehensive test coverage.

**CONSTRAINTS**:
- Must support 10k+ users with <50ms p99 permission check latency
- All policies version-controlled in Git
- Support for "dry-run" mode to test policy changes before deployment
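Before encoding the rules above in the OpenFGA DSL, they can be restated as plain-Python predicates to sanity-check the logic (role names come from the rules; the budget field names are hypothetical):

```python
def can_edit_budget(user_roles: list[str], user_entity: str, budget: dict) -> bool:
    """EDIT rule: member of the owning entity, analyst/controller role, budget not yet approved."""
    return (
        user_entity == budget["entity"]
        and bool({"FPA_ANALYST", "CONTROLLER"} & set(user_roles))
        and budget["status"] != "approved"
    )

def can_approve_budget(user_roles: list[str], user_entity: str, budget: dict) -> bool:
    """APPROVE rule: CFO only, amount above $50k, same entity.
    Separation of duties falls out of the role sets: an FPA_ANALYST
    satisfies can_edit_budget but never can_approve_budget."""
    return (
        "CFO" in user_roles
        and budget["amount"] > 50_000
        and user_entity == budget["entity"]
    )
```

These predicates double as integration-test oracles: the OpenFGA model is correct when its check API agrees with them on every tuple in the test scenarios.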

Expected Output

  • OpenFGA DSL authorization model
  • Python middleware code
  • Test scenarios covering edge cases
  • Migration scripts

CODITECT Application

Replaces custom RBAC across all CODITECT deployments; provides SOC 2-compliant audit trail.


PROMPT 3: Cryptographic Audit Trail with immudb Integration

Context

FDA 21 CFR Part 11 and SOX require tamper-evident audit trails with cryptographic verification capability.

Research Objective

Design an immudb integration layer for CODITECT that provides cryptographic proof of non-tampering.

Prompt

You are a Compliance Engineer specializing in cryptographic audit systems for FDA and SOX regulated environments.

**TASK**: Design a complete immudb integration for tamper-evident audit trails with PostgreSQL synchronization.

**REQUIREMENTS**:

1. **Audit Event Schema**:
- audit_id (ULID for time-ordered uniqueness)
- timestamp (microsecond precision)
- user_id / service_account_id
- action (CREATE/READ/UPDATE/DELETE/APPROVE/REJECT)
- resource_type (budget/forecast/gl_entry/report)
- resource_id
- previous_state (JSONB hash)
- new_state (JSONB hash)
- ai_model_version (if AI-generated)
- reasoning_trace (if AI-generated)
- ip_address / user_agent
- verification_hash (Merkle root link)

2. **Integration Architecture**:
- PostgreSQL trigger captures all changes to regulated tables
- pgaudit extension logs session-level operations
- Event forwarder pushes to immudb within 100ms
- immudb provides cryptographic verification API
- DuckDB materialized view for analytics queries

3. **Verification Capabilities**:
- Third-party auditor can verify any audit entry independently
- Merkle tree proof exportable as PDF for SOX auditors
- Alert on any detected tampering attempt
- Full audit trail reconstruction from any point in time

4. **Retention & Performance**:
- 7-year retention (SOX requirement)
- 100k+ events/day ingestion capacity
- <100ms verification latency
- Compressed cold storage for events >1 year old

5. **Deliverables**:
- PostgreSQL trigger functions (PL/pgSQL)
- Python immudb client wrapper with retry logic
- Verification API (FastAPI)
- DuckDB analytics queries for compliance reports
- Docker Compose with all components
- Integration test suite with tampering simulation

**OUTPUT FORMAT**: Production deployment package with comprehensive documentation.

**CONSTRAINTS**:
- Must work in air-gapped environments (no external API calls)
- Support for multi-tenant isolation (tenant cannot access other tenants' audit logs)
- Encryption at rest and in transit
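immudb provides Merkle-tree verification natively; the idea behind the `verification_hash` field in the schema above can be illustrated with a stdlib-only hash chain (this is not the immudb API, just the tamper-evidence principle):

```python
import hashlib
import json

GENESIS = "0" * 64  # sentinel previous-hash for the first entry

def entry_hash(entry: dict, prev_hash: str) -> str:
    """Hash the canonical JSON of an entry together with the previous
    entry's hash, forming a chain: altering any past entry breaks
    every later verification hash."""
    payload = json.dumps(entry, sort_keys=True) + prev_hash
    return hashlib.sha256(payload.encode()).hexdigest()

def append(log: list, entry: dict) -> None:
    """Append an entry, linking it to the tail of the chain."""
    prev = log[-1]["verification_hash"] if log else GENESIS
    log.append({**entry, "verification_hash": entry_hash(entry, prev)})

def verify(log: list) -> bool:
    """Recompute the chain; False means the log was tampered with."""
    prev = GENESIS
    for rec in log:
        body = {k: v for k, v in rec.items() if k != "verification_hash"}
        if rec["verification_hash"] != entry_hash(body, prev):
            return False
        prev = rec["verification_hash"]
    return True
```

immudb generalizes this from a linear chain to a Merkle tree, which is what makes the exportable proofs for third-party auditors efficient.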

Expected Output

  • PostgreSQL trigger implementation
  • immudb integration code
  • Verification API
  • Docker deployment package
  • Compliance report templates

CODITECT Application

Universal audit layer for FDA 21 CFR Part 11, HIPAA, and SOX compliance across all CODITECT modules.


PROMPT 4: DeepSeek-R1 Local LLM Deployment for Regulated Environments

Context

Regulated industries cannot send financial data to external AI APIs. CODITECT needs air-gapped LLM capability.

Research Objective

Design a production deployment architecture for DeepSeek-R1 (or equivalent) that runs entirely on-premises.

Prompt

You are a Machine Learning Infrastructure Engineer specializing in LLM deployment for regulated financial environments.

**TASK**: Design a complete local LLM deployment architecture for financial reasoning tasks.

**REQUIREMENTS**:

1. **Model Selection & Optimization**:
- Primary: DeepSeek-R1-Distill-Qwen-32B (strong reasoning-to-cost ratio)
- Fallback: Llama 3.1 70B (Meta Llama Community License, not Apache/MIT; verify license fit per deployment)
- Quantization: AWQ/GPTQ for memory efficiency
- Context window: 32k tokens minimum

2. **Inference Infrastructure**:
- vLLM or TGI (Text Generation Inference) serving
- Multi-GPU support (tensor parallelism)
- Batched inference for throughput
- Request queuing with priority levels
- Health checks and automatic restart

3. **Financial Task Optimization**:
- Fine-tuning dataset: 50k+ financial Q&A pairs
- LoRA adapters for task-specific performance
- Prompt templates for variance analysis, forecasting, reconciliation
- Confidence scoring calibration
- Hallucination detection heuristics

4. **Compliance Integration**:
- All prompts and responses logged to immudb
- Model version tracked with each inference
- Token usage metering per tenant
- Data residency enforcement (no external calls)

5. **Performance Targets**:
- <5s latency for 500-token response
- 95%+ accuracy on financial reasoning benchmarks
- 1000 requests/hour capacity on single 8xH100 node
- Zero data leakage to external services

6. **Deliverables**:
- Kubernetes deployment manifests (StatefulSet)
- Helm chart with configurable values
- vLLM/TGI configuration files
- FastAPI wrapper with OpenAI-compatible API
- Benchmark suite for financial tasks
- Cost calculator (tokens → GPU hours → $)
- Operational runbook (scaling, monitoring, incident response)

**OUTPUT FORMAT**: Production-ready deployment package with operational documentation.

**CONSTRAINTS**:
- Must run on standard datacenter GPUs (A100/H100), no exotic hardware
- Support for Kubernetes and Docker Compose deployments
- No external API dependencies (air-gapped capable)
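The per-tenant token metering requirement above can be sketched as a small accounting class (the budgets and the 95% soft-stop threshold are illustrative assumptions; in production this would sit in the FastAPI wrapper in front of vLLM/TGI):

```python
from collections import defaultdict

class TokenMeter:
    """Tracks token usage per tenant against a fixed allocation and
    signals a soft stop at 95% so in-flight requests can finish."""

    SOFT_STOP = 0.95  # illustrative threshold

    def __init__(self, budgets: dict[str, int]):
        self.budgets = budgets                 # tenant -> token allocation
        self.used = defaultdict(int)           # tenant -> tokens consumed

    def record(self, tenant: str, tokens: int) -> None:
        """Call after each inference with the prompt + completion token count."""
        self.used[tenant] += tokens

    def should_stop(self, tenant: str) -> bool:
        """True once the tenant has consumed 95% of its allocation."""
        return self.used[tenant] >= self.SOFT_STOP * self.budgets[tenant]
```

Persisting `used` alongside the immudb audit entries would give the compliance team a verifiable usage record per tenant.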

Expected Output

  • Kubernetes manifests and Helm chart
  • vLLM/TGI configuration
  • FastAPI wrapper with OpenAI compatibility
  • Benchmark scripts
  • Operational documentation

CODITECT Application

Enables "AI-native" positioning for regulated industries without cloud AI dependency.


PROMPT 5: Airbyte Universal ERP Connector Hub

Context

CODITECT needs to ingest data from 50+ ERP/accounting systems without building custom connectors.

Research Objective

Design an Airbyte-based connector hub with COA (Chart of Accounts) harmonization layer.

Prompt

You are a Data Integration Architect specializing in financial data pipelines for multi-ERP environments.

**TASK**: Design a universal ERP connector hub using Airbyte with automated COA harmonization.

**REQUIREMENTS**:

1. **Core Connectors (Tier 1)**:
- QuickBooks Online/Desktop
- Xero
- NetSuite (SuiteScript/SuiteQL)
- SAP S/4HANA (OData)
- Microsoft Dynamics 365 (Dataverse)
- Sage Intacct

2. **Brazilian ERPs (Tier 1 for LATAM)**:
- TOTVS Protheus
- Omie
- Conta Azul
- Tactus

3. **COA Harmonization Layer**:
- ML-based account classification (Assets/Liabilities/Equity/Revenue/Expenses)
- Configurable mapping rules per tenant
- Semantic matching for unknown accounts
- Human-in-the-loop for ambiguous mappings
- Version-controlled mapping history

4. **Data Quality**:
- Automatic data validation rules
- Duplicate detection
- Missing data alerting
- Data freshness SLA monitoring

5. **dbt Integration**:
- Staging models for each source system
- Intermediate models for standardization
- Mart models for unified GL view
- Tests for referential integrity
- Documentation generation

6. **Deliverables**:
- Airbyte deployment (Kubernetes + Helm)
- Custom connector builder workflow (for proprietary ERPs)
- dbt project structure with example models
- COA mapping UI specification (React)
- Data quality dashboard (Streamlit)
- Runbook for adding new connectors

**OUTPUT FORMAT**: Complete implementation package with step-by-step setup guide.

**CONSTRAINTS**:
- All connectors must support incremental sync (CDC where available)
- Support for multi-tenant data isolation
- Compliance with LGPD/GDPR (data residency options)
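As a baseline for the ML-based account classifier described above, a naive keyword matcher shows the intended fallback behavior: anything it cannot classify is routed to human review, matching the human-in-the-loop path for ambiguous mappings (the keyword lists are illustrative, not a complete taxonomy):

```python
from typing import Optional

# Illustrative keyword lists; a real deployment would learn these
# per tenant and language (e.g., Portuguese COAs for Brazilian ERPs).
CATEGORY_KEYWORDS = {
    "Revenue": ["revenue", "sales", "income"],
    "Expenses": ["expense", "salaries", "rent", "cogs"],
    "Assets": ["cash", "receivable", "inventory", "equipment"],
    "Liabilities": ["payable", "loan", "accrued"],
    "Equity": ["equity", "retained earnings", "capital"],
}

def classify_account(name: str) -> Optional[str]:
    """Return the first matching top-level category, or None to route
    the account to human-in-the-loop review."""
    lowered = name.lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(k in lowered for k in keywords):
            return category
    return None
```

Accounts that fall through to `None` accumulate in the mapping UI's review queue, and each confirmed mapping becomes training data for the semantic matcher.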

Expected Output

  • Airbyte Kubernetes deployment
  • dbt project with models
  • COA mapping logic
  • Data quality monitoring
  • Setup documentation

CODITECT Application

"Universal connector" becomes a key differentiator vs. competitors with limited integrations.


Execution Checklist

| Prompt | Priority | Est. Time | Dependency | Owner |
|---|---|---|---|---|
| PROMPT 1: LangGraph Workflow | P0 | 4 hrs | None | AI Team |
| PROMPT 2: OpenFGA RBAC | P0 | 3 hrs | None | Security Team |
| PROMPT 3: immudb Audit | P0 | 3 hrs | PROMPT 2 | Compliance Team |
| PROMPT 4: Local LLM | P0 | 5 hrs | None | MLOps Team |
| PROMPT 5: Airbyte Hub | P1 | 4 hrs | None | Data Team |

Total Research Investment: ~19 hours
Expected Output: 5 production-ready technical specifications