
Research Prompts: Tier 1 (Highest Priority)

Overview

These prompts are designed for deep research with reasoning models (e.g., Claude Opus, DeepSeek-R1, OpenAI o1). Each prompt is self-contained, focused, and optimized for a 15-30 minute research session.

Execution Strategy: Run each prompt independently. Synthesize outputs into CODITECT technical specifications.


PROMPT 1: LangGraph Agent Architecture for Financial Workflows

Context

CODITECT needs a robust agent orchestration framework for deterministic financial workflows that require compliance checkpoints and audit trails.

Research Objective

Design a production-ready LangGraph architecture for FP&A variance analysis with built-in compliance gates.

Prompt

You are a Principal AI Engineer specializing in agentic systems for regulated financial environments. 

**TASK**: Design a complete LangGraph workflow architecture for automated Budget-vs-Actual (BvA) variance analysis.

**REQUIREMENTS**:

1. **Workflow Nodes**:
- Data Retrieval: Fetch actuals from PostgreSQL GL tables
- Budget Comparison: Compare against approved budget by dimension (department/product/region)
- Variance Detection: Identify variances exceeding configurable thresholds (% and $)
- Root Cause Analysis: Use LLM to explain variance with transaction-level drill-down
- Narrative Generation: Produce CFO-ready variance memo
- Alert Dispatch: Send notifications via Apprise (Slack/email)

2. **Compliance Checkpoints** (mandatory):
- Human approval required for variance explanations affecting regulated entities
- Audit log entry for every state transition
- Model version tracking for all LLM calls
- Confidence score threshold (reject if <0.85)

3. **Technical Constraints**:
- Use LangGraph's StateGraph with TypedDict state schema
- Implement conditional edges for threshold-based routing
- Support parallel node execution for multi-entity analysis
- Include retry logic with exponential backoff
- Token budget awareness (stop at 95% of allocation)

4. **Deliverables**:
- Complete Python implementation with type hints
- State schema definition with validation
- Mermaid diagram of workflow graph
- OpenAPI 3.0 spec for external API endpoints
- Unit test scaffolding with pytest fixtures

**OUTPUT FORMAT**: Production-ready code with inline documentation. Include ADR (Architecture Decision Record) explaining key design choices.

**CONSTRAINTS**:
- Only use Apache 2.0 or MIT licensed libraries
- Must support local LLM deployment (Ollama/vLLM) as alternative to API
- All AI outputs must be logged to immudb for audit compliance
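As a concrete starting point, the state schema, the <0.85 confidence gate, and the retry requirement above can be sketched in plain Python. Field names and the 5% / $10k materiality thresholds are illustrative assumptions; in the real implementation the router function would be wired into LangGraph's conditional edges rather than called directly:

```python
from typing import Literal, TypedDict
import random
import time

class BvAState(TypedDict):
    """Workflow state passed between nodes (hypothetical field names)."""
    entity_id: str
    variance_pct: float
    variance_usd: float
    confidence: float
    approved: bool

def route_after_detection(state: BvAState) -> Literal["root_cause", "dispatch", "human_review"]:
    """Conditional-edge router: maps the current state to the next node's name."""
    if state["confidence"] < 0.85:
        return "human_review"  # compliance gate: low-confidence output is rejected
    if abs(state["variance_pct"]) >= 5.0 or abs(state["variance_usd"]) >= 10_000:
        return "root_cause"    # material variance: drill down to transactions
    return "dispatch"          # immaterial variance: notify and move on

def with_backoff(fn, retries: int = 3, base: float = 0.5):
    """Retry helper with exponential backoff and a small jitter."""
    for attempt in range(retries):
        try:
            return fn()
        except Exception:
            if attempt == retries - 1:
                raise
            time.sleep(base * 2 ** attempt + random.random() * 0.1)
```

The same router signature works as the second argument to LangGraph's `add_conditional_edges`, keeping the compliance gate testable in isolation from the graph.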

Expected Output

  • 500-800 lines of production Python code
  • Mermaid workflow diagram
  • OpenAPI specification
  • ADR document

CODITECT Application

This becomes the core workflow engine template for all CODITECT agent orchestrations.


PROMPT 2: OpenFGA Authorization Model for Multi-Tenant Financial Data

Context

CODITECT requires a relationship-based authorization system that supports complex financial hierarchies (tenant → entity → department → cost center).

Research Objective

Design an OpenFGA authorization model for multi-tenant FP&A with separation of duties enforcement.

Prompt

You are a Security Architect specializing in policy-as-code authorization for financial systems requiring SOX compliance.

**TASK**: Design a complete OpenFGA authorization model for multi-tenant FP&A data access control.

**REQUIREMENTS**:

1. **Authorization Model (DSL)**:
- Organization hierarchy: tenant → entity → department → cost_center
- Role definitions: CFO, Controller, FP&A Analyst, Viewer, AI Service Account
- Separation of duties: Analyst cannot approve, Approver cannot edit
- Just-in-Time access for external auditors (time-bounded)
- Cross-entity permissions for consolidation users

2. **Specific Authorization Rules**:
- User can VIEW budget if: member of owning entity OR has cross-entity role
- User can EDIT budget if: member AND role in (FPA_ANALYST, CONTROLLER) AND budget.status != 'approved'
- User can APPROVE budget if: role = CFO AND budget.amount > $50k AND user.entity = budget.entity
- AI agent can READ GL if: service_account AND scope includes 'gl:read'
- AI agent can WRITE forecast if: service_account AND scope includes 'forecast:write' AND NOT 'gl:delete'

3. **Compliance Requirements**:
- All permission checks logged with timestamp, user, resource, action, result
- Quarterly permission review automation (detect stale access)
- Emergency access workflow with time-limited escalation
- Audit export in SOC 2 format

4. **Deliverables**:
- OpenFGA authorization model DSL
- Relationship tuple examples for test scenarios
- FastAPI middleware implementation (Python)
- Permission check API with batch support
- Integration tests covering all authorization rules
- Migration script from .NET Identity/Claims-based auth

**OUTPUT FORMAT**: Complete implementation files with comprehensive test coverage.

**CONSTRAINTS**:
- Must support 10k+ users with <50ms p99 permission check latency
- All policies version-controlled in Git
- Support for "dry-run" mode to test policy changes before deployment
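Before encoding the rules above in the OpenFGA DSL, they can be restated as plain-Python predicates to sanity-check the logic (role names come from the rules; the budget field names are hypothetical):

```python
def can_edit_budget(user_roles: list[str], user_entity: str, budget: dict) -> bool:
    """EDIT rule: member of the owning entity, analyst/controller role, budget not yet approved."""
    return (
        user_entity == budget["entity"]
        and bool({"FPA_ANALYST", "CONTROLLER"} & set(user_roles))
        and budget["status"] != "approved"
    )

def can_approve_budget(user_roles: list[str], user_entity: str, budget: dict) -> bool:
    """APPROVE rule: CFO only, amount above $50k, same entity.
    Separation of duties falls out of the role sets: an FPA_ANALYST
    satisfies can_edit_budget but never can_approve_budget."""
    return (
        "CFO" in user_roles
        and budget["amount"] > 50_000
        and user_entity == budget["entity"]
    )
```

These predicates double as integration-test oracles: the OpenFGA model is correct when its check API agrees with them on every tuple in the test scenarios.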

Expected Output

  • OpenFGA DSL authorization model
  • Python middleware code
  • Test scenarios covering edge cases
  • Migration scripts

CODITECT Application

Replaces custom RBAC across all CODITECT deployments; provides SOC 2-compliant audit trail.


PROMPT 3: Cryptographic Audit Trail with immudb Integration

Context

FDA 21 CFR Part 11 and SOX require tamper-evident audit trails with cryptographic verification capability.

Research Objective

Design an immudb integration layer for CODITECT that provides cryptographic proof of non-tampering.

Prompt

You are a Compliance Engineer specializing in cryptographic audit systems for FDA and SOX regulated environments.

**TASK**: Design a complete immudb integration for tamper-evident audit trails with PostgreSQL synchronization.

**REQUIREMENTS**:

1. **Audit Event Schema**:
- audit_id (ULID for time-ordered uniqueness)
- timestamp (microsecond precision)
- user_id / service_account_id
- action (CREATE/READ/UPDATE/DELETE/APPROVE/REJECT)
- resource_type (budget/forecast/gl_entry/report)
- resource_id
- previous_state (JSONB hash)
- new_state (JSONB hash)
- ai_model_version (if AI-generated)
- reasoning_trace (if AI-generated)
- ip_address / user_agent
- verification_hash (Merkle root link)

2. **Integration Architecture**:
- PostgreSQL trigger captures all changes to regulated tables
- pgaudit extension logs session-level operations
- Event forwarder pushes to immudb within 100ms
- immudb provides cryptographic verification API
- DuckDB materialized view for analytics queries

3. **Verification Capabilities**:
- Third-party auditor can verify any audit entry independently
- Merkle tree proof exportable as PDF for SOX auditors
- Alert on any detected tampering attempt
- Full audit trail reconstruction from any point in time

4. **Retention & Performance**:
- 7-year retention (SOX requirement)
- 100k+ events/day ingestion capacity
- <100ms verification latency
- Compressed cold storage for events >1 year old

5. **Deliverables**:
- PostgreSQL trigger functions (PL/pgSQL)
- Python immudb client wrapper with retry logic
- Verification API (FastAPI)
- DuckDB analytics queries for compliance reports
- Docker Compose with all components
- Integration test suite with tampering simulation

**OUTPUT FORMAT**: Production deployment package with comprehensive documentation.

**CONSTRAINTS**:
- Must work in air-gapped environments (no external API calls)
- Support for multi-tenant isolation (tenant cannot access other tenants' audit logs)
- Encryption at rest and in transit
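immudb provides Merkle-tree verification natively; the idea behind the `verification_hash` field in the schema above can be illustrated with a stdlib-only hash chain (this is not the immudb API, just the tamper-evidence principle):

```python
import hashlib
import json

GENESIS = "0" * 64  # sentinel previous-hash for the first entry

def entry_hash(entry: dict, prev_hash: str) -> str:
    """Hash the canonical JSON of an entry together with the previous
    entry's hash, forming a chain: altering any past entry breaks
    every later verification hash."""
    payload = json.dumps(entry, sort_keys=True) + prev_hash
    return hashlib.sha256(payload.encode()).hexdigest()

def append(log: list, entry: dict) -> None:
    """Append an entry, linking it to the tail of the chain."""
    prev = log[-1]["verification_hash"] if log else GENESIS
    log.append({**entry, "verification_hash": entry_hash(entry, prev)})

def verify(log: list) -> bool:
    """Recompute the chain; False means the log was tampered with."""
    prev = GENESIS
    for rec in log:
        body = {k: v for k, v in rec.items() if k != "verification_hash"}
        if rec["verification_hash"] != entry_hash(body, prev):
            return False
        prev = rec["verification_hash"]
    return True
```

immudb generalizes this from a linear chain to a Merkle tree, which is what makes the exportable proofs for third-party auditors efficient.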

Expected Output

  • PostgreSQL trigger implementation
  • immudb integration code
  • Verification API
  • Docker deployment package
  • Compliance report templates

CODITECT Application

Universal audit layer for FDA 21 CFR Part 11, HIPAA, and SOX compliance across all CODITECT modules.


PROMPT 4: DeepSeek-R1 Local LLM Deployment for Regulated Environments

Context

Regulated industries cannot send financial data to external AI APIs. CODITECT needs air-gapped LLM capability.

Research Objective

Design a production deployment architecture for DeepSeek-R1 (or equivalent) that runs entirely on-premises.

Prompt

You are a Machine Learning Infrastructure Engineer specializing in LLM deployment for regulated financial environments.

**TASK**: Design a complete local LLM deployment architecture for financial reasoning tasks.

**REQUIREMENTS**:

1. **Model Selection & Optimization**:
- Primary: DeepSeek-R1-Distill-Qwen-32B (strong reasoning-to-cost ratio)
- Fallback: Llama 3.1 70B (Meta Llama Community License, not Apache/MIT; verify license fit per deployment)
- Quantization: AWQ/GPTQ for memory efficiency
- Context window: 32k tokens minimum

2. **Inference Infrastructure**:
- vLLM or TGI (Text Generation Inference) serving
- Multi-GPU support (tensor parallelism)
- Batched inference for throughput
- Request queuing with priority levels
- Health checks and automatic restart

3. **Financial Task Optimization**:
- Fine-tuning dataset: 50k+ financial Q&A pairs
- LoRA adapters for task-specific performance
- Prompt templates for variance analysis, forecasting, reconciliation
- Confidence scoring calibration
- Hallucination detection heuristics

4. **Compliance Integration**:
- All prompts and responses logged to immudb
- Model version tracked with each inference
- Token usage metering per tenant
- Data residency enforcement (no external calls)

5. **Performance Targets**:
- <5s latency for 500-token response
- 95%+ accuracy on financial reasoning benchmarks
- 1000 requests/hour capacity on single 8xH100 node
- Zero data leakage to external services

6. **Deliverables**:
- Kubernetes deployment manifests (StatefulSet)
- Helm chart with configurable values
- vLLM/TGI configuration files
- FastAPI wrapper with OpenAI-compatible API
- Benchmark suite for financial tasks
- Cost calculator (tokens → GPU hours → $)
- Operational runbook (scaling, monitoring, incident response)

**OUTPUT FORMAT**: Production-ready deployment package with operational documentation.

**CONSTRAINTS**:
- Must run on standard datacenter GPUs (A100/H100), no exotic hardware
- Support for Kubernetes and Docker Compose deployments
- No external API dependencies (air-gapped capable)
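The per-tenant token metering requirement above can be sketched as a small accounting class (the budgets and the 95% soft-stop threshold are illustrative assumptions; in production this would sit in the FastAPI wrapper in front of vLLM/TGI):

```python
from collections import defaultdict

class TokenMeter:
    """Tracks token usage per tenant against a fixed allocation and
    signals a soft stop at 95% so in-flight requests can finish."""

    SOFT_STOP = 0.95  # illustrative threshold

    def __init__(self, budgets: dict[str, int]):
        self.budgets = budgets                 # tenant -> token allocation
        self.used = defaultdict(int)           # tenant -> tokens consumed

    def record(self, tenant: str, tokens: int) -> None:
        """Call after each inference with the prompt + completion token count."""
        self.used[tenant] += tokens

    def should_stop(self, tenant: str) -> bool:
        """True once the tenant has consumed 95% of its allocation."""
        return self.used[tenant] >= self.SOFT_STOP * self.budgets[tenant]
```

Persisting `used` alongside the immudb audit entries would give the compliance team a verifiable usage record per tenant.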

Expected Output

  • Kubernetes manifests and Helm chart
  • vLLM/TGI configuration
  • FastAPI wrapper with OpenAI compatibility
  • Benchmark scripts
  • Operational documentation

CODITECT Application

Enables "AI-native" positioning for regulated industries without cloud AI dependency.


PROMPT 5: Airbyte Universal ERP Connector Hub

Context

CODITECT needs to ingest data from 50+ ERP/accounting systems without building custom connectors.

Research Objective

Design an Airbyte-based connector hub with COA (Chart of Accounts) harmonization layer.

Prompt

You are a Data Integration Architect specializing in financial data pipelines for multi-ERP environments.

**TASK**: Design a universal ERP connector hub using Airbyte with automated COA harmonization.

**REQUIREMENTS**:

1. **Core Connectors (Tier 1)**:
- QuickBooks Online/Desktop
- Xero
- NetSuite (SuiteScript/SuiteQL)
- SAP S/4HANA (OData)
- Microsoft Dynamics 365 (Dataverse)
- Sage Intacct

2. **Brazilian ERPs (Tier 1 for LATAM)**:
- TOTVS Protheus
- Omie
- Conta Azul
- Tactus

3. **COA Harmonization Layer**:
- ML-based account classification (Assets/Liabilities/Equity/Revenue/Expenses)
- Configurable mapping rules per tenant
- Semantic matching for unknown accounts
- Human-in-the-loop for ambiguous mappings
- Version-controlled mapping history

4. **Data Quality**:
- Automatic data validation rules
- Duplicate detection
- Missing data alerting
- Data freshness SLA monitoring

5. **dbt Integration**:
- Staging models for each source system
- Intermediate models for standardization
- Mart models for unified GL view
- Tests for referential integrity
- Documentation generation

6. **Deliverables**:
- Airbyte deployment (Kubernetes + Helm)
- Custom connector builder workflow (for proprietary ERPs)
- dbt project structure with example models
- COA mapping UI specification (React)
- Data quality dashboard (Streamlit)
- Runbook for adding new connectors

**OUTPUT FORMAT**: Complete implementation package with step-by-step setup guide.

**CONSTRAINTS**:
- All connectors must support incremental sync (CDC where available)
- Support for multi-tenant data isolation
- Compliance with LGPD/GDPR (data residency options)
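As a baseline for the ML-based account classifier described above, a naive keyword matcher shows the intended fallback behavior: anything it cannot classify is routed to human review, matching the human-in-the-loop path for ambiguous mappings (the keyword lists are illustrative, not a complete taxonomy):

```python
from typing import Optional

# Illustrative keyword lists; a real deployment would learn these
# per tenant and language (e.g., Portuguese COAs for Brazilian ERPs).
CATEGORY_KEYWORDS = {
    "Revenue": ["revenue", "sales", "income"],
    "Expenses": ["expense", "salaries", "rent", "cogs"],
    "Assets": ["cash", "receivable", "inventory", "equipment"],
    "Liabilities": ["payable", "loan", "accrued"],
    "Equity": ["equity", "retained earnings", "capital"],
}

def classify_account(name: str) -> Optional[str]:
    """Return the first matching top-level category, or None to route
    the account to human-in-the-loop review."""
    lowered = name.lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(k in lowered for k in keywords):
            return category
    return None
```

Accounts that fall through to `None` accumulate in the mapping UI's review queue, and each confirmed mapping becomes training data for the semantic matcher.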

Expected Output

  • Airbyte Kubernetes deployment
  • dbt project with models
  • COA mapping logic
  • Data quality monitoring
  • Setup documentation

CODITECT Application

"Universal connector" becomes a key differentiator vs. competitors with limited integrations.


Execution Checklist

| Prompt | Priority | Est. Time | Dependency | Owner |
|---|---|---|---|---|
| PROMPT 1: LangGraph Workflow | P0 | 4 hrs | None | AI Team |
| PROMPT 2: OpenFGA RBAC | P0 | 3 hrs | None | Security Team |
| PROMPT 3: immudb Audit | P0 | 3 hrs | PROMPT 2 | Compliance Team |
| PROMPT 4: Local LLM | P0 | 5 hrs | None | MLOps Team |
| PROMPT 5: Airbyte Hub | P1 | 4 hrs | None | Data Team |

Total Research Investment: ~19 hours
Expected Output: 5 production-ready technical specifications