Research Prompts: Tier 2 (Secondary Priority)
Overview
These prompts address important but not critical capabilities. Execute after Tier 1 research is complete.
PROMPT 6: NeuralProphet Cash Flow Forecasting Pipeline
Context
Accurate cash flow forecasting is the #1 request from CFOs. The NeuralProphet authors report a 55-92% accuracy improvement over Prophet on short- to medium-term forecasts.
Research Objective
Design a production forecasting pipeline with confidence intervals and scenario modeling.
Prompt
You are a Time-Series ML Engineer specializing in financial forecasting for FP&A applications.
**TASK**: Design a complete cash flow forecasting pipeline using NeuralProphet with explainability features.
**REQUIREMENTS**:
1. **Forecasting Model**:
- 13-week rolling cash flow forecast (standard treasury horizon)
- Seasonality detection (weekly, monthly, quarterly, annual)
- Holiday effects (configurable per country)
- Exogenous regressors: AR aging, AP aging, pipeline stage
- Confidence intervals: P10/P50/P90
2. **Training Pipeline**:
- Minimum 24 months historical data
- Automatic hyperparameter tuning (Optuna)
- Cross-validation with expanding window
- Model versioning with MLflow
- Retraining trigger on accuracy degradation
3. **Scenario Modeling**:
- Best case: Pipeline close rates +20%
- Base case: Historical patterns
- Worst case: Economic stress scenario
- User-defined custom scenarios
4. **Explainability**:
- Component breakdown (trend, seasonality, exogenous)
- Driver attribution: "Revenue forecast +15% due to Q4 seasonality"
- Confidence level explanation
- Data quality impact on forecast reliability
5. **Integration Points**:
- Input: PostgreSQL GL data, CRM pipeline, AR/AP aging
- Output: Forecast table with version tracking
- API: FastAPI endpoint for on-demand forecasting
- Visualization: Plotly interactive charts
6. **Deliverables**:
- Training pipeline (Python + Dagster)
- NeuralProphet model configuration
- Scenario engine with parameter injection
- FastAPI forecast service
- Streamlit dashboard for visualization
- Benchmark against Prophet/ARIMA baselines
**OUTPUT FORMAT**: Production pipeline with MLOps best practices.
**CONSTRAINTS**:
- Must handle missing data gracefully
- Support multi-entity forecasting (consolidation)
- Forecast generation <30s for 5-year history
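The "cross-validation with expanding window" requirement above can be sketched in plain Python before committing to any particular library. This is a minimal illustration, not NeuralProphet's own API: the function name `expanding_window_splits`, the 52-week initial training window, and the 13-week step are assumptions chosen to match the 13-week horizon and 24-month minimum history in the prompt.

```python
from typing import Iterator, Tuple


def expanding_window_splits(
    n_obs: int,
    initial_train: int,
    horizon: int = 13,
    step: int = 13,
) -> Iterator[Tuple[range, range]]:
    """Yield (train_idx, test_idx) index ranges for expanding-window CV.

    Each fold trains on observations [0, train_end) and tests on the next
    `horizon` observations; train_end grows by `step` between folds, so the
    training window only ever expands and never looks into the future.
    """
    if initial_train <= 0 or horizon <= 0 or step <= 0:
        raise ValueError("initial_train, horizon and step must be positive")
    train_end = initial_train
    while train_end + horizon <= n_obs:
        yield range(0, train_end), range(train_end, train_end + horizon)
        train_end += step


# 24 months of weekly history is roughly 104 observations (assumption).
folds = list(expanding_window_splits(n_obs=104, initial_train=52))
for train_idx, test_idx in folds:
    print(
        f"train weeks 0-{train_idx[-1]}, "
        f"test weeks {test_idx[0]}-{test_idx[-1]}"
    )
```

With 104 weekly observations this yields four folds. Each fold's test window is exactly one 13-week treasury horizon, so per-fold error can be compared directly against the Prophet/ARIMA baselines.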
CODITECT Application
"AI-powered forecasting" becomes a premium feature for FP&A automation pack.
PROMPT 7: Month-End Close Automation Workflow
Context
Month-end close takes 10-15 days at many companies. The target for AI-assisted automation is to reduce this to 3-5 days.
Research Objective
Design an agent-based month-end close workflow with automation and human checkpoints.
Prompt
You are a Finance Operations Architect specializing in month-end close optimization for mid-market companies.
**TASK**: Design a complete month-end close automation workflow using CODITECT agent patterns.
**REQUIREMENTS**:
1. **Close Process Steps** (in order):
- Pre-close checklist generation
- Bank reconciliation (auto-match statements to GL)
- Intercompany reconciliation
- Accrual calculations (rent, payroll, utilities)
- Revenue recognition review
- AP/AR cutoff verification
- Journal entry processing
- Trial balance generation
- Variance analysis
- Financial statement preparation
- Close certification
2. **Automation Targets**:
- Bank reconciliation: 95% auto-match (ML-based)
- Accrual calculations: Pattern-based with anomaly detection
- Journal entries: Auto-generated, human-approved
- Variance analysis: AI-generated explanations
3. **Compliance Gates** (mandatory human approval):
- Material adjustments >$10k
- Non-routine journal entries
- Revenue recognition judgments
- Intercompany eliminations
- Close certification sign-off
4. **Multi-Entity Coordination**:
- Subsidiary close → Parent consolidation
- Inter-entity dependency tracking
- Close status dashboard (entity × step matrix)
- Bottleneck identification and escalation
5. **Audit Trail**:
- Every step logged with timestamp
- Supporting documentation linkage
- AI decision reasoning captured
- Change history for all entries
6. **Deliverables**:
- LangGraph workflow definition
- Agent specifications for each automation task
- Checkpoint approval UI wireframes
- Close calendar Gantt chart generator
- KPI dashboard (days to close, automation rate)
- Playbook for implementing at new client
**OUTPUT FORMAT**: Complete implementation guide with deployment timeline.
**CONSTRAINTS**:
- Must integrate with existing GL systems (read-only initially)
- Parallel processing where dependencies allow
- Graceful degradation if automation fails (fallback to manual)
CODITECT Application
This becomes the flagship feature of the FP&A Automation Pack—tangible ROI demonstration.
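As a rough illustration of "parallel processing where dependencies allow" and the ">$10k" compliance gate in PROMPT 7, the close steps can be grouped into parallel batches with the standard-library `graphlib` module. The dependency map below is a hypothetical reading of the step list, not a statement of how any particular close must be sequenced, and the names `CLOSE_DEPENDENCIES`, `parallel_batches`, and `requires_human_approval` are illustrative.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: step -> steps that must finish first.
CLOSE_DEPENDENCIES = {
    "pre_close_checklist": set(),
    "bank_reconciliation": {"pre_close_checklist"},
    "intercompany_reconciliation": {"pre_close_checklist"},
    "accrual_calculations": {"pre_close_checklist"},
    "revenue_recognition_review": {"pre_close_checklist"},
    "ap_ar_cutoff_verification": {"pre_close_checklist"},
    "journal_entry_processing": {
        "bank_reconciliation",
        "intercompany_reconciliation",
        "accrual_calculations",
        "revenue_recognition_review",
        "ap_ar_cutoff_verification",
    },
    "trial_balance_generation": {"journal_entry_processing"},
    "variance_analysis": {"trial_balance_generation"},
    "financial_statement_preparation": {"trial_balance_generation"},
    "close_certification": {
        "variance_analysis",
        "financial_statement_preparation",
    },
}


def parallel_batches(dependencies):
    """Group steps into ordered batches; steps within a batch are independent."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)
    return batches


MATERIALITY_THRESHOLD = 10_000  # from the ">$10k" compliance gate


def requires_human_approval(amount: float, routine: bool) -> bool:
    """Material adjustments and non-routine entries always need sign-off."""
    return abs(amount) > MATERIALITY_THRESHOLD or not routine


for number, batch in enumerate(parallel_batches(CLOSE_DEPENDENCIES), start=1):
    print(number, batch)
```

Under this assumed map the five reconciliation and review steps form a single parallel batch after the pre-close checklist, which is where most of the calendar time would be recovered. The same batches could feed the entity × step status matrix.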
PROMPT 8: Open Finance Brazil Integration
Context
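BACEN (Banco Central do Brasil) mandates Open Finance participation for the largest Brazilian financial institutions (prudential segments S1 and S2); other institutions may join voluntarily. This creates a connector opportunity.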
Research Objective
Design a BACEN-compliant Open Finance integration for real-time bank data access.
Prompt
You are a Brazilian Fintech Architect specializing in Open Finance (Open Banking) integrations under BACEN regulation.
**TASK**: Design a complete Open Finance Brazil integration for FP&A data ingestion.
**REQUIREMENTS**:
1. **Open Finance Capabilities**:
- Consent management (request, track, revoke)
- Account balance retrieval (real-time)
- Transaction history (up to 12 months)
- Credit operations visibility
- Investment positions
- Payment initiation (Pix API)
2. **BACEN Compliance**:
- TPP (Third-Party Provider) registration requirements
- Consent flow per BACEN specs (OAuth 2.0 + FAPI)
- Data retention and deletion policies
- Security standards (mTLS, JWS signatures)
- Incident reporting procedures
3. **Integration Architecture**:
- OAuth 2.0 authorization server
- Webhook receiver for bank notifications
- Data normalization layer (bank-agnostic schema)
- Multi-bank aggregation view
- Rate limiting and retry logic
4. **FP&A Value-Add**:
- Automatic bank reconciliation from Open Finance data
- Cash position dashboard (multi-bank)
- Transaction categorization (ML-based)
- Fraud detection signals
- Treasury forecasting inputs
5. **Supported Banks** (Phase 1):
- Banco do Brasil
- Itaú Unibanco
- Bradesco
- Santander Brasil
- Caixa Econômica Federal
- Nubank, Inter, C6 (digital banks)
6. **Deliverables**:
- BACEN compliance checklist
- Authorization server implementation
- Bank connector adapters (per institution)
- Data model for normalized transactions
- Consent management UI
- Integration test suite with sandbox environments
**OUTPUT FORMAT**: Compliance-ready implementation with BACEN documentation.
**CONSTRAINTS**:
- Must support both legacy (COBOL) and modern bank APIs
- LGPD compliance for personal financial data
- Disaster recovery for financial data (RTO <4hrs)
CODITECT Application
Creates a 12-18 month competitive moat in the Brazilian market; no competitor currently offers this integration.
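The "rate limiting and retry logic" item in PROMPT 8 is commonly handled with exponential backoff and jitter. The sketch below is one possible shape, assuming each bank connector raises a hypothetical `RetryableError` for transient failures such as throttling; the function name, defaults, and injectable `sleep` parameter are illustrative and not taken from any Open Finance specification.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RetryableError(Exception):
    """Hypothetical error a connector raises for transient failures."""


def call_with_backoff(
    fn: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying RetryableError with capped exponential backoff.

    The wait before retry k is drawn uniformly from [0, cap], where
    cap = min(max_delay, base_delay * 2**(k - 1)) ("full jitter").
    Any other exception propagates immediately.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    attempt = 1
    while True:
        try:
            return fn()
        except RetryableError:
            if attempt >= max_attempts:
                raise  # give up and surface the last transient error
        cap = min(max_delay, base_delay * 2 ** (attempt - 1))
        sleep(random.uniform(0, cap))
        attempt += 1


# Demo with a fake sleep so the example finishes instantly.
state = {"calls": 0}


def flaky_balance_lookup() -> str:
    state["calls"] += 1
    if state["calls"] < 3:
        raise RetryableError("bank API throttled")
    return "balance payload"


print(call_with_backoff(flaky_balance_lookup, sleep=lambda seconds: None))
```

Injecting `sleep` keeps the helper testable without real delays. Jitter spreads retries from many tenants so they do not hit the same bank endpoint in lockstep.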
PROMPT 9: PostgreSQL Multi-Tenant RLS Architecture
Context
SaaS financial platforms require bulletproof tenant isolation with excellent query performance.
Research Objective
Design a PostgreSQL multi-tenant architecture using Row-Level Security with performance optimization.
Prompt
You are a Database Architect specializing in multi-tenant SaaS platforms handling sensitive financial data.
**TASK**: Design a complete PostgreSQL multi-tenant architecture using Row-Level Security (RLS).
**REQUIREMENTS**:
1. **Tenant Isolation Model**:
- Schema: Shared schema, tenant_id column on all tables
- RLS policies enforcing tenant isolation at row level
- Connection pooling with tenant context injection
- Cross-tenant queries explicitly disabled (except admin)
2. **Performance Optimization**:
- Composite indexes including tenant_id
- Partial indexes for active tenants
- Table partitioning by tenant_id for large tables
- Connection pooler (PgBouncer/Pgcat) configuration
- Query plan caching considerations
3. **Financial Data Model**:
- Core tables: tenants, users, entities, accounts, gl_transactions, budgets, forecasts
- Audit tables with RLS
- Multi-currency support (base + transaction currency)
- Fiscal calendar flexibility (non-calendar years)
4. **Security Layers**:
- RLS policies (tenant isolation)
- Column-level encryption (PII fields)
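- Encryption at rest: TDE (Transparent Data Encryption) where the PostgreSQL distribution provides it, otherwise volume- or disk-level encryption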
- SSL/TLS in transit
- Audit logging (pgaudit)
5. **Operational Considerations**:
- Tenant onboarding automation
- Tenant data export/deletion (GDPR/LGPD)
- Performance monitoring per tenant
- Noisy neighbor detection and throttling
- Backup/restore isolation
6. **Deliverables**:
- Complete DDL with RLS policies
- Index strategy document
- Partitioning implementation
- Connection pooler configuration
- Tenant management API
- Performance benchmark results (10k concurrent users, 1,000 tenants)
- Migration scripts from single-tenant
**OUTPUT FORMAT**: Production DDL with operational documentation.
**CONSTRAINTS**:
- Target: 10k+ concurrent users, 1000+ tenants
- P95 query latency <100ms for dashboard queries
- Support for PostgreSQL 16+ features
CODITECT Application
Foundation for all CODITECT SaaS deployments—ensures data isolation for regulated customers.
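To make the RLS requirement in PROMPT 9 concrete, the per-table DDL can be generated from a small Python helper. This is a sketch under stated assumptions: `tenant_id` is a `uuid` column, the tenant context lives in a custom setting named `app.current_tenant` (a hypothetical name), and the helper names are illustrative. The SQL statements themselves use standard PostgreSQL syntax (`ENABLE`/`FORCE ROW LEVEL SECURITY`, `CREATE POLICY`, `current_setting`, `set_config`).

```python
import re

_IDENT = re.compile(r"^[a-z_][a-z0-9_]*$")
_SETTING = re.compile(r"^[a-z_][a-z0-9_]*\.[a-z_][a-z0-9_]*$")

# Parameterised statement to run at the start of each transaction.
# is_local=true scopes the value to the transaction, which is what a
# transaction-mode connection pooler needs.
SET_TENANT_SQL = "SELECT set_config('app.current_tenant', %s, true)"


def rls_ddl(table: str, setting: str = "app.current_tenant") -> str:
    """Return DDL enabling tenant isolation on one shared-schema table."""
    if not _IDENT.match(table):
        raise ValueError(f"unsafe table name: {table!r}")
    if not _SETTING.match(setting):
        raise ValueError(f"unsafe setting name: {setting!r}")
    predicate = f"tenant_id = current_setting('{setting}')::uuid"
    return "\n".join(
        [
            f"ALTER TABLE {table} ENABLE ROW LEVEL SECURITY;",
            # FORCE applies the policy to the table owner as well.
            f"ALTER TABLE {table} FORCE ROW LEVEL SECURITY;",
            f"CREATE POLICY tenant_isolation ON {table}",
            f"    USING ({predicate})",
            f"    WITH CHECK ({predicate});",
            f"CREATE INDEX {table}_tenant_id_idx ON {table} (tenant_id);",
        ]
    )


# Tenant-scoped tables from the financial data model in the prompt.
for name in ["users", "entities", "accounts", "gl_transactions", "budgets", "forecasts"]:
    print(rls_ddl(name))
    print()
```

Because `current_setting` raises an error when the setting is unset, a connection that never injected a tenant fails closed rather than seeing every row. Superusers and roles with `BYPASSRLS` still bypass policies, so the admin exception in the prompt would map onto a dedicated role.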
PROMPT 10: Variance Analysis Natural Language Generation
Context
CFOs want AI to explain budget variances in plain English with actionable insights.
Research Objective
Design an NLG system that produces CFO-ready variance explanations.
Prompt
You are an AI/NLP Engineer specializing in natural language generation for financial reporting.
**TASK**: Design an NLG system for automated budget variance explanations.
**REQUIREMENTS**:
1. **Input Analysis**:
- Budget vs. actual by account/department/entity
- Variance amount ($ and %)
- Transaction-level drill-down data
- Historical variance patterns
- Business context (seasonality, known events)
2. **Explanation Components**:
- Headline: "Marketing exceeded budget by $50k (15%)"
- Root cause: "Driven by unplanned Q2 conference ($30k) + agency fees ($20k)"
- Trend context: "This is 3rd consecutive month of overspend"
- Benchmark: "Industry average marketing spend is 12% of revenue; we're at 18%"
- Recommendation: "Consider reallocating Q3 budget or seeking approval for increase"
3. **Tone Calibration**:
- CFO-level: Strategic, high-level, action-oriented
- Controller-level: Detailed, accounting-accurate, reference-heavy
- Board-level: Executive summary, key metrics only
4. **Quality Controls**:
- Factual accuracy validation against source data
- Numerical consistency checks
- Hallucination detection (claims not supported by data)
- Confidence scoring with threshold for human review
5. **Templates & Personalization**:
- Company-specific terminology injection
- Historical narrative style matching
- Configurable detail levels
- Multi-language support (English, Portuguese)
6. **Deliverables**:
- Prompt engineering templates for variance analysis
- LLM chain for data → insight → narrative
- Quality validation pipeline
- A/B testing framework for narrative quality
- User feedback collection for continuous improvement
- Integration with LangGraph workflow (PROMPT 1)
**OUTPUT FORMAT**: NLG system with quality assurance framework.
**CONSTRAINTS**:
- Must cite specific transactions in explanations
- No invented data or numbers
- Latency <10s for single variance explanation
- Support for batch processing (monthly report generation)
CODITECT Application
Differentiates CODITECT AI from competitors—explainability is the key to finance user trust.
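The "no invented data or numbers" constraint in PROMPT 10 can be partly enforced with a deterministic check that runs after the LLM. The sketch below extracts dollar amounts and percentages from a narrative and flags any that are absent from the source figures. The regular expressions assume amounts are written like `$50k`, `$1,200.50`, or `$2m` and percentages like `15%`; the function names are illustrative.

```python
import re
from decimal import Decimal

_MONEY = re.compile(r"\$(\d[\d,]*(?:\.\d+)?)(k|m)?\b", re.IGNORECASE)
_PERCENT = re.compile(r"(\d+(?:\.\d+)?)%")
_SCALE = {"": Decimal(1), "k": Decimal(1_000), "m": Decimal(1_000_000)}


def extract_figures(text: str):
    """Return (dollar amounts, percentages) mentioned in text as Decimal sets."""
    amounts = {
        Decimal(number.replace(",", "")) * _SCALE[suffix.lower()]
        for number, suffix in _MONEY.findall(text)
    }
    percents = {Decimal(p) for p in _PERCENT.findall(text)}
    return amounts, percents


def unsupported_figures(narrative: str, allowed_amounts, allowed_percents):
    """Return figures in the narrative that the source data does not contain."""
    amounts, percents = extract_figures(narrative)
    return (
        sorted(amounts - set(allowed_amounts)),
        sorted(percents - set(allowed_percents)),
    )


# Example built from the headline and root-cause lines in the prompt.
narrative = (
    "Marketing exceeded budget by $50k (15%). Driven by unplanned Q2 "
    "conference ($30k) + agency fees ($20k)."
)
source_amounts = {Decimal(50_000), Decimal(30_000), Decimal(20_000)}
source_percents = {Decimal(15)}
print(unsupported_figures(narrative, source_amounts, source_percents))
```

An empty result on both sides means every figure in the narrative traces back to source data. Any non-empty result could route the explanation to human review instead of publishing it. This catches fabricated numbers only; unsupported qualitative claims still need a separate check.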
PROMPT 11: Dagster Asset-Centric Data Orchestration
Context
Data lineage tracking is essential for audit compliance. Dagster models pipelines as data assets and tracks lineage between them natively, whereas Airflow is organized around tasks.
Research Objective
Design a Dagster orchestration layer for FP&A data pipelines with full lineage tracking.
Prompt
You are a Data Engineering Architect specializing in modern data stack orchestration for regulated industries.
**TASK**: Design a Dagster-based data orchestration layer for FP&A pipelines.
**REQUIREMENTS**:
1. **Asset Definitions**:
- Raw: ERP extracts (QuickBooks, NetSuite, Xero, etc.)
- Staged: Cleaned, validated source data
- Intermediate: COA-harmonized transactions
- Marts: Unified GL, budget comparison, forecasts
- Reports: DRE (Demonstração do Resultado do Exercício, the Brazilian income statement), P&L, cash flow statements
2. **Orchestration Patterns**:
- Scheduled: Nightly ELT, weekly forecast refresh
- Event-triggered: Real-time bank sync, invoice processing
- Manual: Ad-hoc report generation, scenario modeling
- Dependency-aware: Proper sequencing of transformations
3. **Observability**:
- Asset lineage visualization (end-to-end)
- Data quality metrics per asset
- Freshness monitoring with SLA alerting
- Failure impact analysis (downstream dependencies)
4. **dbt Integration**:
- Software-defined assets from dbt models
- Unified lineage across Python + SQL
- Test execution as asset materialization dependency
- Documentation sync to Dagster catalog
5. **Compliance Features**:
- Execution logs for audit trail
- Data versioning snapshots
- Access control per asset group
- Retention policy enforcement
6. **Deliverables**:
- Dagster project structure
- Asset definitions for complete FP&A pipeline
- Sensor configurations for event triggers
- Schedule definitions
- CI/CD pipeline (GitHub Actions)
- Operational dashboard (Dagster UI customization)
- Runbook for common operational tasks
**OUTPUT FORMAT**: Production Dagster deployment with CI/CD.
**CONSTRAINTS**:
- Must handle 10M+ rows daily throughput
- Parallel execution for independent assets
- Graceful handling of source system outages
CODITECT Application
Replaces manual data pipeline management; provides audit-ready lineage documentation.
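The "failure impact analysis (downstream dependencies)" item in PROMPT 11 reduces to a graph traversal over asset dependencies. Dagster maintains this graph itself, so the plain-Python sketch below only illustrates the idea. The asset names are hypothetical stand-ins for the Raw, Staged, Intermediate, Marts, and Reports layers listed in the prompt.

```python
from collections import deque

# asset -> upstream assets it depends on (hypothetical FP&A pipeline)
UPSTREAM = {
    "raw_erp_extracts": [],
    "staged_source_data": ["raw_erp_extracts"],
    "coa_harmonized_transactions": ["staged_source_data"],
    "unified_gl": ["coa_harmonized_transactions"],
    "budget_comparison": ["unified_gl"],
    "forecasts": ["unified_gl"],
    "pnl_report": ["unified_gl", "budget_comparison"],
    "cash_flow_statement": ["unified_gl", "forecasts"],
}


def downstream_of(failed: str, upstream: dict) -> list:
    """Return every asset that transitively depends on `failed`."""
    # Invert the upstream map so each asset points at its direct consumers.
    consumers = {}
    for asset, deps in upstream.items():
        for dep in deps:
            consumers.setdefault(dep, []).append(asset)

    # Breadth-first walk from the failed asset through its consumers.
    impacted = set()
    queue = deque([failed])
    while queue:
        for consumer in consumers.get(queue.popleft(), []):
            if consumer not in impacted:
                impacted.add(consumer)
                queue.append(consumer)
    return sorted(impacted)


print(downstream_of("forecasts", UPSTREAM))
print(downstream_of("unified_gl", UPSTREAM))
```

A failure in `forecasts` touches only the cash flow statement here, while a failure in `unified_gl` stales every mart and report. That difference is what an SLA alert would need to convey.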
PROMPT 12: Brazilian ERP Integration Pack (Totvs/Omie/Conta Azul)
Context
The Brazilian market requires native ERP integrations that no competitor currently offers.
Research Objective
Design integration connectors for the top 3 Brazilian ERPs with SPED/NF-e compatibility.
Prompt
You are a Brazilian ERP Integration Specialist with expertise in Totvs Protheus, Omie, and Conta Azul APIs.
**TASK**: Design integration connectors for Brazilian ERPs with local compliance features.
**REQUIREMENTS**:
1. **Totvs Protheus**:
- REST API integration (newer versions)
- AdvPL function calls (legacy)
- Standard Totvs tables: SE1 (AR), SE2 (AP), CT2 (GL)
- SPED fiscal integration
- Multi-branch support
2. **Omie**:
- REST API (modern, well-documented)
- Webhook support for real-time sync
- Data model: contas_receber, contas_pagar, lancamentos
- NF-e/NFS-e integration
- Multi-company support
3. **Conta Azul**:
- REST API integration
- OAuth 2.0 authentication
- Data model: vendas, compras, lancamentos_bancarios
- Bank sync reconciliation
- Accountant access (contador feature)
4. **Common Features**:
- Incremental sync (last_modified timestamp)
- Error handling with automatic retry
- Data validation (CPF/CNPJ, fiscal codes)
- COA mapping to unified schema
- Multi-currency (BRL primary, USD for US GAAP reporting)
5. **Brazilian Tax Compliance**:
- SPED Contábil export
- SPED Fiscal data extraction
- NF-e XML parsing
- ICMS/ISS/PIS/COFINS classification
- DCTF supporting data
6. **Deliverables**:
- Airbyte custom connector (Python)
- dbt models for Brazilian accounting patterns
- Tax classification ML model (auto-NCM mapping)
- Integration test suite with sandbox accounts
- Setup guide in Portuguese
- Support escalation playbook
**OUTPUT FORMAT**: Production connectors with Brazilian market documentation.
**CONSTRAINTS**:
- Must handle Brazilian date/number formats
- Support for CPF/CNPJ masking (LGPD)
- Accountant user workflows (separate permissions)
CODITECT Application
Creates an unassailable competitive moat in the R$500B Brazilian SMB market.
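The "data validation (CPF/CNPJ)" and "CPF/CNPJ masking (LGPD)" items in PROMPT 12 rest on the mod-11 check-digit rules for both identifiers. The sketch below covers the numeric formats only; the function names and the choice to keep the last two digits visible when masking are assumptions, not a statement of what LGPD requires.

```python
import re


def _digits(value: str) -> list:
    """Strip punctuation such as '.', '/' and '-' and return the digits."""
    return [int(ch) for ch in re.sub(r"\D", "", value)]


def _check_digit(digits, weights) -> int:
    """Mod-11 check digit shared by CPF and CNPJ."""
    remainder = sum(d * w for d, w in zip(digits, weights)) % 11
    return 0 if remainder < 2 else 11 - remainder


def is_valid_cpf(value: str) -> bool:
    d = _digits(value)
    # Repeated-digit sequences pass the checksum but are not valid CPFs.
    if len(d) != 11 or len(set(d)) == 1:
        return False
    first = _check_digit(d[:9], range(10, 1, -1))    # weights 10..2
    second = _check_digit(d[:10], range(11, 1, -1))  # weights 11..2
    return d[9:] == [first, second]


def is_valid_cnpj(value: str) -> bool:
    d = _digits(value)
    if len(d) != 14 or len(set(d)) == 1:
        return False
    weights_first = [5, 4, 3, 2, 9, 8, 7, 6, 5, 4, 3, 2]
    weights_second = [6] + weights_first
    first = _check_digit(d[:12], weights_first)
    second = _check_digit(d[:13], weights_second)
    return d[12:] == [first, second]


def mask_document(value: str, visible: int = 2) -> str:
    """Replace all but the last `visible` digits with '*'."""
    digits = re.sub(r"\D", "", value)
    visible = min(visible, len(digits))
    return "*" * (len(digits) - visible) + digits[len(digits) - visible:]


print(is_valid_cpf("529.982.247-25"))
print(is_valid_cnpj("11.222.333/0001-81"))
print(mask_document("529.982.247-25"))
```

Running validation at ingestion keeps malformed identifiers out of the unified schema. Masking at the presentation layer leaves the stored value intact for accountant workflows that hold the separate permissions mentioned in the constraints.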
Execution Checklist
| Prompt | Priority | Est. Time | Dependency | Owner |
|---|---|---|---|---|
| PROMPT 6: NeuralProphet | P1 | 4 hrs | PROMPT 5 | Data Science |
| PROMPT 7: Month-End Close | P1 | 5 hrs | PROMPT 1 | Product |
| PROMPT 8: Open Finance Brazil | P1 | 6 hrs | None | Integrations |
| PROMPT 9: PostgreSQL RLS | P1 | 4 hrs | None | Platform |
| PROMPT 10: NLG Variance | P1 | 3 hrs | PROMPT 4 | AI Team |
| PROMPT 11: Dagster | P2 | 4 hrs | PROMPT 5 | Data Eng |
| PROMPT 12: Brazilian ERPs | P2 | 6 hrs | PROMPT 5 | Integrations |
Total Research Investment: ~32 hours
Expected Output: 7 production-ready specifications