CONFIDENTIAL — AZ1.AI Inc. — Internal Use Only
CFS-005: AI Strategy & Machine Learning Architecture
1. Executive Summary
AI is not a feature of the CODITECT Financial Suite — it is the architectural foundation. The global shortage of 300,000+ accounting professionals demands that AI handle the volume work (document processing, categorization, reconciliation, compliance checking) while humans focus on judgment work (advisory, strategy, exception handling).
Core principle: Every financial workflow has an AI co-pilot. Not replacing accountants — amplifying them 10x.
2. AI Capability Matrix
| Module | AI Capability | Model/Approach | Accuracy Target | Phase |
|---|---|---|---|---|
| Document Intelligence | OCR + Entity Extraction + Classification + Auto-Coding | LayoutLM v3 + Custom NER + Classification | >95% extraction, >90% auto-coding | 1 |
| Bank Reconciliation | Transaction Matching | Fuzzy matching + ML ranking + rules | >90% auto-match | 1 |
| General Ledger | Anomaly Detection + Auto-Categorization | Statistical (Z-score, IQR) + Isolation Forest | >85% auto-categorization | 1 |
| Accounts Payable | Invoice Processing + Duplicate Detection | Vision + NLP pipeline + similarity scoring | >92% straight-through processing | 2 |
| Accounts Receivable | Payment Prediction + Dunning Optimization | Time-series + Classification + survival analysis | >80% payment date prediction | 2 |
| Tax Engine | Jurisdiction Detection + Filing Prep | Rule engine + NLP for document analysis | >99% jurisdiction accuracy | 2 |
| FP&A | NLQ + Forecasting + Variance Explanation | Claude API + NeuralProphet + SHAP | MAPE <15% (12-month) | 2 |
| Month-End Close | Bottleneck Prediction + Auto-Scheduling | Process mining + optimization | 30% close time reduction | 3 |
| Practice Management | Client Risk Scoring + Deadline Prediction | Classification + survival analysis | >85% risk classification | 3 |
| Consolidation | Intercompany Matching + Elimination Suggestion | Pattern matching + rule engine | >95% auto-elimination | 3 |
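One matrix entry, statistical GL anomaly detection (Z-score, IQR), can be sketched with the standard library alone. The 3-sigma and 1.5×IQR cutoffs below are conventional defaults assumed for illustration, not values from this spec; production adds Isolation Forest on top.

```python
import statistics

def zscore_outliers(amounts, threshold=3.0):
    """Flag GL entries more than `threshold` standard deviations from the mean."""
    mean = statistics.fmean(amounts)
    stdev = statistics.stdev(amounts)
    return [a for a in amounts if stdev and abs(a - mean) / stdev > threshold]

def iqr_outliers(amounts, k=1.5):
    """Flag entries outside the Tukey fences (Q1 - k*IQR, Q3 + k*IQR)."""
    q1, _, q3 = statistics.quantiles(amounts, n=4)
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    return [a for a in amounts if a < lo or a > hi]

entries = [120.0, 95.0, 110.0, 105.0, 98.0, 102.0, 5000.0]
print(iqr_outliers(entries))  # [5000.0] — the outsized posting stands out
```

Running both detectors matters: a single extreme posting inflates the standard deviation enough to mask itself from the Z-score test on small samples, while the IQR fences still catch it.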
3. Document Intelligence Pipeline (Deep Dive)
Stage 1: Ingestion
- Formats: PDF, JPG/PNG/TIFF (image), email attachment, XML (UBL/CII), EDI, CSV
- Channels: Upload, email forward, API, mobile camera capture, bulk import
- Pre-processing: De-skew, noise removal, resolution enhancement for images
Stage 2: OCR Engine
- Primary: Tesseract 5 (open-source, multi-language; CPU-based with SIMD acceleration)
- Secondary: EasyOCR (better on handwriting, Asian languages)
- Fallback: Google Cloud Vision API (cloud fallback for difficult documents)
- Languages: PT, EN, ES, FR, DE, IT, PL, JA, ZH, AR (minimum 10)
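The primary/secondary/fallback chain above amounts to confidence-based escalation. A minimal sketch, with stub functions standing in for the Tesseract, EasyOCR, and Cloud Vision integrations (each returning text plus a 0-1 mean confidence); the 0.85 escalation threshold is an assumed value, not from this spec:

```python
MIN_CONFIDENCE = 0.85  # assumed escalation threshold

def run_ocr(image, engines):
    """Try engines in order; escalate while confidence stays below threshold,
    keeping the best result seen so far."""
    best = ("", 0.0)
    for engine in engines:
        text, conf = engine(image)
        if conf > best[1]:
            best = (text, conf)
        if conf >= MIN_CONFIDENCE:
            break  # good enough; stop escalating (cloud calls cost money)
    return best

# Stubs simulating engine behavior on a difficult scan.
tesseract = lambda img: ("lnvoice #1O23", 0.62)   # struggles: l/1, O/0 confusion
easyocr   = lambda img: ("Invoice #1023", 0.78)
cloud_api = lambda img: ("Invoice #1023", 0.97)

text, conf = run_ocr(b"...", [tesseract, easyocr, cloud_api])
print(text, conf)  # Invoice #1023 0.97
```

Ordering engines cheapest-first means the paid cloud fallback is only invoked for the minority of documents the local engines cannot read confidently.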
Stage 3: Layout Analysis
- Model: LayoutLM v3 (Microsoft) — understands document structure
- Outputs: Identifies headers, tables, line items, totals, tax sections
- Training: Fine-tuned on financial document corpus (invoices, receipts, bank statements)
Stage 4: Entity Extraction
- Amounts: Total, subtotal, tax, line item amounts, currency symbols
- Dates: Invoice date, due date, payment date, period
- Identifiers: Vendor/customer name, tax ID (CNPJ, EIN, VAT), invoice number
- Line Items: Description, quantity, unit price, amount, account code
- Currencies: Detection from symbols, codes, or context
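Stage 4 in the real pipeline is a trained NER model; patterns like the following serve only as an illustrative rule-based backstop for the entity families listed above (the regexes and the sample document are assumptions, not production rules):

```python
import re

PATTERNS = {
    "amount":  re.compile(r"(?:R\$|\$|€|£)\s?\d{1,3}(?:[.,]\d{3})*[.,]\d{2}"),
    "date":    re.compile(r"\b\d{2}[/-]\d{2}[/-]\d{4}\b"),
    "cnpj":    re.compile(r"\b\d{2}\.\d{3}\.\d{3}/\d{4}-\d{2}\b"),  # BR tax ID
    "ein":     re.compile(r"\b\d{2}-\d{7}\b"),                      # US tax ID
    "invoice": re.compile(r"(?i)\binvoice\s*#?\s*([\w-]+)"),
}

def extract_entities(text):
    """Return every match per entity type; downstream stages resolve conflicts."""
    return {name: rx.findall(text) for name, rx in PATTERNS.items()}

doc = "Invoice #A-1023 dated 05/11/2026, CNPJ 12.345.678/0001-95, total R$ 1.234,56"
print(extract_entities(doc)["amount"])  # ['R$ 1.234,56']
```

Note the amount pattern accepts both `1,234.56` and the Brazilian `1.234,56` grouping; locale disambiguation is left to the currency-detection step.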
Stage 5: Document Classification
- Categories: Invoice (AP/AR), receipt, bank statement, contract, tax form, payroll, expense report
- Sub-categories: By vendor type, expense category, department
- Confidence: High (>95%), Medium (80-95%), Low (<80%)
Stage 6: GL Account Auto-Coding
- Method: Historical pattern matching (what accounts were used for similar transactions)
- Features: Vendor, amount range, description keywords, line items, document type
- Model: Gradient boosted trees (XGBoost) trained on tenant-specific history
- Fallback: Global model for new tenants, transitioning to tenant-specific as data accumulates
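The tenant-specific/global fallback strategy can be shown with a deliberately simplified frequency model; the real system trains XGBoost over the feature set above, and the account codes and 0.5 global-prior confidence here are hypothetical:

```python
from collections import Counter, defaultdict

class AutoCoder:
    """Prefer tenant posting history; fall back to global priors for new tenants."""

    def __init__(self, global_priors):
        self.global_priors = global_priors    # vendor -> GL account (global model stand-in)
        self.history = defaultdict(Counter)   # vendor -> per-account posting counts

    def record(self, vendor, account):
        self.history[vendor][account] += 1    # learn from every confirmed posting

    def suggest(self, vendor):
        if self.history[vendor]:
            account, n = self.history[vendor].most_common(1)[0]
            conf = n / sum(self.history[vendor].values())
            return account, conf, "tenant"
        # no tenant history yet: fall back to the global model's prior
        return self.global_priors.get(vendor, "6999-uncategorized"), 0.5, "global"

coder = AutoCoder({"AWS": "6400-cloud-services"})
print(coder.suggest("AWS"))        # global fallback for a brand-new tenant
coder.record("AWS", "6410-infra")
print(coder.suggest("AWS"))        # tenant history now wins
```

The transition the document describes is exactly this shape: as `record` calls accumulate, tenant-specific evidence overrides the global prior vendor by vendor.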
Stage 7: Confidence Scoring & Routing
| Confidence | Action |
|---|---|
| >95% | Auto-post (configurable threshold per tenant) |
| 80-95% | Review queue with AI suggestion pre-filled |
| <80% | Manual processing queue |
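The routing table translates directly into a small dispatch function; thresholds default to the values above, with the auto-post bar configurable per tenant as noted:

```python
def route_document(confidence, auto_post_threshold=0.95, review_threshold=0.80):
    """Route an extracted document by model confidence.
    auto_post_threshold is the per-tenant configurable knob."""
    if confidence > auto_post_threshold:
        return "auto_post"
    if confidence >= review_threshold:
        return "review_queue"   # AI suggestion pre-filled for human approval
    return "manual_queue"

assert route_document(0.97) == "auto_post"
assert route_document(0.85) == "review_queue"
assert route_document(0.60) == "manual_queue"
# A conservative tenant can raise the bar:
assert route_document(0.96, auto_post_threshold=0.99) == "review_queue"
```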
Stage 8: Learning Loop
- Every correction by a human is logged
- Monthly model retraining on correction data
- Accuracy metrics tracked per tenant, per document type
- Goal: auto-post rate drifts upward over time as model accuracy improves
4. Natural Language Query (NLQ) Engine
Architecture
User Input: "What's my AP aging over 90 days by vendor?"
│
▼
Intent Classification (Claude API)
→ Domain: AP
→ Query type: Aging report
→ Filters: >90 days, group by vendor
│
▼
SQL Generation (Claude API + schema context)
→ SELECT vendor_name, SUM(amount) AS total_due, MAX(days_outstanding) AS oldest
  FROM ap_invoices WHERE days_outstanding > 90
  GROUP BY vendor_name ORDER BY total_due DESC
│
▼
Safety Validation
→ Read-only? ✓
→ Row limit? ✓ (LIMIT 10000)
→ Timeout? ✓ (30s)
→ Tenant-scoped? ✓ (RLS active)
│
▼
Execution (read replica)
│
▼
Response Generation (Claude API)
→ "You have $245,000 in AP aging over 90 days across 12 vendors.
The top 3 are: Supplier ABC ($89,000), Vendor XYZ ($52,000)..."
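The safety-validation step in the diagram can be sketched as a pre-execution gate. This is a regex-level illustration; a production gate would parse the statement AST (e.g. with a SQL parser) rather than pattern-match, and tenant scoping is enforced separately by RLS, as noted above:

```python
import re

FORBIDDEN = re.compile(r"\b(insert|update|delete|drop|alter|truncate|grant|copy)\b", re.I)
MAX_ROWS, TIMEOUT_S = 10_000, 30   # matches the LIMIT 10000 / 30s checks above

def validate_generated_sql(sql):
    """Reject anything but a single read-only SELECT, then enforce a row cap.
    Returns the (possibly amended) SQL to run on the read replica."""
    stripped = sql.strip().rstrip(";")
    if ";" in stripped:
        raise ValueError("multiple statements not allowed")
    if not stripped.lower().startswith(("select", "with")):
        raise ValueError("only SELECT queries are allowed")
    if FORBIDDEN.search(stripped):
        raise ValueError("write/DDL keyword detected")
    if not re.search(r"\blimit\b", stripped, re.I):
        stripped += f" LIMIT {MAX_ROWS}"   # enforce the row cap
    return stripped  # execute with statement_timeout = TIMEOUT_S seconds

print(validate_generated_sql(
    "SELECT vendor_name FROM ap_invoices WHERE days_outstanding > 90"))
```

Validating the generated SQL rather than trusting the LLM keeps prompt-injection and hallucinated-write risks out of the execution path.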
Multi-Language NLQ
- Query in Portuguese: "Qual é meu contas a pagar vencido há mais de 90 dias?" ("What is my accounts payable overdue by more than 90 days?")
- SQL generation is language-agnostic (schema is English)
- Response generated in the user's language
- Account labels displayed in user's language (from i18n system)
5. Forecasting Engine
Model Ensemble
| Model | Strengths | Use Case |
|---|---|---|
| NeuralProphet | Seasonality, trend, changepoints | Revenue, expense forecasting |
| ARIMA/SARIMAX | Statistical rigor, well-understood | Baseline comparison |
| ETS | Simple, fast, interpretable | Short-term cash flow |
| Ensemble | Combines strengths, reduces variance | Production forecasts |
Features
- Time features: Day of week, month, quarter, fiscal period, holidays
- Macro indicators: GDP growth, inflation rate, exchange rates, industry indices
- Business features: Headcount, sales pipeline, contract values
- Lagged features: Prior period actuals, year-over-year trends
Output
- Point forecast + 80% and 95% confidence intervals
- 3 scenarios: base (median), optimistic (p90), pessimistic (p10)
- Monte Carlo simulation for cash flow scenarios (1,000 iterations)
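The Monte Carlo scenario output (base/p10/p90 over 1,000 iterations) can be sketched as follows. The normal draws for monthly inflows/outflows are an illustrative assumption; production samples from the fitted forecast distribution instead:

```python
import random

def simulate_cash_flow(opening, inflow_mu, inflow_sigma, outflow_mu, outflow_sigma,
                       months=12, iterations=1000, seed=42):
    """Simulate ending cash balance; return the three scenario percentiles."""
    rng = random.Random(seed)
    endings = []
    for _ in range(iterations):
        balance = opening
        for _ in range(months):
            # one month's net flow, drawn independently
            balance += rng.gauss(inflow_mu, inflow_sigma) - rng.gauss(outflow_mu, outflow_sigma)
        endings.append(balance)
    endings.sort()
    pct = lambda p: endings[int(p * (iterations - 1))]
    return {"p10": pct(0.10), "base": pct(0.50), "p90": pct(0.90)}

scenarios = simulate_cash_flow(100_000, 50_000, 8_000, 45_000, 6_000)
print({k: round(v) for k, v in scenarios.items()})
```

Fixing the seed makes scenario runs reproducible for audit purposes, which matters more here than statistical purity.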
Explainability
- SHAP values for every forecast driver
- Natural language explanation: "Revenue is projected to increase 12% primarily due to seasonal Q4 uplift (40% contribution) and 3 new client acquisitions (35% contribution)."
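Turning driver attributions into the sentence shown above is a templating step over the ranked contributions (here a plain dict standing in for normalized SHAP values):

```python
def explain_forecast(metric, change_pct, contributions):
    """Render top-2 drivers (driver -> share of change) as a one-line explanation."""
    top = sorted(contributions.items(), key=lambda kv: -kv[1])[:2]
    drivers = " and ".join(f"{name} ({share:.0%} contribution)" for name, share in top)
    direction = "increase" if change_pct >= 0 else "decrease"
    return f"{metric} is projected to {direction} {abs(change_pct):.0%} primarily due to {drivers}."

print(explain_forecast("Revenue", 0.12, {
    "seasonal Q4 uplift": 0.40,
    "3 new client acquisitions": 0.35,
    "pricing changes": 0.25,
}))
# Revenue is projected to increase 12% primarily due to seasonal Q4 uplift
# (40% contribution) and 3 new client acquisitions (35% contribution).
```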
6. LLM Strategy
Model Selection
| Use Case | Model | Deployment | Cost per MTok (in/out) |
|---|---|---|---|
| NLQ (complex reasoning) | Claude Sonnet 4 | API | $3/$15 |
| Financial analysis | Claude Sonnet 4 | API | $3/$15 |
| Document classification | Mistral 7B | Local (Ollama) | ~$0.001 |
| Entity extraction | Custom NER | Local (PyTorch) | ~$0.0001 |
| Anomaly explanation | Claude Haiku 4 | API | $0.25/$1.25 |
| Auto-coding suggestion | XGBoost | Local (scikit-learn) | ~$0.00001 |
Privacy Architecture
- Sensitive data (financial amounts, PII): Processed by local models only
- Anonymized data (patterns, categories): May use cloud APIs
- No training on customer data: Strict contractual commitment
- Prompt sanitization: PII stripped before any cloud API call
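The prompt-sanitization step can be illustrated with a regex redaction pass; real coverage is broader (names, addresses, account numbers via NER, not just patterns), and these specific rules are assumptions for the sketch:

```python
import re

REDACTIONS = [
    (re.compile(r"\b\d{2}\.\d{3}\.\d{3}/\d{4}-\d{2}\b"), "[CNPJ]"),  # BR tax ID
    (re.compile(r"\b\d{2}-\d{7}\b"), "[EIN]"),                        # US tax ID
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "[EMAIL]"),
    (re.compile(r"(?:R\$|\$|€)\s?[\d.,]+\d"), "[AMOUNT]"),            # money values
]

def sanitize_prompt(text):
    """Strip PII/financial specifics before any cloud API call."""
    for rx, token in REDACTIONS:
        text = rx.sub(token, text)
    return text

prompt = "Vendor Acme (CNPJ 12.345.678/0001-95, ap@acme.com) invoiced R$ 1.234,56"
print(sanitize_prompt(prompt))
# Vendor Acme (CNPJ [CNPJ], [EMAIL]) invoiced [AMOUNT]
```

Redacting with typed tokens (rather than deleting) preserves enough structure for the cloud model to reason about the pattern without ever seeing the sensitive value.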
Cost Optimization
- Route 80% of AI tasks to local models (low cost)
- Reserve cloud APIs for complex reasoning tasks (20%)
- Estimated AI cost: $0.50-1.00 per client per month at scale
- Cache common NLQ queries and responses
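The NLQ-caching item can be sketched as a TTL cache keyed on (tenant, normalized question); a dict stands in for whatever shared store production uses, and the 15-minute TTL is an assumed default:

```python
import hashlib
import time

class NLQCache:
    """Cache LLM-generated NLQ responses per tenant with a freshness window."""

    def __init__(self, ttl_seconds=900):
        self.ttl = ttl_seconds
        self.store = {}

    def _key(self, tenant_id, question):
        # normalize whitespace/case so trivial variants hit the same entry
        normalized = " ".join(question.lower().split())
        return hashlib.sha256(f"{tenant_id}:{normalized}".encode()).hexdigest()

    def get(self, tenant_id, question):
        entry = self.store.get(self._key(tenant_id, question))
        if entry and time.monotonic() - entry[0] < self.ttl:
            return entry[1]   # cache hit: no LLM call, no SQL run
        return None

    def put(self, tenant_id, question, response):
        self.store[self._key(tenant_id, question)] = (time.monotonic(), response)

cache = NLQCache()
cache.put("t1", "What's my AP aging over 90 days?", "$245,000 across 12 vendors")
print(cache.get("t1", "  what's  my ap aging over 90 days? "))  # normalized hit
print(cache.get("t2", "What's my AP aging over 90 days?"))       # tenant-scoped: None
```

Including the tenant ID in the hash key makes cross-tenant cache leakage structurally impossible, complementing the RLS guarantee on the SQL side.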
7. AI Safety & Governance
| Principle | Implementation |
|---|---|
| Human-in-the-loop | All AI outputs are suggestions; auto-posting requires configurable confidence threshold |
| Explainability | Every AI decision includes explanation (SHAP, attention weights, rule trace) |
| Audit trail | All AI-assisted transactions logged: model version, confidence, explanation, human override |
| Bias detection | Monthly audit of auto-categorization patterns for systematic bias |
| Model versioning | All models versioned; rollback capability within 1 hour |
| Regulatory limits | No AI for final tax filing decisions without human review |
| Error bounds | Confidence thresholds configurable per tenant; default conservative |
| Feedback loop | Every correction improves the model, via batch retraining rather than real-time updates (prevents adversarial manipulation) |
8. AI Development Roadmap
| Phase | Timeline | Capabilities | Key Metrics |
|---|---|---|---|
| Phase 1 | Months 1-6 | Document Intelligence v1, Bank Rec AI matching, GL anomaly detection | >85% match rate, >90% OCR accuracy |
| Phase 2 | Months 7-12 | NLQ engine, Forecasting, AP automation, auto-categorization | <15% MAPE, >90% straight-through |
| Phase 3 | Months 13-18 | Full AI co-pilot, Practice management intelligence, Consolidation automation | 10x client capacity, 30% close reduction |
| Phase 4 | Months 19-24 | Autonomous agent workflows, marketplace AI models, federated learning | Autonomous reconciliation, cross-tenant model improvement |
9. AI Infrastructure
| Component | Specification | Purpose |
|---|---|---|
| Inference GPU | NVIDIA L4 or A10G (GKE) | LayoutLM, local LLM inference |
| Training GPU | NVIDIA A100 (on-demand) | Monthly model retraining |
| Model Serving | vLLM (local LLMs) | High-throughput inference |
| ML Tracking | MLflow | Experiment tracking, model registry |
| Feature Store | Feast or custom PostgreSQL | Centralized feature engineering |
| Model Monitoring | Custom (Prometheus metrics) | Drift detection, accuracy degradation |
| Data Labeling | Label Studio (self-hosted) | Human-in-the-loop quality control |
Cost Model (Per Tenant/Month at Scale)
| Component | Cost |
|---|---|
| GPU inference (shared) | $0.15 |
| Cloud LLM API (Claude) | $0.30 |
| Storage (models, embeddings) | $0.05 |
| Total AI cost/tenant | $0.50 |
At $65 ARPC, AI cost is <1% of revenue — highly favorable unit economics.
10. Competitive AI Analysis
| Competitor | AI Investment | Capabilities | Architecture |
|---|---|---|---|
| QuickBooks | $100M+ (Intuit AI) | Basic categorization, receipt scanning | Bolt-on to legacy |
| Xero | $50M+ | Basic bank rec suggestions, invoice reminders | Feature-level AI |
| Sage | $80M+ (Sage Copilot) | NLQ on Sage data, basic automation | Copilot layer |
| Oracle NetSuite | $200M+ (Oracle AI) | Anomaly detection, basic forecasting | Enterprise bolt-on |
| SAP | $500M+ (Joule) | Generic enterprise AI assistant | Cross-product, shallow |
| Microsoft D365 | $1B+ (Copilot) | Generic Copilot across Office + D365 | Broad, not deep |
| CODITECT | Purpose-built | Deep domain AI in every workflow | AI-first architecture |
CODITECT's structural advantage: Incumbents are adding AI to 20-year-old codebases. CODITECT builds AI into the data model, the workflow engine, and the user experience from day one. Every table has embedding columns. Every workflow has an AI decision point. Every interaction generates training data.
Hal Casteel, CEO/CTO, AZ1.AI Inc.
Copyright © 2026 AZ1.AI Inc. All rights reserved. Unauthorized distribution or reproduction is strictly prohibited.