ADR-005: Cumulative Knowledge Moat as Competitive Strategy
Status
Accepted
Context
CODITECT competes against workflow-based tools (Cursor, Copilot) and general-purpose document processing platforms (Unstructured, LlamaParse). A sustainable competitive advantage requires a moat that deepens with usage — not just a feature that can be replicated. The UDOM Pipeline's output (high-fidelity typed components from scientific papers) has properties that create such a moat.
Decision
Position the UDOM Pipeline as a cumulative knowledge asset strategy in which every paper a tenant processes becomes a queryable, structured knowledge component in that tenant's proprietary corpus. The moat mechanism is switching cost: leaving CODITECT means abandoning or rebuilding the entire processed knowledge base. A tenant with 10,000 papers processed through UDOM (each with 300+ typed components, quality scores, provenance chains, and cross-paper citation links) cannot migrate this structured corpus intact to a competitor that offers only raw text extraction.
Key value creation layers:
- Direct value: Time savings (360–1,440× vs. manual) and quality improvement (100% Grade A)
- Platform value: Agents gain structured knowledge tools (search equations, compare tables, traverse citations)
- Moat value: Per-tenant corpora are cumulative, proprietary, and non-portable — the processed knowledge base IS the product
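To make the corpus structure concrete, the following is a minimal in-memory sketch of what a per-tenant corpus with typed components and agent-facing tools might look like. It is not the UDOM schema: the class names (`Component`, `Paper`, `TenantCorpus`), the field names, and the 0.0 to 1.0 quality score range are all assumptions chosen for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Component:
    # One typed component extracted from a paper (hypothetical schema).
    kind: str             # e.g. "equation", "table", "citation"
    content: str
    quality_score: float  # assumed range: 0.0 to 1.0
    provenance: str       # assumed pointer back to the source location


@dataclass
class Paper:
    paper_id: str
    title: str
    components: list[Component] = field(default_factory=list)
    cites: list[str] = field(default_factory=list)  # paper_ids of cited papers


class TenantCorpus:
    """In-memory stand-in for one tenant's proprietary corpus."""

    def __init__(self) -> None:
        self.papers: dict[str, Paper] = {}

    def add(self, paper: Paper) -> None:
        self.papers[paper.paper_id] = paper

    def search(self, kind: str, term: str) -> list[tuple[str, Component]]:
        """Return (paper_id, component) pairs of one kind whose content contains term."""
        term = term.lower()
        return [
            (paper.paper_id, comp)
            for paper in self.papers.values()
            for comp in paper.components
            if comp.kind == kind and term in comp.content.lower()
        ]

    def traverse_citations(self, start: str, depth: int) -> set[str]:
        """Collect paper_ids reachable from start within `depth` citation hops."""
        seen: set[str] = set()
        frontier = {start}
        for _ in range(depth):
            next_frontier: set[str] = set()
            for pid in frontier:
                paper = self.papers.get(pid)
                if paper is None:
                    continue  # cited paper has not been processed into the corpus
                for cited in paper.cites:
                    if cited != start and cited not in seen:
                        next_frontier.add(cited)
            seen |= next_frontier
            frontier = next_frontier
        return seen

    def component_count(self) -> int:
        """Total number of typed components held across the corpus."""
        return sum(len(p.components) for p in self.papers.values())
```

In this sketch, `search("equation", ...)` and `traverse_citations(...)` stand in for the "search equations" and "traverse citations" agent tools. At the scale quoted in the decision, 10,000 papers with at least 300 components each would put `component_count()` at 3,000,000 or more, which is the asset a tenant would have to rebuild elsewhere.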
Consequences
Positive:
- Switching cost increases with each paper processed — natural retention mechanism
- Network effects within a tenant's corpus: more papers → better cross-paper analysis → more value per paper
- Aligns CODITECT's pricing with value delivered (per-paper processing + corpus access)
- Differentiates from competitors who can only match individual extraction quality, not the cumulative corpus
Negative:
- Requires investment in corpus management features (search, deduplication, citation graphs) beyond extraction
- Risk of vendor lock-in perception if not paired with export capabilities
Neutral:
- Export in standard formats (markdown, BibTeX, JSON) should be available to mitigate lock-in concerns while preserving structured value
- The moat is the structured, typed, quality-scored corpus — raw export is available, but the intelligence layer is CODITECT-specific
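The export mitigation can be sketched as three small serializers over a single paper record. This is an illustrative sketch only: `PaperRecord` and its fields (`authors`, `year`, the shape of each component dict) are hypothetical and not taken from the UDOM Pipeline. It uses only the Python standard library.

```python
from __future__ import annotations

import json
from dataclasses import asdict, dataclass


@dataclass
class PaperRecord:
    # Hypothetical minimal record; all field names are assumptions.
    paper_id: str
    title: str
    authors: list[str]
    year: int
    components: list[dict]  # each: {"kind": ..., "content": ..., "quality_score": ...}


def to_json(record: PaperRecord) -> str:
    """Full-fidelity export: every field, including quality scores."""
    return json.dumps(asdict(record), indent=2, sort_keys=True)


def to_bibtex(record: PaperRecord) -> str:
    """Bibliographic export: citation metadata only."""
    authors = " and ".join(record.authors)
    return (
        f"@article{{{record.paper_id},\n"
        f"  title = {{{record.title}}},\n"
        f"  author = {{{authors}}},\n"
        f"  year = {{{record.year}}}\n"
        f"}}"
    )


def to_markdown(record: PaperRecord) -> str:
    """Readable export: component kind and content, without scores."""
    lines = [f"# {record.title}", ""]
    for comp in record.components:
        lines.append(f"- **{comp['kind']}**: {comp['content']}")
    return "\n".join(lines)
```

In this sketch the JSON form carries the typed components and scores, while the markdown and BibTeX forms flatten them. That mirrors the distinction drawn above: the data can leave, but the query and cross-paper analysis layer built on top of it does not travel with the files.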
Alternatives Considered
- Feature-based differentiation only: Compete on extraction quality alone. Rejected: any single extraction improvement can be replicated; quality is necessary but not sufficient for a moat.
- Open-source the pipeline, monetize hosting: Builds community but eliminates moat. Rejected for enterprise SaaS positioning in regulated industries.
- API-only model (no stored corpus): Process papers on demand, don't persist. Rejected: eliminates cumulative value and cross-paper analysis capabilities.