ADR-207: Research Artifact Organization Taxonomy & Promotion Lifecycle

Status

Accepted — 2026-02-16 (all 3 phases implemented: structure, pipeline, migration)

Context

Problem Statement

CODITECT's analyze-new-artifacts/ staging directory contains 98 research efforts spanning technology evaluations, academic paper analyses, competitive intelligence, and domain research. This directory is gitignored (backed up to GCS) and functions as the landing zone for the ADR-206 research pipeline's outputs.

The problem is artifact abandonment: the pipeline generates 15-20 artifacts per research effort (SDDs, TDDs, ADRs, C4 diagrams, executive summaries, dashboards, glossaries), but only one artifact type — the integration assessment — has a defined permanent home (internal/analysis/). The remaining artifacts stay in gitignored staging with no promotion workflow, making them:

Invisible — not searchable by future sessions or grep
Ephemeral — lost when staging is cleaned up
Disconnected — no manifest tracking what research produced what outputs
Unstructured — 98 flat directories with no categorization

Scale of the Problem

Metric	Current State
Total staging directories	98
Permanent analysis docs	20 (in `internal/analysis/`)
Permanent research docs	7 (in `internal/research/`)
Estimated valuable artifacts in staging	~500+ files
Artifacts with permanent homes	~20 (assessments only)
Promotion rate	~4% (20/500)

Prior Art

ADR-206: Defines the research pipeline that creates artifacts but does not address promotion
internal/analysis/: Established pattern for integration assessments (20 topic directories)
internal/research/: Established pattern for strategic research (7 directories)
internal/architecture/adrs/: Established pattern for architecture decisions (206 ADRs)
Analysis Preservation Protocol (CLAUDE.md): Requires standalone documents for all analysis work

Inputs

This ADR synthesizes proposals from two independent analysis agents:

Documentation Librarian — Proposed 6-category taxonomy, 4-stage lifecycle, and manifest system with 18-week migration plan
Project Organizer — Proposed concrete directory structure, YAML manifest schema, promotion rules, naming conventions, and artifact-promoter agent design

Full proposals preserved at:

internal/analysis/research-artifact-organization/research-artifact-organization-taxonomy-proposal-2026-02-16.md (2,109 lines)
internal/analysis/research-artifact-organization/QUICK-REFERENCE.md (309 lines)

Decision

1. Six-Category Research Taxonomy

All research efforts are classified into one of six categories. Category determines the staging subdirectory and influences promotion rules.

Category	Purpose	Examples
`technology-evaluation`	Assess external tools/frameworks for CODITECT integration	agent-labs, copilotkit, codex, sidecar
`academic`	Extract patterns from academic papers and research	scaling-agent-systems, continual-learning
`competitive-intelligence`	Monitor competitors and market positioning	anthropic-cowork, palantir, kimi-2.5
`business-market`	Validate market opportunity and business models	product-market-fit, financial-model
`domain`	Industry-specific workflows and compliance	bioscience-workorders, healthcare
`process-internal`	CODITECT process improvements and tooling	system-prompt, docker-registry
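To make the category list machine-checkable, the six values could be captured in an enumeration. The following is a minimal Python sketch; the names `ResearchCategory` and `parse_category` are illustrative and do not come from the proposals.

```python
from enum import Enum


class ResearchCategory(str, Enum):
    """The six categories from the taxonomy table."""

    TECHNOLOGY_EVALUATION = "technology-evaluation"
    ACADEMIC = "academic"
    COMPETITIVE_INTELLIGENCE = "competitive-intelligence"
    BUSINESS_MARKET = "business-market"
    DOMAIN = "domain"
    PROCESS_INTERNAL = "process-internal"


def parse_category(raw: str) -> ResearchCategory:
    """Return the matching category; unknown values raise ValueError."""
    return ResearchCategory(raw.strip().lower())
```

Unknown values raise `ValueError`, so an ambiguous effort has to be resolved explicitly by the caller rather than silently misfiled.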

2. Four-Stage Artifact Lifecycle

Stage 1: STAGING          → analyze-new-artifacts/{topic}/     (gitignored)
    ↓  promotion trigger: research complete + assessment written
Stage 2: ANALYSIS          → internal/analysis/{topic}/        (git-tracked)
    ↓  promotion trigger: Go decision (ADR approved)
Stage 3: INTEGRATION       → internal/architecture/ + internal/research/  (git-tracked)
    ↓  trigger: superseded, >12 months stale, or decision reversed
Stage 4: ARCHIVE           → internal/research/archive/        (git-tracked)

Stage transitions are one-way — artifacts never move backward. Archive retains a superseded_by: reference.
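The one-way rule and the archive triggers can be expressed as two small predicates. This is a sketch under stated assumptions: the ADR does not say whether a stage may be skipped, so any forward move is allowed here, and "12 months" is approximated as 365 days. The names `Stage`, `can_transition`, and `should_archive` are hypothetical.

```python
from datetime import date, timedelta
from enum import IntEnum


class Stage(IntEnum):
    """The four lifecycle stages, ordered so that later stages compare greater."""

    STAGING = 1
    ANALYSIS = 2
    INTEGRATION = 3
    ARCHIVE = 4


def can_transition(current: Stage, target: Stage) -> bool:
    """Allow only forward moves; artifacts never move backward."""
    return target > current


def should_archive(
    *, superseded: bool, decision_reversed: bool, last_updated: date, today: date
) -> bool:
    """Apply the three archive triggers: superseded, stale, or decision reversed."""
    # Assumption: "12 months" is treated as 365 days.
    stale = (today - last_updated) > timedelta(days=365)
    return superseded or decision_reversed or stale
```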

3. Directory Structure Additions

New directories added to the existing internal/ hierarchy:

internal/
├── architecture/
│   ├── adrs/                    # EXISTING — ADRs (flat numbered)
│   ├── c4-models/               # NEW — C4 architecture diagrams by topic
│   │   └── {topic}/
│   │       ├── context.mmd
│   │       ├── container.mmd
│   │       ├── component.mmd
│   │       └── README.md
│   ├── sdd/                     # NEW — Software Design Documents
│   │   ├── {topic}-sdd.md
│   │   └── README.md
│   └── tdd/                     # NEW — Technical Design Documents
│       ├── {topic}-tdd.md
│       └── README.md
│
├── analysis/                    # EXISTING — Integration assessments
│   └── {topic}/                 # Continue existing pattern
│
├── research/                    # EXISTING — Strategic research
│   ├── manifests/               # NEW — Research effort YAML manifests
│   │   ├── YYYY-MM-DD-{topic}.yaml
│   │   └── README.md            # Auto-generated registry
│   ├── executive-summaries/     # NEW — Go/no-go recommendations
│   │   ├── YYYY-MM-DD-{topic}-executive-summary.md
│   │   └── README.md
│   ├── glossaries/              # NEW — Domain terminology references
│   │   ├── {domain}-glossary.md
│   │   └── README.md
│   ├── quick-start-guides/      # NEW — Experimental how-to guides
│   │   ├── {topic}-quick-start.md
│   │   └── README.md
│   └── archive/                 # NEW — Superseded research
│       └── {topic}/
│
└── dashboards/                  # NEW — Interactive React components
    └── research/
        └── {topic}/
            ├── {Name}Dashboard.tsx
            └── README.md

4. Promotion Rules

Artifact Type	Staging → Analysis	Analysis → Integration	Permanent Location
Integration Assessment	ALWAYS	Never (stays in analysis)	`internal/analysis/{topic}/`
Executive Summary	ALWAYS	Never (stays in research)	`internal/research/executive-summaries/`
Manifest	ALWAYS	N/A (created at promotion)	`internal/research/manifests/`
SDD	If Go decision	ALWAYS	`internal/architecture/sdd/`
TDD	If Go decision	ALWAYS	`internal/architecture/tdd/`
ADRs	When finalized	ALWAYS	`internal/architecture/adrs/` (flat numbered)
C4 Diagrams	If Go decision	ALWAYS	`internal/architecture/c4-models/{topic}/`
Glossary	If 10+ terms	If reusable	`internal/research/glossaries/`
Quick Start	If repeatable	Never	`internal/research/quick-start-guides/`
Dashboards (TSX)	If reusable	Never	`internal/dashboards/research/{topic}/`
System Prompts	NEVER	N/A	Stays in staging
Raw Inputs (PDFs)	NEVER	N/A	Stays in staging
Pipeline Logs	NEVER	N/A	Stays in staging
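The Staging → Analysis column of the table above could be encoded as one predicate per artifact type. The sketch below is illustrative only: `EffortFacts`, its field names, and most of the type keys are hypothetical (only `executive-summary` and `sdd` appear in the manifest schema), and it is not the artifact-promoter agent's actual logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EffortFacts:
    """Facts about one research effort; field names are hypothetical."""

    go_decision: bool = False
    adr_finalized: bool = False
    glossary_terms: int = 0
    repeatable: bool = False
    reusable: bool = False


# One predicate per artifact type, mirroring the Staging -> Analysis column.
STAGING_TO_ANALYSIS = {
    "integration-assessment": lambda f: True,
    "executive-summary": lambda f: True,
    "manifest": lambda f: True,
    "sdd": lambda f: f.go_decision,
    "tdd": lambda f: f.go_decision,
    "adr": lambda f: f.adr_finalized,
    "c4-diagram": lambda f: f.go_decision,
    "glossary": lambda f: f.glossary_terms >= 10,
    "quick-start": lambda f: f.repeatable,
    "dashboard": lambda f: f.reusable,
    "system-prompt": lambda f: False,
    "raw-input": lambda f: False,
    "pipeline-log": lambda f: False,
}


def should_promote(artifact_type: str, facts: EffortFacts) -> bool:
    """Return True if the artifact leaves staging; unknown types stay put."""
    rule = STAGING_TO_ANALYSIS.get(artifact_type)
    return bool(rule and rule(facts))
```

Treating unknown types as "stay in staging" is a conservative choice made for this sketch; it keeps a new artifact type from being promoted before a rule is written for it.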

5. YAML Manifest Schema

Each research effort gets a manifest tracking inputs, outputs, and promotion status.

Location: internal/research/manifests/YYYY-MM-DD-{topic}.yaml

Schema:

metadata:
  research_id: "YYYY-MM-DD-{topic}"
  topic: "Human-Readable Topic Name"
  category: "technology-evaluation"  # One of 6 categories
  date_conducted: "YYYY-MM-DD"
  pipeline_version: "1.0.0"           # ADR-206 pipeline version or "manual"
  recommendation: "ADOPT"             # ADOPT | DEFER | REJECT | MONITOR
  confidence: "HIGH"                  # HIGH | MEDIUM | LOW

input_sources:
  type: "git-repository"              # git-repository | pdf | url | transcript | multi-source
  primary:
    url: "https://github.com/org/repo.git"
    license: "MIT"
  secondary: []

artifacts:
  - type: "executive-summary"
    staging_path: "analyze-new-artifacts/{topic}/outputs/executive-summary.md"
    promoted_to: "internal/research/executive-summaries/YYYY-MM-DD-{topic}-executive-summary.md"
    status: "promoted"                # promoted | staging | skipped
  - type: "sdd"
    staging_path: "analyze-new-artifacts/{topic}/outputs/sdd.md"
    promoted_to: "internal/architecture/sdd/{topic}-sdd.md"
    status: "promoted"
  # ... additional artifacts

tags: []
notes: ""

Validation: JSON Schema at config/schemas/research-manifest-v1.schema.json
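The enumerated values in the schema comments lend themselves to a simple check. The sketch below validates an already-parsed manifest (a plain `dict`) using only the standard library; it does not reproduce the JSON Schema file, whose contents are not shown in this ADR. The function name `validate_manifest` and the rule that a promoted artifact must carry `promoted_to` are assumptions.

```python
CATEGORIES = {
    "technology-evaluation", "academic", "competitive-intelligence",
    "business-market", "domain", "process-internal",
}
RECOMMENDATIONS = {"ADOPT", "DEFER", "REJECT", "MONITOR"}
CONFIDENCES = {"HIGH", "MEDIUM", "LOW"}
INPUT_TYPES = {"git-repository", "pdf", "url", "transcript", "multi-source"}
ARTIFACT_STATUSES = {"promoted", "staging", "skipped"}


def validate_manifest(manifest: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    errors: list[str] = []
    metadata = manifest.get("metadata", {})
    checks = (
        ("category", CATEGORIES),
        ("recommendation", RECOMMENDATIONS),
        ("confidence", CONFIDENCES),
    )
    for field, allowed in checks:
        if metadata.get(field) not in allowed:
            errors.append(f"metadata.{field}: unexpected value {metadata.get(field)!r}")

    source_type = manifest.get("input_sources", {}).get("type")
    if source_type not in INPUT_TYPES:
        errors.append(f"input_sources.type: unexpected value {source_type!r}")

    for index, artifact in enumerate(manifest.get("artifacts", [])):
        status = artifact.get("status")
        if status not in ARTIFACT_STATUSES:
            errors.append(f"artifacts[{index}].status: unexpected value {status!r}")
        # Assumption: a promoted artifact must record where it went.
        if status == "promoted" and not artifact.get("promoted_to"):
            errors.append(f"artifacts[{index}]: promoted without promoted_to")
    return errors
```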

6. Naming Conventions

Element	Convention	Example
Directories	`kebab-case`, no dates	`codestoryai-sidecar/`
SDDs/TDDs	`{topic}-sdd.md` / `{topic}-tdd.md`	`codestoryai-sidecar-sdd.md`
Executive Summaries	`YYYY-MM-DD-{topic}-executive-summary.md`	`2026-02-09-codestoryai-sidecar-executive-summary.md`
Assessments	`{topic}-assessment-YYYY-MM-DD.md`	`agent-labs-assessment-2026-02-16.md`
Manifests	`YYYY-MM-DD-{topic}.yaml`	`2026-02-09-codestoryai-sidecar.yaml`
ADRs	`ADR-{NNN}-{title}.md` (flat numbered)	`ADR-165-browser-agent-architecture.md`
C4 Diagrams	`{level}.mmd` in topic dir	`context.mmd`, `container.mmd`
Dashboards	`{PascalCase}Dashboard.tsx`	`ArchitectureDashboard.tsx`
Glossaries	`{domain}-glossary.md`	`browser-automation-glossary.md`

Rationale: Directory names carry no dates because topics receive updates over time. Time-sensitive files (executive summaries and assessments) carry dates because they capture point-in-time decisions. Manifests carry dates because each research effort is a discrete event. SDDs and TDDs carry no dates because they are living documents tracked via git history.
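The file-name patterns in the table can be generated rather than typed by hand. This is a small sketch; the helper names are hypothetical, and the kebab-case regular expression (lowercase letters and digits separated by single hyphens) is one strict reading of the convention, not a rule stated in this ADR.

```python
import re
from datetime import date

# Assumption: kebab-case means lowercase alphanumerics joined by single hyphens.
KEBAB_CASE = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def require_kebab(topic: str) -> str:
    """Return the topic unchanged, or raise ValueError if it is not kebab-case."""
    if not KEBAB_CASE.fullmatch(topic):
        raise ValueError(f"not kebab-case: {topic!r}")
    return topic


def executive_summary_name(topic: str, conducted: date) -> str:
    return f"{conducted.isoformat()}-{require_kebab(topic)}-executive-summary.md"


def assessment_name(topic: str, written: date) -> str:
    return f"{require_kebab(topic)}-assessment-{written.isoformat()}.md"


def manifest_name(topic: str, conducted: date) -> str:
    return f"{conducted.isoformat()}-{require_kebab(topic)}.yaml"


def sdd_name(topic: str) -> str:
    return f"{require_kebab(topic)}-sdd.md"
```

Called with the topics and dates from the Example column, these helpers reproduce the example file names exactly.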

7. ADR-206 Pipeline Enhancement

ADR-206 Phase 5 (PACKAGE) is extended with a promotion step:

Phase 5: PACKAGE (existing)
    ↓ generates MANIFEST.yaml in staging
Phase 5b: PROMOTE (new)
    ↓ artifact-promoter agent executes selective promotion
    ↓ dry-run mode shows plan, user approves
Phase 5c: REGISTRY (new)
    ↓ updates internal/research/manifests/README.md registry

New agent: artifact-promoter — reads MANIFEST.yaml, applies promotion rules, executes git mv for history preservation, updates manifest status fields, commits.
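The dry-run step can be pictured as a pure function from a parsed manifest to a list of planned moves, with nothing touched on disk. The sketch below assumes that an artifact whose `status` is `staging` and whose `promoted_to` is already filled in represents a pending move; that reading, and the names `PlannedMove`, `plan_promotion`, and `render_plan`, are assumptions rather than the agent's documented behavior.

```python
from typing import NamedTuple


class PlannedMove(NamedTuple):
    artifact_type: str
    source: str
    destination: str


def plan_promotion(manifest: dict) -> list[PlannedMove]:
    """List moves for artifacts still in staging that have a destination."""
    plan: list[PlannedMove] = []
    for artifact in manifest.get("artifacts", []):
        if artifact.get("status") != "staging":
            continue  # already promoted, or deliberately skipped
        destination = artifact.get("promoted_to")
        if not destination:
            continue  # no permanent home recorded for this artifact
        plan.append(
            PlannedMove(artifact["type"], artifact["staging_path"], destination)
        )
    return plan


def render_plan(plan: list[PlannedMove]) -> str:
    """Format the plan for user review before anything is executed."""
    if not plan:
        return "nothing to promote"
    return "\n".join(f"{m.artifact_type}: {m.source} -> {m.destination}" for m in plan)
```

Keeping planning separate from execution means the same plan object can be shown to the user for approval and then handed to whatever performs the moves.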

New CLI flag: --auto-promote on ADR-206 pipeline triggers Phase 5b automatically after Phase 5.
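For Phase 5c, the registry README could be regenerated from the parsed manifests on every run. The ADR does not specify the registry layout, so the Markdown table, its columns, and the function name `render_registry` below are purely illustrative.

```python
def render_registry(manifests: list[dict]) -> str:
    """Render a Markdown table of research efforts, newest research_id first."""
    lines = [
        "| Research ID | Category | Recommendation | Confidence |",
        "| --- | --- | --- | --- |",
    ]
    # research_id starts with YYYY-MM-DD, so a reverse string sort is newest-first.
    ordered = sorted(
        manifests, key=lambda m: m["metadata"]["research_id"], reverse=True
    )
    for manifest in ordered:
        md = manifest["metadata"]
        lines.append(
            f"| {md['research_id']} | {md['category']} "
            f"| {md['recommendation']} | {md['confidence']} |"
        )
    return "\n".join(lines) + "\n"
```

Regenerating the whole file from the manifests, rather than appending to it, keeps the registry from drifting away from its source data.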

8. Migration Plan for 98 Existing Directories

Phase	Action	Effort	Timeline
1. Categorize	Auto-classify 98 dirs into 6 categories	Agent-driven, 1 session	Week 1
2. Manifest	Generate MANIFEST.yaml for each research effort	Agent-driven, 1 session	Week 1
3. Promote Recent	Promote high-value artifacts from last 30 days (~15 dirs)	Agent + review, 1 session	Week 2
4. Archive Legacy	Generate manifests for older research, leave in staging	Agent-driven, 1 session	Week 3
5. Registry	Generate master manifest registry	Script, automated	Week 3

Total effort: ~4 sessions (8-12 hours), agent-assisted.

Scope limitation: Only recent research (30 days) gets full promotion. Older research gets manifests but artifacts remain in staging (gitignored, backed up to GCS). This avoids bloating the git repo with 500+ legacy files.
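The 30-day scope rule amounts to partitioning staging directories by the date of their research. The sketch below assumes each directory's research date is already known and treats the cutoff as inclusive; `partition_by_recency` and the `old-topic` directory used in the usage note are hypothetical.

```python
from datetime import date, timedelta


def partition_by_recency(
    efforts: dict[str, date], today: date, window_days: int = 30
) -> tuple[list[str], list[str]]:
    """Split topics into (full promotion, manifest only) by research date."""
    cutoff = today - timedelta(days=window_days)
    # Assumption: research dated exactly on the cutoff still counts as recent.
    promote = sorted(t for t, conducted in efforts.items() if conducted >= cutoff)
    manifest_only = sorted(t for t, conducted in efforts.items() if conducted < cutoff)
    return promote, manifest_only
```

For example, with `today` set to 2026-02-16, efforts dated 2026-02-09 and 2026-02-16 fall in the promotion group, while a hypothetical `old-topic` dated 2025-11-01 receives a manifest only.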

Consequences

Positive

100% searchability: all research trackable via manifest registry
0% artifact loss: promotion rules ensure high-value outputs reach permanent homes
87% staging reduction: from 98 dirs to ~13 active at any time
Automated pipeline: ADR-206 Phase 5b handles promotion without manual file moves
Consistent taxonomy: 6 categories enable filtering, search, and reporting
Git history preserved: git mv maintains file history for moves between git-tracked stages; artifacts leaving gitignored staging are untracked, so they enter git as new files

Negative

New directories: 9 new directories under internal/ (sdd/, tdd/, c4-models/, manifests/, executive-summaries/, glossaries/, quick-start-guides/, archive/, dashboards/)
New agent: artifact-promoter agent to build and maintain
Migration effort: 4 sessions to backfill manifests for 98 existing dirs
Schema maintenance: YAML manifest schema needs versioning

Risks

Risk	Mitigation
Manifest drift (manifests become stale)	Auto-generate from pipeline; validation on commit
Over-promotion (too many artifacts promoted)	Strict ALWAYS/CONDITIONALLY/NEVER rules; dry-run mode
Category ambiguity	Decision tree in QUICK-REFERENCE.md; default to `technology-evaluation`
Repo size bloat	Only promote text artifacts; dashboards conditional; raw inputs never

Alternatives Considered

Alternative	Pros	Cons	Verdict
Status quo (ad-hoc promotion)	No effort	96% artifact loss, no discoverability	Rejected
Git-track everything	Simple	+6.1GB repo size, noise drowns signal	Rejected
External KB (Notion/Confluence)	Rich search	Context switch, no git integration, vendor lock-in	Rejected
Database catalog (SQLite)	Fast queries	Not human-readable, not grep-friendly	Rejected
Markdown + YAML manifests	Human-readable, grep-friendly, git-native	Requires naming discipline	Selected

Implementation

Phase 1: Structure Creation (Complete)

Create new directories under internal/ (9 directories)
Create README.md index files in each new directory (10 files)
Define JSON schema at config/schemas/research-manifest-v1.schema.json
Update internal/CLAUDE.md and internal/architecture/CLAUDE.md with new structure

Phase 2: Agent & Pipeline (Complete)

Create artifact-promoter agent (agents/artifact-promoter.md)
Update ADR-206 with Phase 5b/5c
Create /research-promote command (commands/research-promote.md)

Phase 3: Migration (Subsequent Session)

Categorize 98 existing staging dirs
Generate manifests for all research efforts
Promote recent high-value artifacts
Generate master registry

Related

ADR-206: Research Pipeline (creates artifacts this ADR promotes)
Analysis Preservation Protocol: CLAUDE.md directive this ADR extends
Full Taxonomy Proposal: internal/analysis/research-artifact-organization/research-artifact-organization-taxonomy-proposal-2026-02-16.md
Quick Reference: internal/analysis/research-artifact-organization/QUICK-REFERENCE.md

Author: Claude (Opus 4.6)
MoE Workflow: Analyze (documentation-librarian) → Advise (project-organizer) → Judge (ADR synthesis)
