
Project Development Methodology

This skill covers principles for identifying tasks suited to LLM processing, designing effective project architectures, and iterating rapidly using agent-assisted development. The methodology applies whether building a batch processing pipeline, a multi-agent research system, or an interactive agent application.

When to Use

Use this skill when:

  • Starting a new project that might benefit from LLM processing
  • Evaluating whether a task is well-suited for agents vs traditional code
  • Designing architecture for an LLM-powered application
  • Planning a batch processing pipeline with structured outputs
  • Choosing between single-agent and multi-agent approaches
  • Estimating costs and timelines for LLM-heavy projects

Don't use this skill when:

  • Working on traditional non-LLM software projects
  • Building simple single-query applications
  • A deterministic algorithm already solves the task

Task-Model Fit Recognition

Not every problem benefits from LLM processing. Evaluate task-model fit before writing any code.

LLM-Suited Tasks

| Characteristic | Why It Fits |
| --- | --- |
| Synthesis across sources | LLMs excel at combining information |
| Subjective judgment with rubrics | Grading, evaluation, classification |
| Natural language output | Human-readable text goals |
| Error tolerance | Individual failures don't break the system |
| Batch processing | No conversational state between items |
| Domain knowledge in training | The model has relevant context |

LLM-Unsuited Tasks

| Characteristic | Why It Fails |
| --- | --- |
| Precise computation | Math and counting are unreliable |
| Real-time requirements | LLM latency is too high |
| Perfect accuracy requirements | Hallucination risk |
| Proprietary data dependence | The model lacks context |
| Sequential dependencies | Each step depends on the previous one |
| Deterministic output requirements | The same input must yield identical output |

The Manual Prototype Step

Before investing in automation, validate with a manual test:

  1. Copy one representative input into the model interface
  2. Evaluate the output quality

This takes minutes and prevents hours of wasted development.

What Manual Prototyping Answers

  • Does the model have required knowledge?
  • Can it produce output in needed format?
  • What quality level to expect at scale?
  • Are there obvious failure modes?

If the manual prototype fails, the automated system will fail.

Pipeline Architecture

LLM projects benefit from staged pipeline architectures where each stage is:

| Property | Description |
| --- | --- |
| Discrete | Clear boundaries between stages |
| Idempotent | Re-running produces the same result |
| Cacheable | Intermediate results persist to disk |
| Independent | Each stage can run separately |

The Canonical Pipeline

acquire → prepare → process → parse → render
| Stage | Function | Deterministic | Cost |
| --- | --- | --- | --- |
| Acquire | Fetch raw data | Yes | Low |
| Prepare | Transform to prompts | Yes | Low |
| Process | Execute LLM calls | No | High |
| Parse | Extract structured data | Yes | Low |
| Render | Generate final outputs | Yes | Low |

Key insight: Only the process stage is non-deterministic and expensive. Isolate it so you re-run it only when necessary.
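The staged structure can be sketched as a minimal driver in which each stage persists its output to disk and file existence gates execution, so re-runs skip completed stages. The stage functions and file names below are illustrative placeholders (the process stub stands in for a real LLM call), not part of the source:

```python
import json
from pathlib import Path

# Placeholder stage functions; a real pipeline would fetch data, build
# prompts, call the model, and parse its response here.
def acquire(item_id):
    return {"id": item_id, "text": "raw input"}

def prepare(data):
    return f"Summarize: {data['text']}"

def process(prompt):
    # The only non-deterministic, expensive stage in a real pipeline.
    return f"(model output for: {prompt})"

def parse(response):
    return {"summary": response}

def render(parsed):
    return "# Report\n" + parsed["summary"]

def run_pipeline(item_id: str, base: str = "data") -> str:
    """Run all five stages; cached stage outputs are read from disk
    instead of being recomputed."""
    item_dir = Path(base) / item_id
    item_dir.mkdir(parents=True, exist_ok=True)
    stages = [
        ("raw.json",    lambda _: json.dumps(acquire(item_id))),
        ("prompt.md",   lambda prev: prepare(json.loads(prev))),
        ("response.md", process),
        ("parsed.json", lambda prev: json.dumps(parse(prev))),
        ("rendered.md", lambda prev: render(json.loads(prev))),
    ]
    prev = None
    for filename, stage_fn in stages:
        path = item_dir / filename
        if path.exists():
            prev = path.read_text()   # cached: skip recomputation
        else:
            prev = stage_fn(prev)
            path.write_text(prev)
    return prev
```

Because only missing files trigger work, deleting a single item's `response.md` re-runs just the expensive stage for that item.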

File System as State Machine

Use the file system to track pipeline state instead of databases or in-memory structures.

data/{id}/
├── raw.json      # acquire stage complete
├── prompt.md     # prepare stage complete
├── response.md   # process stage complete
└── parsed.json   # parse stage complete

Benefits

| Benefit | Description |
| --- | --- |
| Natural idempotency | File existence gates execution |
| Easy debugging | All state is human-readable |
| Simple parallelization | Each directory is independent |
| Trivial caching | Files persist across runs |

State Check Pattern

import os

def needs_processing(item_id: str, stage: str) -> bool:
    output_file = f"data/{item_id}/{stage}_output.json"
    return not os.path.exists(output_file)
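Because each item's directory is independent, the same existence check makes parallelization straightforward. A sketch, with a hypothetical `process_item` worker standing in for the real LLM call:

```python
import os
from concurrent.futures import ThreadPoolExecutor

def needs_processing(item_id: str, stage: str) -> bool:
    return not os.path.exists(f"data/{item_id}/{stage}_output.json")

def process_item(item_id: str) -> None:
    # Placeholder worker: a real version would build the prompt, call the
    # model, and write the real output file only on success.
    os.makedirs(f"data/{item_id}", exist_ok=True)
    with open(f"data/{item_id}/process_output.json", "w") as f:
        f.write("{}")

item_ids = ["a1", "a2", "a3"]
pending = [i for i in item_ids if needs_processing(i, "process")]
with ThreadPoolExecutor(max_workers=4) as pool:
    list(pool.map(process_item, pending))
```

Interrupted runs resume automatically: completed items are filtered out of `pending` on the next invocation.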

Structured Output Design

When LLM outputs must be parsed programmatically, prompt design determines parsing reliability.

Effective Structure Specification

Analyze the following and provide response in exactly this format:

## Summary
[Your summary here]

## Score
Rating: [1-10]

## Details
- Key point 1
- Key point 2

Follow this format exactly because I will be parsing it programmatically.

Parsing Best Practices

  • Use regex patterns flexible enough for minor variations
  • Provide sensible defaults when sections missing
  • Log parsing failures for review rather than crashing
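A parser following these practices for the Summary/Score/Details format above might look like this sketch (the section names match the example prompt; the defaults are illustrative):

```python
import logging
import re

DEFAULTS = {"summary": "", "rating": None, "details": []}

def parse_response(text: str) -> dict:
    """Extract the Summary / Score / Details sections, falling back to
    defaults (and logging) instead of crashing on missing sections."""
    result = dict(DEFAULTS)

    # Capture each ## section lazily up to the next heading or end of text.
    m = re.search(r"##\s*Summary\s*\n(.*?)(?=\n##|\Z)", text, re.S)
    if m:
        result["summary"] = m.group(1).strip()

    m = re.search(r"Rating:\s*(\d+)", text)
    if m:
        result["rating"] = int(m.group(1))
    else:
        logging.warning("no rating found; keeping default")

    m = re.search(r"##\s*Details\s*\n(.*?)(?=\n##|\Z)", text, re.S)
    if m:
        # Accept either - or * bullets to tolerate minor model variation.
        result["details"] = re.findall(r"^\s*[-*]\s+(.+)$", m.group(1), re.M)

    return result
```

A completely unstructured response still yields a well-formed dict with defaults, so one bad output never crashes a batch run.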

Cost and Scale Estimation

LLM processing has predictable costs. Estimate before starting.

Formula

Total cost = (items × tokens_per_item × price_per_token) + API overhead

Estimation Process

  1. Estimate input tokens per item (prompt + context)
  2. Estimate output tokens per item (typical response length)
  3. Multiply by item count
  4. Add 20-30% buffer for retries and failures
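The steps above reduce to a few lines of arithmetic. The prices below are assumed example rates for illustration only; substitute your provider's actual per-million-token pricing:

```python
# Assumed example prices in USD per million tokens -- not real rates.
PRICE_PER_MTOK_INPUT = 3.00
PRICE_PER_MTOK_OUTPUT = 15.00

def estimate_cost(items: int, input_tokens: int, output_tokens: int,
                  buffer: float = 0.25) -> float:
    """Total cost = items x per-item token cost, plus a retry/failure buffer."""
    per_item = (input_tokens * PRICE_PER_MTOK_INPUT
                + output_tokens * PRICE_PER_MTOK_OUTPUT) / 1_000_000
    return items * per_item * (1 + buffer)

# 10,000 items at ~2,000 input and ~500 output tokens each:
# base cost $135.00, plus a 25% buffer -> $168.75
print(f"${estimate_cost(10_000, 2_000, 500):.2f}")
```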

Cost Reduction Strategies

| Strategy | Impact |
| --- | --- |
| Reduce context length | Direct token savings |
| Use smaller models for simple items | Lower per-token cost |
| Cache and reuse partial results | Avoids redundant calls |
| Parallel processing | Saves wall-clock time (not token cost) |

Single vs Multi-Agent Architecture

Single-Agent Pipelines Work For

  • Batch processing with independent items
  • Tasks where items don't interact
  • Simpler cost and complexity management

Multi-Agent Architectures Work For

  • Parallel exploration of different aspects
  • Tasks exceeding single context window
  • When specialized sub-agents improve quality

The primary reason to go multi-agent is context isolation, not anthropomorphized roles.

Architectural Reduction

Start with minimal architecture. Add complexity only when proven necessary.

Evidence: Vercel d0

  • Before: 17 specialized tools, 80% success, 274s execution
  • After: 2 primitives (bash + SQL), 100% success, 77s execution

Build for Change

  • Keep architecture simple and unopinionated
  • Test across model strengths
  • Design systems that benefit from model improvements

Project Planning Template

1. Task Analysis

  • What is input/output?
  • Is this synthesis, generation, classification, or analysis?
  • What error rate is acceptable?
  • What is value per successful completion?

2. Manual Validation

  • Test one example with target model
  • Evaluate output quality and format
  • Identify failure modes
  • Estimate tokens per item

3. Architecture Selection

  • Single pipeline vs multi-agent
  • Required tools and data sources
  • Storage and caching strategy
  • Parallelization approach

4. Cost Estimation

  • Items × tokens × price
  • Development time
  • Infrastructure requirements
  • Ongoing operational costs

5. Development Plan

  • Stage-by-stage implementation
  • Testing strategy per stage
  • Iteration milestones
  • Deployment approach


Guidelines

  1. Validate task-model fit with manual prototyping before automation
  2. Structure pipelines as discrete, idempotent, cacheable stages
  3. Use the file system for state management and debugging
  4. Design prompts for structured, parseable outputs with format examples
  5. Start with minimal architecture; add complexity only when proven necessary
  6. Estimate costs early and track throughout development
  7. Build robust parsers that handle LLM output variations
  8. Expect and plan for multiple architectural iterations
  9. Test whether scaffolding helps or constrains model performance
  10. Use agent-assisted development for rapid iteration

Success Output

When successful, this skill MUST output:

✅ SKILL COMPLETE: project-development

Completed:
- [x] Task-model fit validated via manual prototype
- [x] Pipeline architecture designed (5 stages)
- [x] File system state management implemented
- [x] Cost estimation completed with buffer
- [x] Architecture selection documented

Outputs:
- Task-model fit assessment report
- 5-stage pipeline: acquire → prepare → process → parse → render
- File system structure: data/{id}/{stage}_output.json
- Cost estimate: items × tokens × price + 20-30% buffer
- Architecture decision: single-agent vs multi-agent with rationale

Completion Checklist

Before marking this skill as complete, verify:

  • Manual prototype tested with representative input
  • Task-model fit criteria evaluated (synthesis, judgment, NL output, error tolerance)
  • Pipeline stages discrete, idempotent, cacheable, independent
  • File system state management implemented (existence gates execution)
  • Structured output format specified in prompts
  • Cost estimation includes items, tokens, pricing, buffer
  • Single vs multi-agent decision documented
  • Architecture follows reduction principle (minimal complexity)

Failure Indicators

This skill has FAILED if:

  • ❌ Skipped manual prototype validation
  • ❌ Task unsuited for LLM (deterministic algorithm better)
  • ❌ Monolithic pipeline (no discrete stages)
  • ❌ In-memory state only (no file system persistence)
  • ❌ No structured output specification
  • ❌ Cost estimation missing or unrealistic
  • ❌ Over-engineered architecture before validation
  • ❌ Ignored task-model fit criteria

When NOT to Use

Do NOT use this skill when:

  • Traditional non-LLM software projects (deterministic algorithms)
  • Precise computation required (math, counting)
  • Real-time requirements (LLM latency too high)
  • Perfect accuracy needed (hallucination risk)
  • Proprietary data only (model lacks context)
  • Sequential dependencies (each step depends on previous)
  • Deterministic output required (same input = identical output)

Use alternative approaches when:

  • Need precision → Traditional algorithms
  • Need real-time → Cached LLM or deterministic logic
  • Need perfect accuracy → Rule-based systems
  • Need proprietary context → RAG with embeddings
  • Need sequential → State machine or workflow engine

Anti-Patterns (Avoid)

| Anti-Pattern | Problem | Solution |
| --- | --- | --- |
| Skipping manual validation | Wasted development time | Always prototype with one example first |
| Monolithic pipelines | Hard to debug and iterate | Separate into five discrete stages |
| Over-constraining the model | Scaffolding becomes a liability | Test whether scaffolding helps or hurts |
| Ignoring costs until production | Unexpected bills at scale | Estimate items × tokens × price early |
| Perfect parsing requirements | LLMs aren't deterministic | Build robust parsers with defaults |
| Premature optimization | Complexity before correctness | Get a basic pipeline working first |
| Database for state | Complexity without benefit | Use the file system for state tracking |
| Assuming model knowledge | Hallucinations go undetected | Validate proprietary data requirements |

Principles

This skill embodies these CODITECT principles:

#1 Recycle → Extend → Re-Use → Create

  • Reuse canonical 5-stage pipeline pattern
  • Extend with domain-specific stages
  • Create only when pattern doesn't fit

#2 First Principles

  • Manual prototype validates core assumption
  • Task-model fit criteria based on LLM strengths
  • File system state = natural idempotency

#3 Keep It Simple

  • Start with minimal architecture
  • Add complexity only when proven necessary
  • Single-agent until multi-agent needed

#9 Automation First

  • Staged pipeline enables automation
  • File system gates prevent re-execution
  • Idempotent stages allow retries

Cost Awareness

  • Early estimation prevents budget surprises
  • Token budget tracked per stage
  • 20-30% buffer for retries/failures

Skills

  • context-fundamentals - Context constraints for prompt design
  • tool-design - Designing tools for pipelines
  • multi-agent-patterns - When to use multi-agent
  • advanced-evaluation - Evaluating pipeline outputs

Agents

  • multi-agent-coordinator - Coordinate multi-agent pipelines

Commands

  • /batch-pipeline - Manage staged batch processing

Scripts

  • external/Agent-Skills-for-Context-Engineering/skills/project-development/scripts/pipeline_template.py - 5-stage pipeline implementation