
Project Development Methodology

This skill covers principles for identifying tasks suited to LLM processing, designing effective project architectures, and iterating rapidly using agent-assisted development. The methodology applies whether building a batch processing pipeline, a multi-agent research system, or an interactive agent application.

When to Use

Use this skill when:

  • Starting a new project that might benefit from LLM processing
  • Evaluating whether a task is well-suited for agents vs traditional code
  • Designing architecture for an LLM-powered application
  • Planning a batch processing pipeline with structured outputs
  • Choosing between single-agent and multi-agent approaches
  • Estimating costs and timelines for LLM-heavy projects

Don't use this skill when:

  • Working on traditional non-LLM software projects
  • Building simple single-query applications
  • A deterministic algorithm already solves the task

Task-Model Fit Recognition

Not every problem benefits from LLM processing. Evaluate task-model fit before writing any code.

LLM-Suited Tasks

| Characteristic | Why It Fits |
| --- | --- |
| Synthesis across sources | LLMs excel at combining information |
| Subjective judgment with rubrics | Grading, evaluation, classification |
| Natural language output | Human-readable text goals |
| Error tolerance | Individual failures don't break the system |
| Batch processing | No conversational state between items |
| Domain knowledge in training | The model has relevant context |

LLM-Unsuited Tasks

| Characteristic | Why It Fails |
| --- | --- |
| Precise computation | Math and counting are unreliable |
| Real-time requirements | LLM latency is too high |
| Perfect accuracy requirements | Hallucination risk |
| Proprietary data dependence | The model lacks context |
| Sequential dependencies | Each step depends on the previous one |
| Deterministic output requirements | The same input must yield identical output |

The Manual Prototype Step

Before investing in automation, validate with a manual test:

  1. Copy one representative input into the model interface
  2. Evaluate the output quality

This takes minutes and prevents hours of wasted development.

What Manual Prototyping Answers

  • Does the model have required knowledge?
  • Can it produce output in needed format?
  • What quality level to expect at scale?
  • Are there obvious failure modes?

If the manual prototype fails, the automated system will fail.

Pipeline Architecture

LLM projects benefit from staged pipeline architectures where each stage is:

| Property | Description |
| --- | --- |
| Discrete | Clear boundaries between stages |
| Idempotent | Re-running produces the same result |
| Cacheable | Intermediate results persist to disk |
| Independent | Each stage can run separately |

The Canonical Pipeline

acquire → prepare → process → parse → render
| Stage | Function | Deterministic | Cost |
| --- | --- | --- | --- |
| Acquire | Fetch raw data | Yes | Low |
| Prepare | Transform to prompts | Yes | Low |
| Process | Execute LLM calls | No | High |
| Parse | Extract structured data | Yes | Low |
| Render | Generate final outputs | Yes | Low |

Key insight: Only the process stage is non-deterministic and expensive. Isolate it so you re-run it only when necessary.
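The staged structure can be sketched as a minimal driver in which each stage persists its output to disk and file existence gates execution, so re-runs skip completed stages. The stage functions and file names below are illustrative placeholders (the process stub stands in for a real LLM call), not part of the source:

```python
import json
from pathlib import Path

# Placeholder stage functions; a real pipeline would fetch data, build
# prompts, call the model, and parse its response here.
def acquire(item_id):
    return {"id": item_id, "text": "raw input"}

def prepare(data):
    return f"Summarize: {data['text']}"

def process(prompt):
    # The only non-deterministic, expensive stage in a real pipeline.
    return f"(model output for: {prompt})"

def parse(response):
    return {"summary": response}

def render(parsed):
    return "# Report\n" + parsed["summary"]

def run_pipeline(item_id: str, base: str = "data") -> str:
    """Run all five stages; cached stage outputs are read from disk
    instead of being recomputed."""
    item_dir = Path(base) / item_id
    item_dir.mkdir(parents=True, exist_ok=True)
    stages = [
        ("raw.json",    lambda _: json.dumps(acquire(item_id))),
        ("prompt.md",   lambda prev: prepare(json.loads(prev))),
        ("response.md", process),
        ("parsed.json", lambda prev: json.dumps(parse(prev))),
        ("rendered.md", lambda prev: render(json.loads(prev))),
    ]
    prev = None
    for filename, stage_fn in stages:
        path = item_dir / filename
        if path.exists():
            prev = path.read_text()   # cached: skip recomputation
        else:
            prev = stage_fn(prev)
            path.write_text(prev)
    return prev
```

Because only missing files trigger work, deleting a single item's `response.md` re-runs just the expensive stage for that item.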

File System as State Machine

Use the file system to track pipeline state instead of databases or in-memory structures.

data/{id}/
├── raw.json      # acquire stage complete
├── prompt.md     # prepare stage complete
├── response.md   # process stage complete
└── parsed.json   # parse stage complete

Benefits

| Benefit | Description |
| --- | --- |
| Natural idempotency | File existence gates execution |
| Easy debugging | All state is human-readable |
| Simple parallelization | Each directory is independent |
| Trivial caching | Files persist across runs |

State Check Pattern

import os

def needs_processing(item_id: str, stage: str) -> bool:
    output_file = f"data/{item_id}/{stage}_output.json"
    return not os.path.exists(output_file)
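Because each item's directory is independent, the same existence check makes parallelization straightforward. A sketch, with a hypothetical `process_item` worker standing in for the real LLM call:

```python
import os
from concurrent.futures import ThreadPoolExecutor

def needs_processing(item_id: str, stage: str) -> bool:
    return not os.path.exists(f"data/{item_id}/{stage}_output.json")

def process_item(item_id: str) -> None:
    # Placeholder worker: a real version would build the prompt, call the
    # model, and write the real output file only on success.
    os.makedirs(f"data/{item_id}", exist_ok=True)
    with open(f"data/{item_id}/process_output.json", "w") as f:
        f.write("{}")

item_ids = ["a1", "a2", "a3"]
pending = [i for i in item_ids if needs_processing(i, "process")]
with ThreadPoolExecutor(max_workers=4) as pool:
    list(pool.map(process_item, pending))
```

Interrupted runs resume automatically: completed items are filtered out of `pending` on the next invocation.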

Structured Output Design

When LLM outputs must be parsed programmatically, prompt design determines parsing reliability.

Effective Structure Specification

Analyze the following and provide response in exactly this format:

## Summary
[Your summary here]

## Score
Rating: [1-10]

## Details
- Key point 1
- Key point 2

Follow this format exactly because I will be parsing it programmatically.

Parsing Best Practices

  • Use regex patterns flexible enough for minor variations
  • Provide sensible defaults when sections missing
  • Log parsing failures for review rather than crashing
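A parser following these practices for the Summary/Score/Details format above might look like this sketch (the section names match the example prompt; the defaults are illustrative):

```python
import logging
import re

DEFAULTS = {"summary": "", "rating": None, "details": []}

def parse_response(text: str) -> dict:
    """Extract the Summary / Score / Details sections, falling back to
    defaults (and logging) instead of crashing on missing sections."""
    result = dict(DEFAULTS)

    # Capture each ## section lazily up to the next heading or end of text.
    m = re.search(r"##\s*Summary\s*\n(.*?)(?=\n##|\Z)", text, re.S)
    if m:
        result["summary"] = m.group(1).strip()

    m = re.search(r"Rating:\s*(\d+)", text)
    if m:
        result["rating"] = int(m.group(1))
    else:
        logging.warning("no rating found; keeping default")

    m = re.search(r"##\s*Details\s*\n(.*?)(?=\n##|\Z)", text, re.S)
    if m:
        # Accept either - or * bullets to tolerate minor model variation.
        result["details"] = re.findall(r"^\s*[-*]\s+(.+)$", m.group(1), re.M)

    return result
```

A completely unstructured response still yields a well-formed dict with defaults, so one bad output never crashes a batch run.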

Cost and Scale Estimation

LLM processing has predictable costs. Estimate before starting.

Formula

Total cost = (items × tokens_per_item × price_per_token) + API overhead

Estimation Process

  1. Estimate input tokens per item (prompt + context)
  2. Estimate output tokens per item (typical response length)
  3. Multiply by item count
  4. Add 20-30% buffer for retries and failures
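The steps above reduce to a few lines of arithmetic. The prices below are assumed example rates for illustration only; substitute your provider's actual per-million-token pricing:

```python
# Assumed example prices in USD per million tokens -- not real rates.
PRICE_PER_MTOK_INPUT = 3.00
PRICE_PER_MTOK_OUTPUT = 15.00

def estimate_cost(items: int, input_tokens: int, output_tokens: int,
                  buffer: float = 0.25) -> float:
    """Total cost = items x per-item token cost, plus a retry/failure buffer."""
    per_item = (input_tokens * PRICE_PER_MTOK_INPUT
                + output_tokens * PRICE_PER_MTOK_OUTPUT) / 1_000_000
    return items * per_item * (1 + buffer)

# 10,000 items at ~2,000 input and ~500 output tokens each:
# base cost $135.00, plus a 25% buffer -> $168.75
print(f"${estimate_cost(10_000, 2_000, 500):.2f}")
```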

Cost Reduction Strategies

| Strategy | Impact |
| --- | --- |
| Reduce context length | Direct token savings |
| Use smaller models for simple items | Lower per-token cost |
| Cache and reuse partial results | Avoids redundant calls |
| Parallel processing | Saves wall-clock time (not token cost) |

Single vs Multi-Agent Architecture

Single-Agent Pipelines Work For

  • Batch processing with independent items
  • Tasks where items don't interact
  • Simpler cost and complexity management

Multi-Agent Architectures Work For

  • Parallel exploration of different aspects
  • Tasks exceeding single context window
  • When specialized sub-agents improve quality

The primary reason to go multi-agent is context isolation, not anthropomorphized roles.

Architectural Reduction

Start with minimal architecture. Add complexity only when proven necessary.

Evidence: Vercel d0

  • Before: 17 specialized tools, 80% success, 274s execution
  • After: 2 primitives (bash + SQL), 100% success, 77s execution

Build for Change

  • Keep architecture simple and unopinionated
  • Test across model strengths
  • Design systems that benefit from model improvements

Project Planning Template

1. Task Analysis

  • What is input/output?
  • Is this synthesis, generation, classification, or analysis?
  • What error rate is acceptable?
  • What is value per successful completion?

2. Manual Validation

  • Test one example with target model
  • Evaluate output quality and format
  • Identify failure modes
  • Estimate tokens per item

3. Architecture Selection

  • Single pipeline vs multi-agent
  • Required tools and data sources
  • Storage and caching strategy
  • Parallelization approach

4. Cost Estimation

  • Items × tokens × price
  • Development time
  • Infrastructure requirements
  • Ongoing operational costs

5. Development Plan

  • Stage-by-stage implementation
  • Testing strategy per stage
  • Iteration milestones
  • Deployment approach


Guidelines

  1. Validate task-model fit with manual prototyping before automation
  2. Structure pipelines as discrete, idempotent, cacheable stages
  3. Use the file system for state management and debugging
  4. Design prompts for structured, parseable outputs with format examples
  5. Start with minimal architecture; add complexity only when proven necessary
  6. Estimate costs early and track throughout development
  7. Build robust parsers that handle LLM output variations
  8. Expect and plan for multiple architectural iterations
  9. Test whether scaffolding helps or constrains model performance
  10. Use agent-assisted development for rapid iteration

Success Output

When successful, this skill MUST output:

✅ SKILL COMPLETE: project-development

Completed:
- [x] Task-model fit validated via manual prototype
- [x] Pipeline architecture designed (5 stages)
- [x] File system state management implemented
- [x] Cost estimation completed with buffer
- [x] Architecture selection documented

Outputs:
- Task-model fit assessment report
- 5-stage pipeline: acquire → prepare → process → parse → render
- File system structure: data/{id}/{stage}_output.json
- Cost estimate: items × tokens × price + 20-30% buffer
- Architecture decision: single-agent vs multi-agent with rationale

Completion Checklist

Before marking this skill as complete, verify:

  • Manual prototype tested with representative input
  • Task-model fit criteria evaluated (synthesis, judgment, NL output, error tolerance)
  • Pipeline stages discrete, idempotent, cacheable, independent
  • File system state management implemented (existence gates execution)
  • Structured output format specified in prompts
  • Cost estimation includes items, tokens, pricing, buffer
  • Single vs multi-agent decision documented
  • Architecture follows reduction principle (minimal complexity)

Failure Indicators

This skill has FAILED if:

  • ❌ Skipped manual prototype validation
  • ❌ Task unsuited for LLM (deterministic algorithm better)
  • ❌ Monolithic pipeline (no discrete stages)
  • ❌ In-memory state only (no file system persistence)
  • ❌ No structured output specification
  • ❌ Cost estimation missing or unrealistic
  • ❌ Over-engineered architecture before validation
  • ❌ Ignored task-model fit criteria

When NOT to Use

Do NOT use this skill when:

  • Traditional non-LLM software projects (deterministic algorithms)
  • Precise computation required (math, counting)
  • Real-time requirements (LLM latency too high)
  • Perfect accuracy needed (hallucination risk)
  • Proprietary data only (model lacks context)
  • Sequential dependencies (each step depends on previous)
  • Deterministic output required (same input = identical output)

Use alternative approaches when:

  • Need precision → Traditional algorithms
  • Need real-time → Cached LLM or deterministic logic
  • Need perfect accuracy → Rule-based systems
  • Need proprietary context → RAG with embeddings
  • Need sequential → State machine or workflow engine

Anti-Patterns (Avoid)

| Anti-Pattern | Problem | Solution |
| --- | --- | --- |
| Skipping manual validation | Wasted development time | Always prototype with one example first |
| Monolithic pipelines | Hard to debug and iterate | Separate into five discrete stages |
| Over-constraining the model | Scaffolding becomes a liability | Test whether scaffolding helps or hurts |
| Ignoring costs until production | Unexpected bills at scale | Estimate items × tokens × price early |
| Perfect parsing requirements | LLMs aren't deterministic | Build robust parsers with defaults |
| Premature optimization | Complexity before correctness | Get a basic pipeline working first |
| Database for state | Complexity without benefit | Use the file system for state tracking |
| Assuming model knowledge | Hallucinations go undetected | Validate proprietary data requirements |

Principles

This skill embodies these CODITECT principles:

#1 Recycle → Extend → Re-Use → Create

  • Reuse canonical 5-stage pipeline pattern
  • Extend with domain-specific stages
  • Create only when pattern doesn't fit

#2 First Principles

  • Manual prototype validates core assumption
  • Task-model fit criteria based on LLM strengths
  • File system state = natural idempotency

#3 Keep It Simple

  • Start with minimal architecture
  • Add complexity only when proven necessary
  • Single-agent until multi-agent needed

#9 Automation First

  • Staged pipeline enables automation
  • File system gates prevent re-execution
  • Idempotent stages allow retries

Cost Awareness

  • Early estimation prevents budget surprises
  • Token budget tracked per stage
  • 20-30% buffer for retries/failures

Skills

  • context-fundamentals - Context constraints for prompt design
  • tool-design - Designing tools for pipelines
  • multi-agent-patterns - When to use multi-agent
  • advanced-evaluation - Evaluating pipeline outputs

Agents

  • multi-agent-coordinator - Coordinate multi-agent pipelines

Commands

  • /batch-pipeline - Manage staged batch processing

Scripts

  • external/Agent-Skills-for-Context-Engineering/skills/project-development/scripts/pipeline_template.py - 5-stage pipeline implementation