Project Development Methodology
This skill covers principles for identifying tasks suited to LLM processing, designing effective project architectures, and iterating rapidly using agent-assisted development. The methodology applies whether building a batch processing pipeline, a multi-agent research system, or an interactive agent application.
When to Use
✅ Use this skill when:
- Starting a new project that might benefit from LLM processing
- Evaluating whether a task is well-suited for agents vs traditional code
- Designing architecture for an LLM-powered application
- Planning a batch processing pipeline with structured outputs
- Choosing between single-agent and multi-agent approaches
- Estimating costs and timelines for LLM-heavy projects
❌ Don't use this skill when:
- Working on traditional non-LLM software projects
- Building simple single-query applications
- A deterministic algorithm already solves the task
Task-Model Fit Recognition
Not every problem benefits from LLM processing. Evaluate task-model fit before writing any code.
LLM-Suited Tasks
| Characteristic | Why It Fits |
|---|---|
| Synthesis across sources | LLMs excel at combining information |
| Subjective judgment with rubrics | Grading, evaluation, classification |
| Natural language output | Human-readable text goals |
| Error tolerance | Individual failures don't break system |
| Batch processing | No conversational state between items |
| Domain knowledge in training | Model has relevant context |
LLM-Unsuited Tasks
| Characteristic | Why It Fails |
|---|---|
| Precise computation | Math, counting unreliable |
| Real-time requirements | LLM latency too high |
| Perfect accuracy requirements | Hallucination risk |
| Proprietary data dependence | Model lacks context |
| Sequential dependencies | Each step depends on previous |
| Deterministic output requirements | Same input must yield identical output |
The Manual Prototype Step
Before investing in automation, validate with a manual test:
- Copy one representative input into the model interface
- Evaluate output quality
- This takes minutes and prevents hours of wasted development
What Manual Prototyping Answers
- Does the model have required knowledge?
- Can it produce output in needed format?
- What quality level to expect at scale?
- Are there obvious failure modes?
If the manual prototype fails, the automated system will fail.
Pipeline Architecture
LLM projects benefit from staged pipeline architectures where each stage is:
| Property | Description |
|---|---|
| Discrete | Clear boundaries between stages |
| Idempotent | Re-running produces same result |
| Cacheable | Intermediate results persist to disk |
| Independent | Each stage can run separately |
The Canonical Pipeline
acquire → prepare → process → parse → render
| Stage | Function | Deterministic | Cost |
|---|---|---|---|
| Acquire | Fetch raw data | Yes | Low |
| Prepare | Transform to prompts | Yes | Low |
| Process | Execute LLM calls | No | High |
| Parse | Extract structured data | Yes | Low |
| Render | Generate final outputs | Yes | Low |
Key insight: Only stage 3 is non-deterministic and expensive. Separate it to re-run only when necessary.
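The staged structure above can be sketched as a small driver that runs each stage in order and skips any stage whose output file already exists. This is a minimal sketch under the file layout used elsewhere in this skill (`data/{id}/{stage}_output.json`); the stage function names are illustrative.

```python
import os

# Ordered stages of the canonical pipeline. Each stage function takes an
# item id and returns the text to persist as that stage's output.
STAGES = ["acquire", "prepare", "process", "parse", "render"]

def stage_output(item_id: str, stage: str) -> str:
    return f"data/{item_id}/{stage}_output.json"

def run_pipeline(item_id, stage_funcs):
    """Run each stage in order, skipping any whose output already exists.

    File existence gates execution, so re-running is idempotent and the
    expensive process stage is never repeated unnecessarily.
    """
    os.makedirs(f"data/{item_id}", exist_ok=True)
    for stage in STAGES:
        path = stage_output(item_id, stage)
        if os.path.exists(path):
            continue  # cached: stage already complete
        result = stage_funcs[stage](item_id)
        with open(path, "w") as f:
            f.write(result)
```

Because each stage persists to disk, deleting a single stage's output file forces only that stage (and none of the cheaper ones before it) to re-run.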
File System as State Machine
Use the file system to track pipeline state instead of databases or in-memory structures.
```
data/{id}/
├── raw.json       # acquire stage complete
├── prompt.md      # prepare stage complete
├── response.md    # process stage complete
└── parsed.json    # parse stage complete
```
Benefits
| Benefit | Description |
|---|---|
| Natural idempotency | File existence gates execution |
| Easy debugging | All state is human-readable |
| Simple parallelization | Each directory is independent |
| Trivial caching | Files persist across runs |
State Check Pattern
```python
import os

def needs_processing(item_id: str, stage: str) -> bool:
    """Return True if the stage's output file does not yet exist."""
    output_file = f"data/{item_id}/{stage}_output.json"
    return not os.path.exists(output_file)
```
Structured Output Design
When LLM outputs must be parsed programmatically, prompt design determines parsing reliability.
Effective Structure Specification
```
Analyze the following and provide response in exactly this format:

## Summary
[Your summary here]

## Score
Rating: [1-10]

## Details
- Key point 1
- Key point 2

Follow this format exactly because I will be parsing it programmatically.
```
Parsing Best Practices
- Use regex patterns flexible enough for minor variations
- Provide sensible defaults when sections missing
- Log parsing failures for review rather than crashing
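A hedged sketch of a parser applying these practices, assuming the Summary/Score format specified above. The regexes, default values, and logging approach are illustrative, not a fixed API.

```python
import logging
import re

def parse_response(text: str) -> dict:
    """Extract summary and score; fall back to defaults instead of crashing."""
    result = {"summary": "", "score": None}

    # Flexible regex: tolerate extra whitespace and heading-case variations.
    summary = re.search(r"##\s*Summary\s*\n(.*?)(?=\n##|\Z)", text, re.S | re.I)
    if summary:
        result["summary"] = summary.group(1).strip()
    else:
        logging.warning("Summary section missing; using empty default")

    score = re.search(r"Rating:\s*(\d+)", text, re.I)
    if score:
        result["score"] = int(score.group(1))
    else:
        logging.warning("Score missing; leaving as None")

    return result
```

Note that a malformed response still yields a usable dict with defaults, so one bad item is logged for review rather than halting the batch.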
Cost and Scale Estimation
LLM processing has predictable costs. Estimate before starting.
Formula
Total cost = (items × tokens_per_item × price_per_token) + API overhead
Estimation Process
- Estimate input tokens per item (prompt + context)
- Estimate output tokens per item (typical response length)
- Multiply by item count
- Add 20-30% buffer for retries and failures
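The estimation process above can be worked through in a few lines. All figures here are illustrative assumptions, not real vendor pricing.

```python
# Illustrative per-token prices (assumptions, not current vendor pricing).
INPUT_PRICE = 3.00 / 1_000_000    # dollars per input token
OUTPUT_PRICE = 15.00 / 1_000_000  # dollars per output token

def estimate_cost(items: int, input_tokens: int, output_tokens: int,
                  buffer: float = 0.25) -> float:
    """Total cost = items × per-item token cost, plus a retry buffer."""
    per_item = input_tokens * INPUT_PRICE + output_tokens * OUTPUT_PRICE
    return items * per_item * (1 + buffer)

# 10,000 items, ~2,000 input and ~500 output tokens each, 25% buffer.
print(round(estimate_cost(10_000, 2_000, 500), 2))  # → 168.75
```

Running this before writing any pipeline code turns "LLM-heavy project" from a vague risk into a concrete budget line.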
Cost Reduction Strategies
| Strategy | Impact |
|---|---|
| Reduce context length | Direct token savings |
| Use smaller models for simple items | Lower per-token cost |
| Cache and reuse partial results | Avoid redundant calls |
| Parallel processing | Wall-clock time (not token cost) |
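The "smaller models for simple items" strategy in the table amounts to a routing decision. A minimal sketch, where the model names and the length-based threshold are placeholder assumptions (real routing might classify task difficulty instead):

```python
def choose_model(prompt: str, threshold: int = 1_000) -> str:
    """Route short/simple prompts to a cheaper model (names are placeholders)."""
    return "small-model" if len(prompt) < threshold else "large-model"
```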
Single vs Multi-Agent Architecture
Single-Agent Pipelines Work For
- Batch processing with independent items
- Tasks where items don't interact
- Simpler cost and complexity management
Multi-Agent Architectures Work For
- Parallel exploration of different aspects
- Tasks exceeding single context window
- When specialized sub-agents improve quality
The primary reason for going multi-agent is context isolation, not role anthropomorphization.
Architectural Reduction
Start with minimal architecture. Add complexity only when proven necessary.
Evidence: Vercel d0
- Before: 17 specialized tools, 80% success, 274s execution
- After: 2 primitives (bash + SQL), 100% success, 77s execution
Build for Change
- Keep architecture simple and unopinionated
- Test across model strengths
- Design systems that benefit from model improvements
Project Planning Template
1. Task Analysis
- What is input/output?
- Is this synthesis, generation, classification, or analysis?
- What error rate is acceptable?
- What is value per successful completion?
2. Manual Validation
- Test one example with target model
- Evaluate output quality and format
- Identify failure modes
- Estimate tokens per item
3. Architecture Selection
- Single pipeline vs multi-agent
- Required tools and data sources
- Storage and caching strategy
- Parallelization approach
4. Cost Estimation
- Items × tokens × price
- Development time
- Infrastructure requirements
- Ongoing operational costs
5. Development Plan
- Stage-by-stage implementation
- Testing strategy per stage
- Iteration milestones
- Deployment approach
Guidelines
- Validate task-model fit with manual prototyping before automation
- Structure pipelines as discrete, idempotent, cacheable stages
- Use the file system for state management and debugging
- Design prompts for structured, parseable outputs with format examples
- Start with minimal architecture; add complexity only when proven necessary
- Estimate costs early and track throughout development
- Build robust parsers that handle LLM output variations
- Expect and plan for multiple architectural iterations
- Test whether scaffolding helps or constrains model performance
- Use agent-assisted development for rapid iteration
Success Output
When successful, this skill MUST output:
✅ SKILL COMPLETE: project-development
Completed:
- [x] Task-model fit validated via manual prototype
- [x] Pipeline architecture designed (5 stages)
- [x] File system state management implemented
- [x] Cost estimation completed with buffer
- [x] Architecture selection documented
Outputs:
- Task-model fit assessment report
- 5-stage pipeline: acquire → prepare → process → parse → render
- File system structure: data/{id}/{stage}_output.json
- Cost estimate: items × tokens × price + 20-30% buffer
- Architecture decision: single-agent vs multi-agent with rationale
Completion Checklist
Before marking this skill as complete, verify:
- Manual prototype tested with representative input
- Task-model fit criteria evaluated (synthesis, judgment, NL output, error tolerance)
- Pipeline stages discrete, idempotent, cacheable, independent
- File system state management implemented (existence gates execution)
- Structured output format specified in prompts
- Cost estimation includes items, tokens, pricing, buffer
- Single vs multi-agent decision documented
- Architecture follows reduction principle (minimal complexity)
Failure Indicators
This skill has FAILED if:
- ❌ Skipped manual prototype validation
- ❌ Task unsuited for LLM (deterministic algorithm better)
- ❌ Monolithic pipeline (no discrete stages)
- ❌ In-memory state only (no file system persistence)
- ❌ No structured output specification
- ❌ Cost estimation missing or unrealistic
- ❌ Over-engineered architecture before validation
- ❌ Ignored task-model fit criteria
When NOT to Use
Do NOT use this skill when:
- Traditional non-LLM software projects (deterministic algorithms)
- Precise computation required (math, counting)
- Real-time requirements (LLM latency too high)
- Perfect accuracy needed (hallucination risk)
- Proprietary data only (model lacks context)
- Sequential dependencies (each step depends on previous)
- Deterministic output required (same input = identical output)
Use alternative approaches when:
- Need precision → Traditional algorithms
- Need real-time → Cached LLM or deterministic logic
- Need perfect accuracy → Rule-based systems
- Need proprietary context → RAG with embedding
- Need sequential → State machine or workflow engine
Anti-Patterns (Avoid)
| Anti-Pattern | Problem | Solution |
|---|---|---|
| Skipping manual validation | Wasted development time | Always prototype with 1 example first |
| Monolithic pipelines | Hard to debug, iterate | Separate into 5 discrete stages |
| Over-constraining model | Scaffolding becomes liability | Test if scaffolding helps or hurts |
| Ignoring costs until production | Unexpected bills at scale | Estimate items × tokens × price early |
| Perfect parsing requirements | LLMs aren't deterministic | Build robust parsers with defaults |
| Premature optimization | Complexity before correctness | Get basic pipeline working first |
| Database for state | Complexity without benefit | Use file system for state tracking |
| Assuming model knowledge | Hallucinations undetected | Validate proprietary data requirements |
Principles
This skill embodies these CODITECT principles:
#1 Recycle → Extend → Re-Use → Create
- Reuse canonical 5-stage pipeline pattern
- Extend with domain-specific stages
- Create only when pattern doesn't fit
#2 First Principles
- Manual prototype validates core assumption
- Task-model fit criteria based on LLM strengths
- File system state = natural idempotency
#3 Keep It Simple
- Start with minimal architecture
- Add complexity only when proven necessary
- Single-agent until multi-agent needed
#9 Automation First
- Staged pipeline enables automation
- File system gates prevent re-execution
- Idempotent stages allow retries
Cost Awareness
- Early estimation prevents budget surprises
- Token budget tracked per stage
- 20-30% buffer for retries/failures
Related Components
Skills
- context-fundamentals - Context constraints for prompt design
- tool-design - Designing tools for pipelines
- multi-agent-patterns - When to use multi-agent
- advanced-evaluation - Evaluating pipeline outputs
Agents
- multi-agent-coordinator - Coordinate multi-agent pipelines
Commands
- /batch-pipeline - Manage staged batch processing
Scripts
- external/Agent-Skills-for-Context-Engineering/skills/project-development/scripts/pipeline_template.py - 5-stage pipeline implementation