
Claude Code: Autonomous Workflow Baseline

Implementation Status: ✅ Currently Implemented
Analysis Date: 2025-10-14

Overview

Claude Code serves as the baseline implementation for autonomous RESEARCH → PLAN → CODE workflows in HumanLayer. This analysis documents the current working system as a reference for implementing similar capabilities with other LLMs.

Current Architecture

CLAUDE.md Configuration System

# Example project CLAUDE.md
## 🤖 Autonomous Development Mode

### When to Activate Autonomous Mode
Use autonomous mode when:
- Building full-stack applications or multi-module systems
- Tasks require 500+ lines of code or 2+ hours of work
- Creating production-grade software with multiple integrations

### Key Principles
#### 1. Four Operating Modes
- **DELIBERATION**: Analyze requirements, decompose tasks, identify gaps (NO code)
- **RESEARCH**: Execute focused tool calls to verify assumptions (5-20 calls)
- **PLAN**: Synthesize research into detailed implementation specification
- **ACTION**: Emit working code artifacts following strict rules
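
The four modes above imply an ordering. A minimal sketch of how that ordering could be checked, assuming a linear flow in which RESEARCH may be skipped when deliberation finds no knowledge gaps. `Mode`, `ALLOWED`, and `canTransition` are hypothetical names for illustration, not part of Claude Code or HumanLayer.

```typescript
type Mode = "DELIBERATION" | "RESEARCH" | "PLAN" | "ACTION";

// Assumed transition table: RESEARCH is optional, no other shortcuts exist.
const ALLOWED: Record<Mode, Mode[]> = {
  DELIBERATION: ["RESEARCH", "PLAN"],
  RESEARCH: ["PLAN"],
  PLAN: ["ACTION"],
  ACTION: [],
};

// True when moving from one mode to the next is permitted by the table.
function canTransition(from: Mode, to: Mode): boolean {
  return ALLOWED[from].indexOf(to) !== -1;
}

console.log(canTransition("DELIBERATION", "ACTION")); // false
console.log(canTransition("PLAN", "ACTION")); // true
```

Keeping the table as data rather than branching logic makes it easy to add a back-edge later (for example PLAN back to RESEARCH) if a project needs one.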

MCP Integration (hlyr/src/mcp.ts)

// Current Claude Code integration
import { Server } from '@modelcontextprotocol/sdk/server/index.js'
import { ListToolsRequestSchema } from '@modelcontextprotocol/sdk/types.js'

export async function startClaudeApprovalsMCPServer() {
  // MCP server that exposes the local approvals tool to Claude Code
  const server = new Server(
    { name: 'humanlayer-claude-local-approvals', version: '1.0.0' },
    { capabilities: { tools: {} } }
  )

  // Register approval tool
  server.setRequestHandler(ListToolsRequestSchema, async () => ({
    tools: [{
      name: 'request_permission',
      description: 'Request permission to perform an action',
      inputSchema: {
        type: 'object',
        properties: {
          tool_name: { type: 'string' },
          input: { type: 'object' },
          tool_use_id: { type: 'string' },
        },
        required: ['tool_name', 'input', 'tool_use_id'],
      },
    }]
  }))
}
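
The `request_permission` schema above requires three fields. As a rough illustration of what a handler might check before acting on a call, here is a dependency-free sketch; `validatePermissionArgs` is a hypothetical helper, and a real server would more likely rely on a schema validator.

```typescript
// Returns a list of problems; an empty list means the arguments match the schema.
function validatePermissionArgs(args: unknown): string[] {
  if (typeof args !== "object" || args === null) {
    return ["arguments must be an object"];
  }
  const a = args as Record<string, unknown>;
  const problems: string[] = [];
  if (typeof a["tool_name"] !== "string") {
    problems.push("tool_name must be a string");
  }
  const input = a["input"];
  if (typeof input !== "object" || input === null || Array.isArray(input)) {
    problems.push("input must be an object");
  }
  if (typeof a["tool_use_id"] !== "string") {
    problems.push("tool_use_id must be a string");
  }
  return problems;
}

const okArgs = { tool_name: "Bash", input: { command: "ls" }, tool_use_id: "t1" };
console.log(validatePermissionArgs(okArgs).length); // 0
console.log(validatePermissionArgs({ tool_name: "Bash" }).length); // 2
```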

Four-Phase Workflow Implementation

Phase 1: DELIBERATION

Current Implementation: Built into Claude Code's reasoning capabilities

// Session launch with deliberation mode
const session = await daemonClient.createSession({
  provider: 'claude',
  model: 'sonnet',
  directory: './project',
  query: 'Build a task management application',
  mode: 'deliberation' // Analysis phase
})

Autonomous System Prompt Pattern:

ANALYSIS:
- Core requirements: [list 3-5 essential capabilities]
- Technical constraints: [runtime, dependencies, scale]
- Unknown factors: [what needs research]

DECOMPOSITION:
Phase 1: [foundation - data models, core abstractions]
Phase 2: [business logic - core features]
Phase 3: [integration - APIs, persistence, UI]
Phase 4: [polish - error handling, observability]

Phase 2: RESEARCH

Tool Call Budget: 5-20 calls per research phase

// Research phase management in session manager
type ResearchPhase struct {
	MaxToolCalls int       `json:"max_tool_calls"`
	CallsUsed    int       `json:"calls_used"`
	Findings     []Finding `json:"findings"`
}

// Tool usage tracking
func (s *Session) ExecuteTool(toolCall ToolCall) (*ToolResult, error) {
	// Check approval if required
	if s.RequiresApproval(toolCall.Name) {
		approval, err := s.requestApproval(toolCall)
		if err != nil {
			// approval may be nil on error, so its fields must not be read here
			return nil, fmt.Errorf("approval request failed: %w", err)
		}
		if approval.Decision == "deny" {
			return nil, fmt.Errorf("tool execution denied: %s", approval.Reason)
		}
	}

	// Execute tool and track usage
	result, err := s.toolExecutor.Execute(toolCall)
	s.research.CallsUsed++
	return result, err
}
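
The tracking code above counts calls but does not show where the 5-20 call budget is enforced. One way a budget check could look, as a sketch only: `ToolBudget` and `tryConsume` are hypothetical names, and the limit of 20 is taken from the upper end of the stated range.

```typescript
class ToolBudget {
  private used = 0;
  private readonly max: number;

  constructor(max: number) {
    this.max = max;
  }

  // Consume one call; returns false (and consumes nothing) once the budget is spent.
  tryConsume(): boolean {
    if (this.used >= this.max) return false;
    this.used += 1;
    return true;
  }

  get remaining(): number {
    return this.max - this.used;
  }
}

const budget = new ToolBudget(20);
let executed = 0;
for (let i = 0; i < 25; i++) {
  // Only the first 20 attempts are allowed through.
  if (budget.tryConsume()) executed += 1;
}
console.log(executed, budget.remaining); // 20 0
```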

Research Protocol Implementation:

// Research cycle implementation
class ResearchCycle {
  maxToolCalls: number = 20

  async execute(knowledgeGap: string): Promise<TechDecision> {
    // 1. Broad exploration
    const broadResults = await this.search(this.extractCoreTerms(knowledgeGap))

    // 2. Narrow to specifics
    const specificResults = await this.search(knowledgeGap + " 2025")

    // 3. Fetch authoritative sources
    const primaryDocs = await this.fetch(this.findOfficialDocs(broadResults))

    // 4. Validate with cross-reference
    const validation = await this.search(knowledgeGap + " comparison")

    // 5. Synthesize decision
    return new TechDecision({
      chosenApproach: this.synthesize(primaryDocs),
      rationale: this.generateRationale(validation),
      alternativesConsidered: this.extractAlternatives(broadResults),
      sources: [...primaryDocs, ...validation]
    })
  }
}

Phase 3: PLAN

Critical Bridge Phase: Synthesizes research findings into actionable implementation specification

// Planning phase management
type PlanningPhase struct {
	ResearchFindings []TechDecision      `json:"research_findings"`
	Architecture     *SystemDesign       `json:"architecture"`
	Implementation   *ImplementationPlan `json:"implementation"`
	Approvals        *ApprovalStrategy   `json:"approvals"`
	Validation       *ValidationPlan     `json:"validation"`
}

type ImplementationPlan struct {
	Phases          []ImplementationPhase `json:"phases"`
	Dependencies    []Dependency          `json:"dependencies"`
	ApprovalGates   []ApprovalGate        `json:"approval_gates"`
	RiskMitigation  []RiskMitigation      `json:"risk_mitigation"`
	SuccessCriteria []SuccessCriterion    `json:"success_criteria"`
}

type ImplementationPhase struct {
	Name             string         `json:"name"`
	Description      string         `json:"description"`
	Artifacts        []ArtifactSpec `json:"artifacts"`
	Dependencies     []string       `json:"dependencies"`
	EstimatedLines   int            `json:"estimated_lines"`
	RiskLevel        string         `json:"risk_level"`
	ApprovalRequired bool           `json:"approval_required"`
}

Planning Protocol Implementation:

class PlanningPhase {
  async synthesizeResearchIntoImplementationPlan(
    deliberation: DeliberationResult,
    research: ResearchResult[]
  ): Promise<ImplementationPlan> {

    // 1. Synthesize technical decisions from research
    const techDecisions = this.synthesizeTechDecisions(research)

    // 2. Design system architecture
    const architecture = await this.createSystemDesign({
      requirements: deliberation.requirements,
      constraints: deliberation.constraints,
      techDecisions: techDecisions
    })

    // 3. Break down into implementation phases
    const phases = this.createImplementationPhases(architecture)

    // 4. Plan approval gates and risk mitigation
    const approvalStrategy = this.planApprovalGates(phases)

    // 5. Define success criteria and validation
    const validation = this.createValidationPlan(phases)

    return {
      phases: phases,
      dependencies: this.mapDependencies(phases),
      approvalGates: approvalStrategy.gates,
      riskMitigation: this.identifyRisks(phases),
      successCriteria: validation.criteria,
      estimatedEffort: this.calculateEffort(phases),
      timeline: this.createTimeline(phases)
    }
  }

  // Create detailed implementation phases with dependencies
  createImplementationPhases(architecture: SystemDesign): ImplementationPhase[] {
    return [
      {
        name: "Foundation",
        description: "Core data models, database schema, basic abstractions",
        artifacts: [
          { name: "database-schema.sql", type: "schema", lines: 50 },
          { name: "core-models.ts", type: "types", lines: 100 },
          { name: "base-api.ts", type: "api", lines: 150 }
        ],
        dependencies: [],
        estimatedLines: 300,
        riskLevel: "low",
        approvalRequired: true
      },
      {
        name: "Business Logic",
        description: "Core application features and business rules",
        artifacts: [
          { name: "user-service.ts", type: "service", lines: 200 },
          { name: "auth-middleware.ts", type: "middleware", lines: 100 },
          { name: "business-logic.ts", type: "logic", lines: 300 }
        ],
        dependencies: ["Foundation"],
        estimatedLines: 600,
        riskLevel: "medium",
        approvalRequired: true
      },
      {
        name: "Integration",
        description: "External APIs, persistence, real-time features",
        artifacts: [
          { name: "websocket-server.ts", type: "integration", lines: 250 },
          { name: "external-apis.ts", type: "integration", lines: 200 },
          { name: "database-integration.ts", type: "persistence", lines: 150 }
        ],
        dependencies: ["Foundation", "Business Logic"],
        estimatedLines: 600,
        riskLevel: "high",
        approvalRequired: true
      },
      {
        name: "UI Components",
        description: "React components, state management, user interfaces",
        artifacts: [
          { name: "app.tsx", type: "component", lines: 100 },
          { name: "TaskManager.tsx", type: "component", lines: 200 },
          { name: "useRealtime.ts", type: "hook", lines: 100 }
        ],
        dependencies: ["Integration"],
        estimatedLines: 400,
        riskLevel: "medium",
        approvalRequired: true
      },
      {
        name: "Polish",
        description: "Error handling, testing, observability, deployment",
        artifacts: [
          { name: "error-handling.ts", type: "utility", lines: 150 },
          { name: "tests.spec.ts", type: "test", lines: 300 },
          { name: "deployment-config.yaml", type: "config", lines: 50 }
        ],
        dependencies: ["UI Components"],
        estimatedLines: 500,
        riskLevel: "low",
        approvalRequired: false
      }
    ]
  }
}
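
Each phase above names the phases it depends on, which implies an execution order. A minimal sketch of deriving that order with a simple layered topological sort; `PhaseSpec` and `orderPhases` are hypothetical names, and the input is deliberately shuffled to show that the order comes from the `dependencies` field rather than from array position.

```typescript
interface PhaseSpec {
  name: string;
  dependencies: string[];
}

// Repeatedly emit every phase whose dependencies have all been emitted already.
function orderPhases(phases: PhaseSpec[]): string[] {
  const done: string[] = [];
  let pending = phases.slice();
  while (pending.length > 0) {
    const ready = pending.filter((p) =>
      p.dependencies.every((d) => done.indexOf(d) !== -1)
    );
    if (ready.length === 0) {
      // Nothing can make progress: a cycle or a reference to an unknown phase.
      const stuck = pending.map((p) => p.name).join(", ");
      throw new Error("cyclic or missing dependency among: " + stuck);
    }
    for (const p of ready) done.push(p.name);
    pending = pending.filter((p) => ready.indexOf(p) === -1);
  }
  return done;
}

const shuffled: PhaseSpec[] = [
  { name: "Polish", dependencies: ["UI Components"] },
  { name: "Integration", dependencies: ["Foundation", "Business Logic"] },
  { name: "Foundation", dependencies: [] },
  { name: "UI Components", dependencies: ["Integration"] },
  { name: "Business Logic", dependencies: ["Foundation"] },
];
console.log(orderPhases(shuffled).join(" > "));
// Foundation > Business Logic > Integration > UI Components > Polish
```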

PLAN Phase Approval Template:

IMPLEMENTATION PLAN APPROVAL REQUEST

## Architecture Overview
[Generated system design diagram and component relationships]

## Implementation Strategy
Phase 1: Foundation (300 lines, low risk)
Phase 2: Business Logic (600 lines, medium risk)
Phase 3: Integration (600 lines, high risk)
Phase 4: UI Components (400 lines, medium risk)
Phase 5: Polish (500 lines, low risk)

## Approval Gates
- Phase 1→2: Database schema and core models approval
- Phase 2→3: Business logic and security review
- Phase 3→4: Integration testing and performance validation
- Phase 4→5: UI/UX review and user testing

## Risk Mitigation
- High-risk integration phase will include incremental testing
- Database changes will be versioned and reversible
- WebSocket implementation includes fallback mechanisms
- Error handling and logging throughout all phases

## Success Criteria
- All tests pass (unit, integration, E2E)
- Performance meets requirements (< 100ms API response)
- Security review passes (auth, data validation)
- User acceptance criteria met

REQUEST APPROVAL TO PROCEED TO ACTION PHASE
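
The "Implementation Strategy" lines in the template can be generated from the phase list rather than written by hand. A small sketch under that assumption; `PhaseSummary`, `renderStrategy`, and `totalEstimatedLines` are hypothetical names, and the figures are the ones from the template above.

```typescript
interface PhaseSummary {
  name: string;
  estimatedLines: number;
  riskLevel: "low" | "medium" | "high";
}

// One "Phase N: ..." line per phase, numbered from 1.
function renderStrategy(phases: PhaseSummary[]): string {
  return phases
    .map((p, i) => `Phase ${i + 1}: ${p.name} (${p.estimatedLines} lines, ${p.riskLevel} risk)`)
    .join("\n");
}

function totalEstimatedLines(phases: PhaseSummary[]): number {
  return phases.reduce((sum, p) => sum + p.estimatedLines, 0);
}

const planSummary: PhaseSummary[] = [
  { name: "Foundation", estimatedLines: 300, riskLevel: "low" },
  { name: "Business Logic", estimatedLines: 600, riskLevel: "medium" },
  { name: "Integration", estimatedLines: 600, riskLevel: "high" },
  { name: "UI Components", estimatedLines: 400, riskLevel: "medium" },
  { name: "Polish", estimatedLines: 500, riskLevel: "low" },
];
console.log(renderStrategy(planSummary));
console.log(totalEstimatedLines(planSummary)); // 2400
```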

Phase 4: ACTION (CODE)

Artifact Management System:

// Artifact rules implementation
type ArtifactManager struct {
	sizeThreshold int // 20 lines or 1500 chars
	maxUpdates    int // 4 consecutive updates before rewrite
	updateCount   map[string]int
	artifacts     map[string]*Artifact
}

func (am *ArtifactManager) ShouldCreateArtifact(code string) bool {
	return len(strings.Split(code, "\n")) > 20 || len(code) > 1500
}

func (am *ArtifactManager) DecideStrategy(artifactID string, changeScope ChangeScope) Strategy {
	// Checked first: after 4 consecutive updates, rewrite regardless of change size
	if am.updateCount[artifactID] >= 4 {
		return StrategyRewrite
	}

	// Small, localized, non-structural changes can be applied as an update
	if changeScope.AffectedLines <= 20 &&
		changeScope.AffectedLocations <= 5 &&
		!changeScope.StructuralChange {
		return StrategyUpdate
	}

	return StrategyRewrite // Default to safety
}
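
To make the update-versus-rewrite rule concrete, here is a sketch of the same decision with a few inputs. It assumes the cap of 4 consecutive updates takes precedence over change size, as the `maxUpdates` comment suggests; `decideStrategy` and the `ChangeScope` shape below are illustrative, not the HumanLayer types.

```typescript
type Strategy = "update" | "rewrite";

interface ChangeScope {
  affectedLines: number;
  affectedLocations: number;
  structuralChange: boolean;
}

function decideStrategy(priorUpdates: number, scope: ChangeScope): Strategy {
  // Assumed precedence: the consecutive-update cap wins over change size.
  if (priorUpdates >= 4) return "rewrite";
  const isSmall =
    scope.affectedLines <= 20 &&
    scope.affectedLocations <= 5 &&
    !scope.structuralChange;
  return isSmall ? "update" : "rewrite";
}

const smallChange: ChangeScope = { affectedLines: 5, affectedLocations: 1, structuralChange: false };
const largeChange: ChangeScope = { affectedLines: 40, affectedLocations: 2, structuralChange: false };

console.log(decideStrategy(0, smallChange)); // update
console.log(decideStrategy(4, smallChange)); // rewrite
console.log(decideStrategy(1, largeChange)); // rewrite
```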

State Management Architecture

Session State Manifest

type SessionState struct {
	Phase             AutonomousPhase `json:"phase"` // DELIBERATION, RESEARCH, PLAN, ACTION
	CompletedPhases   []string        `json:"completed_phases"`
	ActiveArtifact    string          `json:"active_artifact"`
	IterationCount    int             `json:"iteration_count"`
	KnownIssues       []Issue         `json:"known_issues"`
	IntegrationStatus map[string]bool `json:"integration_status"`
	NextSteps         []string        `json:"next_planned_steps"`

	// PLAN phase specific state
	ImplementationPlan *ImplementationPlan  `json:"implementation_plan,omitempty"`
	CurrentPhase       *ImplementationPhase `json:"current_phase,omitempty"`
	ApprovalGatesHit   []string             `json:"approval_gates_hit"`
}

// State tracking in every response
// session.phase is one of DELIBERATION/RESEARCH/PLAN/ACTION
const stateManifest = `
CURRENT STATE:
- Phase: ${session.phase}
- Implementation Phase: ${session.currentPhase?.name || 'N/A'}
- Active artifact: ${session.activeArtifact}
- Iteration: ${session.iterationCount}
- Integration: ${JSON.stringify(session.integrationStatus)}
- Next: ${session.nextSteps.join(', ')}
- Approval Gates: ${session.approvalGatesHit?.length || 0} completed
`
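
Interpolating an object directly into a template string yields `[object Object]`, so the map and list fields need explicit formatting. A self-contained sketch of a manifest renderer that handles this; `SessionSnapshot` and `renderManifest` are hypothetical names, and the "ok"/"pending" labels are an assumed display choice.

```typescript
interface SessionSnapshot {
  phase: "DELIBERATION" | "RESEARCH" | "PLAN" | "ACTION";
  currentPhaseName?: string;
  activeArtifact: string;
  iterationCount: number;
  integrationStatus: Record<string, boolean>;
  nextSteps: string[];
  approvalGatesHit: string[];
}

function renderManifest(s: SessionSnapshot): string {
  // Format the status map as "key=ok" / "key=pending" pairs.
  const integration = Object.keys(s.integrationStatus)
    .map((k) => `${k}=${s.integrationStatus[k] ? "ok" : "pending"}`)
    .join(", ");
  return [
    "CURRENT STATE:",
    `- Phase: ${s.phase}`,
    `- Implementation Phase: ${s.currentPhaseName ?? "N/A"}`,
    `- Active artifact: ${s.activeArtifact}`,
    `- Iteration: ${s.iterationCount}`,
    `- Integration: ${integration}`,
    `- Next: ${s.nextSteps.join("; ")}`,
    `- Approval Gates: ${s.approvalGatesHit.length} completed`,
  ].join("\n");
}

const manifest = renderManifest({
  phase: "ACTION",
  activeArtifact: "user-service.ts",
  iterationCount: 7,
  integrationStatus: { db: true, websocket: false },
  nextSteps: ["wire auth", "add tests"],
  approvalGatesHit: ["Phase 1→2"],
});
console.log(manifest);
```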

Error Ritual Implementation

type ErrorRitual struct {
	ErrorHistory []ProcessedError `json:"error_history"`
	Lessons      []Lesson         `json:"lessons_learned"`
}

func (er *ErrorRitual) ProcessError(err error) (*RecoveryPlan, error) {
	// 1. STOP - Do not continue broken iteration
	er.haltCurrentIteration()

	// 2. ANALYZE - What caused the error?
	analysis := er.analyzeError(err)

	// 3. EXTRACT - What lesson prevents recurrence?
	lesson := er.extractLesson(analysis)
	er.Lessons = append(er.Lessons, lesson)

	// 4. CLEAN - Remove ghost context
	er.cleanStaleAssumptions(analysis.StaleAssumptions)

	// 5. RETRY - Fresh attempt with distilled knowledge
	return er.createRecoveryPlan(lesson), nil
}

Tool Integration Patterns

Permission-Based Tool Execution

// Tool execution with approval workflow
async function executeWithApproval(toolCall: ToolCall): Promise<ToolResult> {
  if (DANGEROUS_TOOLS.includes(toolCall.name)) {
    const approval = await requestPermission({
      tool_name: toolCall.name,
      input: toolCall.parameters,
      tool_use_id: toolCall.id,
      explanation: `About to execute ${toolCall.name} with parameters: ${JSON.stringify(toolCall.parameters, null, 2)}`
    })

    if (approval.decision !== 'approve') {
      throw new Error(`Tool execution denied: ${approval.explanation}`)
    }
  }

  return await executeTool(toolCall)
}

Context Engineering Integration

// Context management for long-horizon development
type ContextManager struct {
	tokenBudget      int
	usedTokens       int
	compressionRatio float64
	artifacts        []Artifact
}

func (cm *ContextManager) ManageContext() error {
	// Act once usage exceeds 80% of the token budget.
	// Both sides are converted to float64: an int cannot be multiplied by 0.8.
	if float64(cm.usedTokens) > float64(cm.tokenBudget)*0.8 {
		// Compress conversation history
		if err := cm.compressHistory(); err != nil {
			return fmt.Errorf("context compression failed: %w", err)
		}

		// Clear tool call results from deep history
		cm.clearDeepToolResults()

		// Preserve critical decisions and implementations
		cm.preserveCriticalContext()
	}
	return nil
}
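
As an illustration of the "clear deep tool results, preserve critical context" idea, the sketch below drops all but the most recent tool results once usage passes 80% of the budget and leaves decisions and messages untouched. It does not attempt summarization. `HistoryItem`, `compactHistory`, and the default of keeping two tool results are assumptions made for the example.

```typescript
interface HistoryItem {
  kind: "decision" | "tool_result" | "message";
  tokens: number;
}

function compactHistory(
  history: HistoryItem[],
  tokenBudget: number,
  keepRecentToolResults = 2
): HistoryItem[] {
  const used = history.reduce((sum, item) => sum + item.tokens, 0);
  if (used <= tokenBudget * 0.8) return history; // under the threshold: nothing to do

  // Walk newest-first so the most recent tool results are the ones kept.
  let toolResultsSeen = 0;
  const keptNewestFirst: HistoryItem[] = [];
  for (const item of history.slice().reverse()) {
    if (item.kind === "tool_result") {
      toolResultsSeen += 1;
      if (toolResultsSeen > keepRecentToolResults) continue;
    }
    keptNewestFirst.push(item);
  }
  return keptNewestFirst.reverse(); // restore chronological order
}

const sampleHistory: HistoryItem[] = [
  { kind: "decision", tokens: 100 },
  { kind: "tool_result", tokens: 300 },
  { kind: "tool_result", tokens: 300 },
  { kind: "tool_result", tokens: 300 },
  { kind: "tool_result", tokens: 300 },
];
const compacted = compactHistory(sampleHistory, 1000);
console.log(compacted.length); // 3
```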

Workflow Coordination

Session Orchestration

type WorkflowOrchestrator struct {
	sessionManager  *SessionManager
	approvalManager *ApprovalManager
	stateTracker    *StateTracker
}

func (wo *WorkflowOrchestrator) ExecuteAutonomousWorkflow(req WorkflowRequest) (*WorkflowResult, error) {
	// Phase 1: DELIBERATION
	session, err := wo.sessionManager.StartDeliberationPhase(req)
	if err != nil {
		return nil, fmt.Errorf("deliberation phase failed: %w", err)
	}

	deliberationResult := wo.waitForPhaseCompletion(session, PhaseDeliberation)

	// Phase 2: RESEARCH (if knowledge gaps identified)
	// Declared outside the if block so it is still in scope for the result below;
	// it stays nil when research is skipped. The type is assumed to match the
	// return value of executeResearchPhase.
	var researchResult *ResearchResult
	if len(deliberationResult.KnowledgeGaps) > 0 {
		researchResult, err = wo.executeResearchPhase(session, deliberationResult.KnowledgeGaps)
		if err != nil {
			return nil, fmt.Errorf("research phase failed: %w", err)
		}
		session.UpdateResearchFindings(researchResult)
	}

	// Phase 4: ACTION (Phase 3, PLAN, does not appear in this excerpt)
	actionResult, err := wo.executeActionPhase(session)
	if err != nil {
		return nil, fmt.Errorf("action phase failed: %w", err)
	}

	return &WorkflowResult{
		Deliberation: deliberationResult,
		Research:     researchResult,
		Action:       actionResult,
		Artifacts:    session.GetArtifacts(),
		State:        session.GetFinalState(),
	}, nil
}
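
The control flow of the orchestrator can be reduced to a few lines once the phase logic is injected. The sketch below is synchronous for brevity (the real workflow is asynchronous) and includes a PLAN step between RESEARCH and ACTION to match the four-phase description. `PhaseHandlers`, `runWorkflow`, and the stub handlers are hypothetical.

```typescript
interface PhaseHandlers {
  deliberate: (query: string) => { knowledgeGaps: string[] };
  research: (gaps: string[]) => string[];
  plan: (findings: string[]) => string[];
  act: (steps: string[]) => string[];
}

function runWorkflow(
  query: string,
  h: PhaseHandlers
): { trace: string[]; artifacts: string[] } {
  const trace: string[] = ["DELIBERATION"];
  const { knowledgeGaps } = h.deliberate(query);

  // RESEARCH only runs when deliberation surfaced knowledge gaps.
  let findings: string[] = [];
  if (knowledgeGaps.length > 0) {
    trace.push("RESEARCH");
    findings = h.research(knowledgeGaps);
  }

  trace.push("PLAN");
  const steps = h.plan(findings);

  trace.push("ACTION");
  return { trace, artifacts: h.act(steps) };
}

// Stub handlers for illustration only.
const stubs: PhaseHandlers = {
  deliberate: (q) => ({
    knowledgeGaps: q.indexOf("realtime") !== -1 ? ["websocket library choice"] : [],
  }),
  research: (gaps) => gaps.map((g) => `decision for: ${g}`),
  plan: (findings) => ["Foundation"].concat(findings.length > 0 ? ["Integration"] : []),
  act: (steps) => steps.map((s) => `${s.toLowerCase()}.ts`),
};

console.log(runWorkflow("Build a todo list", stubs).trace.join(" → "));
// DELIBERATION → PLAN → ACTION
console.log(runWorkflow("Build a realtime task board", stubs).trace.join(" → "));
// DELIBERATION → RESEARCH → PLAN → ACTION
```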

Success Patterns

Quality Gates

# Pre-artifact checklist implementation
quality_gates:
  pre_artifact:
    - size_exceeds_threshold: true
    - type_supported: true
    - dependencies_imported: true
    - error_handling_included: true
    - integration_points_defined: true

  post_iteration:
    - state_manifest_updated: true
    - iteration_count_incremented: true
    - known_issues_documented: true
    - next_steps_identified: true
    - checkpoint_saved: true # every 10 iterations
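
A checklist like this is only useful if something reports which gates failed. A minimal sketch of such a check, assuming gate results arrive as a name-to-boolean map; `failedGates` and `isCheckpointDue` are hypothetical helpers, and the interval of 10 comes from the comment above.

```typescript
// Names of all gates that did not pass.
function failedGates(results: Record<string, boolean>): string[] {
  return Object.keys(results).filter((gate) => results[gate] !== true);
}

// A checkpoint is due on every Nth iteration (10 by default).
function isCheckpointDue(iteration: number, every = 10): boolean {
  return iteration > 0 && iteration % every === 0;
}

const preArtifact: Record<string, boolean> = {
  size_exceeds_threshold: true,
  type_supported: true,
  dependencies_imported: false,
  error_handling_included: true,
  integration_points_defined: true,
};

console.log(failedGates(preArtifact)); // only dependencies_imported fails
console.log(isCheckpointDue(20), isCheckpointDue(7)); // true false
```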

Performance Characteristics

// Current Claude Code performance metrics
type PerformanceMetrics struct {
	DeliberationTime   time.Duration // ~2-5 minutes
	ResearchTime       time.Duration // ~5-15 minutes
	PlanningTime       time.Duration // ~3-10 minutes
	ActionTime         time.Duration // ~10-60 minutes per artifact
	TokensPerIteration int           // ~2000-8000 tokens
	IterationsPerHour  int           // ~6-12 iterations
	SuccessRate        float64       // ~85-95% for well-defined tasks
}
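
The approximate ranges in the comments above can be combined into a rough hourly token envelope. The sketch below multiplies the low and high ends, which assumes the two figures vary independently; it is back-of-the-envelope arithmetic, not a measured result. `NumRange` and `multiplyRanges` are hypothetical names.

```typescript
type NumRange = [number, number]; // [low, high], both non-negative

// Multiply two ranges end to end: low * low and high * high.
function multiplyRanges(a: NumRange, b: NumRange): NumRange {
  return [a[0] * b[0], a[1] * b[1]];
}

const iterationsPerHour: NumRange = [6, 12];
const tokensPerIteration: NumRange = [2000, 8000];

const tokensPerHour = multiplyRanges(iterationsPerHour, tokensPerIteration);
console.log(tokensPerHour[0], tokensPerHour[1]); // 12000 96000
```

Under these figures, an hour of autonomous work would fall somewhere between roughly 12,000 and 96,000 tokens.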

This baseline implementation provides the foundation for autonomous development workflows that can be adapted to other LLMs while maintaining the core RESEARCH → PLAN → CODE methodology and human oversight capabilities.