Claude Code: Autonomous Workflow Baseline

Implementation Status: ✅ Currently Implemented
Analysis Date: 2025-10-14

Overview

Claude Code is the baseline implementation for autonomous RESEARCH → PLAN → CODE workflows in HumanLayer. This analysis documents the current working system as a reference for implementing similar capabilities with other LLMs.

Current Architecture

CLAUDE.md Configuration System

# Example project CLAUDE.md
## 🤖 Autonomous Development Mode

### When to Activate Autonomous Mode
Use autonomous mode when:
- Building full-stack applications or multi-module systems
- Tasks require 500+ lines of code or 2+ hours of work
- Creating production-grade software with multiple integrations

### Key Principles
#### 1. Four Operating Modes
- **DELIBERATION**: Analyze requirements, decompose tasks, identify gaps (NO code)
- **RESEARCH**: Execute focused tool calls to verify assumptions (5-20 calls)
- **PLAN**: Synthesize research into detailed implementation specification
- **ACTION**: Emit working code artifacts following strict rules
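
The four operating modes in the example above run in a fixed order, with RESEARCH skipped when deliberation finds no knowledge gaps. As a rough sketch of that ordering, the TypeScript below encodes the allowed transitions. The `Mode` type, the `NEXT_MODES` table, and `canTransition` are illustrative assumptions and do not appear in CLAUDE.md or HumanLayer.

```typescript
// Hypothetical sketch: names and the transition table are illustrative assumptions.
type Mode = "DELIBERATION" | "RESEARCH" | "PLAN" | "ACTION";

// Modes that may follow each mode. RESEARCH is optional, so DELIBERATION
// may move straight to PLAN when no knowledge gaps were identified.
const NEXT_MODES: Record<Mode, Mode[]> = {
  DELIBERATION: ["RESEARCH", "PLAN"],
  RESEARCH: ["PLAN"],
  PLAN: ["ACTION"],
  ACTION: [],
};

// Returns true when moving from `from` to `to` is listed in the table.
function canTransition(from: Mode, to: Mode): boolean {
  return NEXT_MODES[from].indexOf(to) !== -1;
}

console.log(canTransition("DELIBERATION", "RESEARCH")); // true
console.log(canTransition("DELIBERATION", "ACTION")); // false
```

Keeping the transitions in a table makes the "NO code" rule for DELIBERATION checkable: there is no direct edge from DELIBERATION to ACTION.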

MCP Integration (`hlyr/src/mcp.ts`)

// Current Claude Code integration
export async function startClaudeApprovalsMCPServer() {
  const server = new Server(
    { name: 'humanlayer-claude-local-approvals', version: '1.0.0' },
    { capabilities: { tools: {} } }
  )

  // Register approval tool
  server.setRequestHandler(ListToolsRequestSchema, async () => ({
    tools: [{
      name: 'request_permission',
      description: 'Request permission to perform an action',
      inputSchema: {
        type: 'object',
        properties: {
          tool_name: { type: 'string' },
          input: { type: 'object' },
          tool_use_id: { type: 'string' },
        },
        required: ['tool_name', 'input', 'tool_use_id'],
      },
    }]
  }))
}
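
The `inputSchema` above requires `tool_name`, `input`, and `tool_use_id`. The sketch below shows one way a handler might check incoming arguments against those three fields before acting on them. `PermissionRequest` and `parsePermissionRequest` are hypothetical names introduced for illustration; they are not part of `hlyr` or the MCP SDK.

```typescript
// Hypothetical validator mirroring the request_permission inputSchema; not part of hlyr.
interface PermissionRequest {
  tool_name: string;
  input: Record<string, unknown>;
  tool_use_id: string;
}

// Throws if a required field is missing or has the wrong type.
function parsePermissionRequest(args: unknown): PermissionRequest {
  if (typeof args !== "object" || args === null) {
    throw new Error("arguments must be an object");
  }
  const record = args as Record<string, unknown>;
  const toolName = record["tool_name"];
  const input = record["input"];
  const toolUseId = record["tool_use_id"];

  if (typeof toolName !== "string") {
    throw new Error("tool_name must be a string");
  }
  // JSON Schema "object" excludes null and arrays.
  if (typeof input !== "object" || input === null || Array.isArray(input)) {
    throw new Error("input must be an object");
  }
  if (typeof toolUseId !== "string") {
    throw new Error("tool_use_id must be a string");
  }
  return {
    tool_name: toolName,
    input: input as Record<string, unknown>,
    tool_use_id: toolUseId,
  };
}

const parsed = parsePermissionRequest({
  tool_name: "Bash",
  input: { command: "ls" },
  tool_use_id: "toolu_example",
});
console.log(parsed.tool_name); // Bash
```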

Four-Phase Workflow Implementation

Phase 1: DELIBERATION

Current Implementation: Built into Claude Code's reasoning capabilities

// Session launch with deliberation mode
const session = await daemonClient.createSession({
  provider: 'claude',
  model: 'sonnet',
  directory: './project',
  query: 'Build a task management application',
  mode: 'deliberation' // Analysis phase
})

Autonomous System Prompt Pattern:

ANALYSIS:
- Core requirements: [list 3-5 essential capabilities]
- Technical constraints: [runtime, dependencies, scale]
- Unknown factors: [what needs research]

DECOMPOSITION:
Phase 1: [foundation - data models, core abstractions]
Phase 2: [business logic - core features]  
Phase 3: [integration - APIs, persistence, UI]
Phase 4: [polish - error handling, observability]

Phase 2: RESEARCH

Tool Call Budget: 5-20 calls per research phase

// Research phase management in session manager
type ResearchPhase struct {
    MaxToolCalls int     `json:"max_tool_calls"`
    CallsUsed    int     `json:"calls_used"`
    Findings     []Finding `json:"findings"`
}

// Tool usage tracking
func (s *Session) ExecuteTool(toolCall ToolCall) (*ToolResult, error) {
    // Check approval if required
    if s.RequiresApproval(toolCall.Name) {
        approval, err := s.requestApproval(toolCall)
        if err != nil {
            // approval may be nil when the request itself fails, so do not read its fields
            return nil, fmt.Errorf("approval request failed: %w", err)
        }
        if approval.Decision == "deny" {
            return nil, fmt.Errorf("tool execution denied: %s", approval.Reason)
        }
    }

    // Execute tool and track usage
    result, err := s.toolExecutor.Execute(toolCall)
    s.research.CallsUsed++
    return result, err
}
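
The Go snippet above counts tool calls but does not show the 5-20 call budget being enforced. A minimal TypeScript sketch of one way to enforce it follows. The `ToolCallBudget` class and its methods are assumptions made for illustration, not HumanLayer code.

```typescript
// Hypothetical budget tracker; class and method names are assumptions.
class ToolCallBudget {
  private used = 0;
  private readonly maxCalls: number;

  constructor(maxCalls: number) {
    this.maxCalls = maxCalls;
  }

  // Records one call and returns true if budget remains; returns false otherwise.
  tryConsume(): boolean {
    if (this.used >= this.maxCalls) {
      return false;
    }
    this.used += 1;
    return true;
  }

  // Number of calls still available in this research phase.
  remaining(): number {
    return this.maxCalls - this.used;
  }
}

const budget = new ToolCallBudget(20); // upper bound of the 5-20 range
let executed = 0;
for (let i = 0; i < 25; i++) {
  if (budget.tryConsume()) {
    executed++;
  }
}
console.log(executed, budget.remaining()); // 20 0
```

Checking the budget before executing the tool, rather than after, means a phase that has exhausted its calls stops cleanly instead of overrunning by one.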

Research Protocol Implementation:

// Research cycle implementation
class ResearchCycle {
  maxToolCalls: number = 20
  
  async execute(knowledgeGap: string): Promise<TechDecision> {
    // 1. Broad exploration
    const broadResults = await this.search(this.extractCoreTerms(knowledgeGap))
    
    // 2. Narrow to specifics
    const specificResults = await this.search(knowledgeGap + " 2025")
    
    // 3. Fetch authoritative sources
    const primaryDocs = await this.fetch(this.findOfficialDocs(broadResults))
    
    // 4. Validate with cross-reference
    const validation = await this.search(knowledgeGap + " comparison")
    
    // 5. Synthesize decision
    return new TechDecision({
      chosenApproach: this.synthesize(primaryDocs),
      rationale: this.generateRationale(validation),
      alternativesConsidered: this.extractAlternatives(broadResults),
      sources: [...primaryDocs, ...validation]
    })
  }
}

Phase 3: PLAN

Critical Bridge Phase: Synthesizes research findings into actionable implementation specification

// Planning phase management
type PlanningPhase struct {
    ResearchFindings  []TechDecision    `json:"research_findings"`
    Architecture      *SystemDesign     `json:"architecture"`
    Implementation    *ImplementationPlan `json:"implementation"`
    Approvals         *ApprovalStrategy `json:"approvals"`
    Validation        *ValidationPlan   `json:"validation"`
}

type ImplementationPlan struct {
    Phases            []ImplementationPhase `json:"phases"`
    Dependencies      []Dependency          `json:"dependencies"`
    ApprovalGates     []ApprovalGate       `json:"approval_gates"`
    RiskMitigation    []RiskMitigation     `json:"risk_mitigation"`
    SuccessCriteria   []SuccessCriterion   `json:"success_criteria"`
}

type ImplementationPhase struct {
    Name              string               `json:"name"`
    Description       string               `json:"description"`
    Artifacts         []ArtifactSpec       `json:"artifacts"`
    Dependencies      []string             `json:"dependencies"`
    EstimatedLines    int                  `json:"estimated_lines"`
    RiskLevel         string               `json:"risk_level"`
    ApprovalRequired  bool                 `json:"approval_required"`
}

Planning Protocol Implementation:

class PlanningPhase {
  async synthesizeResearchIntoImplementationPlan(
    deliberation: DeliberationResult,
    research: ResearchResult[]
  ): Promise<ImplementationPlan> {
    
    // 1. Synthesize technical decisions from research
    const techDecisions = this.synthesizeTechDecisions(research)
    
    // 2. Design system architecture
    const architecture = await this.createSystemDesign({
      requirements: deliberation.requirements,
      constraints: deliberation.constraints,
      techDecisions: techDecisions
    })
    
    // 3. Break down into implementation phases
    const phases = this.createImplementationPhases(architecture)
    
    // 4. Plan approval gates and risk mitigation
    const approvalStrategy = this.planApprovalGates(phases)
    
    // 5. Define success criteria and validation
    const validation = this.createValidationPlan(phases)
    
    return {
      phases: phases,
      dependencies: this.mapDependencies(phases),
      approvalGates: approvalStrategy.gates,
      riskMitigation: this.identifyRisks(phases),
      successCriteria: validation.criteria,
      estimatedEffort: this.calculateEffort(phases),
      timeline: this.createTimeline(phases)
    }
  }
  
  // Create detailed implementation phases with dependencies
  createImplementationPhases(architecture: SystemDesign): ImplementationPhase[] {
    return [
      {
        name: "Foundation",
        description: "Core data models, database schema, basic abstractions",
        artifacts: [
          { name: "database-schema.sql", type: "schema", lines: 50 },
          { name: "core-models.ts", type: "types", lines: 100 },
          { name: "base-api.ts", type: "api", lines: 150 }
        ],
        dependencies: [],
        estimatedLines: 300,
        riskLevel: "low",
        approvalRequired: true
      },
      {
        name: "Business Logic",
        description: "Core application features and business rules",
        artifacts: [
          { name: "user-service.ts", type: "service", lines: 200 },
          { name: "auth-middleware.ts", type: "middleware", lines: 100 },
          { name: "business-logic.ts", type: "logic", lines: 300 }
        ],
        dependencies: ["Foundation"],
        estimatedLines: 600,
        riskLevel: "medium",
        approvalRequired: true
      },
      {
        name: "Integration",
        description: "External APIs, persistence, real-time features",
        artifacts: [
          { name: "websocket-server.ts", type: "integration", lines: 250 },
          { name: "external-apis.ts", type: "integration", lines: 200 },
          { name: "database-integration.ts", type: "persistence", lines: 150 }
        ],
        dependencies: ["Foundation", "Business Logic"],
        estimatedLines: 600,
        riskLevel: "high",
        approvalRequired: true
      },
      {
        name: "UI Components",
        description: "React components, state management, user interfaces",
        artifacts: [
          { name: "app.tsx", type: "component", lines: 100 },
          { name: "TaskManager.tsx", type: "component", lines: 200 },
          { name: "useRealtime.ts", type: "hook", lines: 100 }
        ],
        dependencies: ["Integration"],
        estimatedLines: 400,
        riskLevel: "medium", 
        approvalRequired: true
      },
      {
        name: "Polish",
        description: "Error handling, testing, observability, deployment",
        artifacts: [
          { name: "error-handling.ts", type: "utility", lines: 150 },
          { name: "tests.spec.ts", type: "test", lines: 300 },
          { name: "deployment-config.yaml", type: "config", lines: 50 }
        ],
        dependencies: ["UI Components"],
        estimatedLines: 500,
        riskLevel: "low",
        approvalRequired: false
      }
    ]
  }
}

PLAN Phase Approval Template:

IMPLEMENTATION PLAN APPROVAL REQUEST

## Architecture Overview
[Generated system design diagram and component relationships]

## Implementation Strategy
Phase 1: Foundation (300 lines, low risk)
Phase 2: Business Logic (600 lines, medium risk)  
Phase 3: Integration (600 lines, high risk)
Phase 4: UI Components (400 lines, medium risk)
Phase 5: Polish (500 lines, low risk)

## Approval Gates
- Phase 1→2: Database schema and core models approval
- Phase 2→3: Business logic and security review
- Phase 3→4: Integration testing and performance validation
- Phase 4→5: UI/UX review and user testing

## Risk Mitigation
- High-risk integration phase will include incremental testing
- Database changes will be versioned and reversible
- WebSocket implementation includes fallback mechanisms
- Error handling and logging throughout all phases

## Success Criteria
- All tests pass (unit, integration, E2E)
- Performance meets requirements (< 100ms API response)
- Security review passes (auth, data validation)
- User acceptance criteria met

REQUEST APPROVAL TO PROCEED TO ACTION PHASE
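
The five phases in the plan depend on one another, so the order in the template can be derived rather than hard-coded. The sketch below orders phases by their declared dependencies and sums the line estimates, using the phase names and numbers from the plan above. `PhaseSpec` and `orderPhases` are hypothetical helpers, not part of the documented system.

```typescript
// Hypothetical helper: orders phases so each one comes after its dependencies.
interface PhaseSpec {
  name: string;
  dependencies: string[];
  estimatedLines: number;
}

function orderPhases(phases: PhaseSpec[]): PhaseSpec[] {
  const byName: Record<string, PhaseSpec | undefined> = {};
  for (const p of phases) {
    byName[p.name] = p;
  }
  const state: Record<string, "visiting" | "done" | undefined> = {};
  const ordered: PhaseSpec[] = [];

  // Depth-first visit: emit a phase only after all of its dependencies.
  const visit = (p: PhaseSpec): void => {
    if (state[p.name] === "done") return;
    if (state[p.name] === "visiting") {
      throw new Error("dependency cycle at " + p.name);
    }
    state[p.name] = "visiting";
    for (const depName of p.dependencies) {
      const dep = byName[depName];
      if (dep === undefined) {
        throw new Error("unknown dependency: " + depName);
      }
      visit(dep);
    }
    state[p.name] = "done";
    ordered.push(p);
  };

  for (const p of phases) {
    visit(p);
  }
  return ordered;
}

// Phases from the plan above, deliberately listed out of order.
const planPhases: PhaseSpec[] = [
  { name: "Polish", dependencies: ["UI Components"], estimatedLines: 500 },
  { name: "UI Components", dependencies: ["Integration"], estimatedLines: 400 },
  { name: "Integration", dependencies: ["Foundation", "Business Logic"], estimatedLines: 600 },
  { name: "Business Logic", dependencies: ["Foundation"], estimatedLines: 600 },
  { name: "Foundation", dependencies: [], estimatedLines: 300 },
];

const orderedPhases = orderPhases(planPhases);
const totalLines = orderedPhases.reduce((sum, p) => sum + p.estimatedLines, 0);
console.log(orderedPhases.map((p) => p.name).join(" -> "));
// Foundation -> Business Logic -> Integration -> UI Components -> Polish
console.log(totalLines); // 2400
```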

Phase 4: ACTION (CODE)

Artifact Management System:

// Artifact rules implementation
type ArtifactManager struct {
    sizeThreshold    int    // 20 lines or 1500 chars
    maxUpdates       int    // 4 consecutive updates before rewrite
    updateCount      map[string]int
    artifacts        map[string]*Artifact
}

func (am *ArtifactManager) ShouldCreateArtifact(code string) bool {
    return len(strings.Split(code, "\n")) > 20 || len(code) > 1500
}

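func (am *ArtifactManager) DecideStrategy(artifactID string, changeScope ChangeScope) Strategy {
    // Check the update cap first; otherwise small changes would bypass it
    // and the rewrite-after-4-updates rule would never trigger.
    if am.updateCount[artifactID] >= 4 {
        return StrategyRewrite
    }

    // Small, non-structural changes can be applied as an in-place update
    if changeScope.AffectedLines <= 20 &&
       changeScope.AffectedLocations <= 5 &&
       !changeScope.StructuralChange {
        return StrategyUpdate
    }

    return StrategyRewrite // Default to safety
}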
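
The update-versus-rewrite rule can be read as follows: rewrite once an artifact has had 4 consecutive updates, and otherwise update only for small, non-structural changes. The TypeScript sketch below follows that reading and runs three cases through it. The names are assumptions, and the priority given to the update cap is an interpretation of the comments in `ArtifactManager`.

```typescript
// Hypothetical TypeScript rendering of the update-vs-rewrite rule; names are assumptions.
interface ChangeScope {
  affectedLines: number;
  affectedLocations: number;
  structuralChange: boolean;
}

type Strategy = "update" | "rewrite";

const MAX_CONSECUTIVE_UPDATES = 4;

function decideStrategy(consecutiveUpdates: number, scope: ChangeScope): Strategy {
  // The update cap takes priority: after 4 consecutive updates, rewrite.
  if (consecutiveUpdates >= MAX_CONSECUTIVE_UPDATES) {
    return "rewrite";
  }
  const isSmallChange =
    scope.affectedLines <= 20 &&
    scope.affectedLocations <= 5 &&
    !scope.structuralChange;
  return isSmallChange ? "update" : "rewrite";
}

const smallEdit: ChangeScope = { affectedLines: 8, affectedLocations: 2, structuralChange: false };
const structuralEdit: ChangeScope = { affectedLines: 8, affectedLocations: 2, structuralChange: true };

console.log(decideStrategy(0, smallEdit)); // update
console.log(decideStrategy(4, smallEdit)); // rewrite
console.log(decideStrategy(0, structuralEdit)); // rewrite
```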

State Management Architecture

Session State Manifest

type SessionState struct {
    Phase            AutonomousPhase `json:"phase"` // DELIBERATION, RESEARCH, PLAN, ACTION
    CompletedPhases  []string        `json:"completed_phases"`
    ActiveArtifact   string          `json:"active_artifact"`
    IterationCount   int             `json:"iteration_count"`
    KnownIssues      []Issue         `json:"known_issues"`
    IntegrationStatus map[string]bool `json:"integration_status"`
    NextSteps        []string        `json:"next_planned_steps"`
    
    // PLAN phase specific state
    ImplementationPlan *ImplementationPlan `json:"implementation_plan,omitempty"`
    CurrentPhase       *ImplementationPhase `json:"current_phase,omitempty"`
    ApprovalGatesHit   []string            `json:"approval_gates_hit"`
}

// State tracking in every response
const stateManifest = `
CURRENT STATE:
- Phase: ${session.phase} // DELIBERATION/RESEARCH/PLAN/ACTION
- Implementation Phase: ${session.currentPhase?.name || 'N/A'}
- Active artifact: ${session.activeArtifact}
- Iteration: ${session.iterationCount}
- Integration: ${JSON.stringify(session.integrationStatus)}
- Next: ${session.nextSteps.join(', ')}
- Approval Gates: ${session.approvalGatesHit?.length || 0} completed
`

Error Ritual Implementation

type ErrorRitual struct {
    ErrorHistory []ProcessedError `json:"error_history"`
    Lessons      []Lesson         `json:"lessons_learned"`
}

func (er *ErrorRitual) ProcessError(err error) (*RecoveryPlan, error) {
    // 1. STOP - Do not continue broken iteration
    er.haltCurrentIteration()
    
    // 2. ANALYZE - What caused the error?
    analysis := er.analyzeError(err)
    
    // 3. EXTRACT - What lesson prevents recurrence?
    lesson := er.extractLesson(analysis)
    er.Lessons = append(er.Lessons, lesson)
    
    // 4. CLEAN - Remove ghost context
    er.cleanStaleAssumptions(analysis.StaleAssumptions)
    
    // 5. RETRY - Fresh attempt with distilled knowledge
    return er.createRecoveryPlan(lesson), nil
}

Tool Integration Patterns

Permission-Based Tool Execution

// Tool execution with approval workflow
async function executeWithApproval(toolCall: ToolCall): Promise<ToolResult> {
  if (DANGEROUS_TOOLS.includes(toolCall.name)) {
    const approval = await requestPermission({
      tool_name: toolCall.name,
      input: toolCall.parameters,
      tool_use_id: toolCall.id,
      explanation: `About to execute ${toolCall.name} with parameters: ${JSON.stringify(toolCall.parameters, null, 2)}`
    })
    
    if (approval.decision !== 'approve') {
      throw new Error(`Tool execution denied: ${approval.explanation}`)
    }
  }
  
  return await executeTool(toolCall)
}

Context Engineering Integration

// Context management for long-horizon development
type ContextManager struct {
    tokenBudget      int
    usedTokens       int
    compressionRatio float64
    artifacts        []Artifact
}

func (cm *ContextManager) ManageContext() error {
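    // Trigger once usage exceeds 80% of the token budget.
    // Both sides are converted to float64: Go rejects tokenBudget*0.8 on an int.
    if float64(cm.usedTokens) > float64(cm.tokenBudget)*0.8 {
        // Compress accumulated history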
        if err := cm.compressHistory(); err != nil {
            return fmt.Errorf("context compression failed: %w", err)
        }
        
        // Clear tool call results from deep history
        cm.clearDeepToolResults()
        
        // Preserve critical decisions and implementations
        cm.preserveCriticalContext()
    }
    return nil
}
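
The context manager above acts when token usage passes 80% of the budget, then drops old tool results while preserving decisions. A minimal TypeScript sketch of those two steps follows. `HistoryEntry`, `shouldCompress`, and the TypeScript `clearDeepToolResults` are hypothetical, and the `keepRecent` window is an assumption that does not appear in the source.

```typescript
// Hypothetical sketch; only the 0.8 threshold comes from the snippet above.
interface HistoryEntry {
  kind: "decision" | "tool_result" | "message";
  tokens: number;
}

// True once usage exceeds the given fraction of the budget.
function shouldCompress(usedTokens: number, tokenBudget: number, threshold: number = 0.8): boolean {
  return usedTokens > tokenBudget * threshold;
}

// Drops tool results older than the last `keepRecent` entries;
// decisions and messages are always kept.
function clearDeepToolResults(history: HistoryEntry[], keepRecent: number): HistoryEntry[] {
  const cutoff = Math.max(0, history.length - keepRecent);
  return history.filter((entry, index) => index >= cutoff || entry.kind !== "tool_result");
}

const history: HistoryEntry[] = [
  { kind: "tool_result", tokens: 500 },
  { kind: "decision", tokens: 100 },
  { kind: "tool_result", tokens: 400 },
  { kind: "message", tokens: 50 },
];

const trimmed = clearDeepToolResults(history, 2);
const trimmedTokens = trimmed.reduce((sum, entry) => sum + entry.tokens, 0);

console.log(shouldCompress(9000, 10000)); // true
console.log(shouldCompress(5000, 10000)); // false
console.log(trimmed.length, trimmedTokens); // 3 550
```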

Workflow Coordination

Session Orchestration

type WorkflowOrchestrator struct {
    sessionManager *SessionManager
    approvalManager *ApprovalManager
    stateTracker   *StateTracker
}

func (wo *WorkflowOrchestrator) ExecuteAutonomousWorkflow(req WorkflowRequest) (*WorkflowResult, error) {
    // Phase 1: DELIBERATION
    session, err := wo.sessionManager.StartDeliberationPhase(req)
    if err != nil {
        return nil, fmt.Errorf("deliberation phase failed: %w", err)
    }
    
    deliberationResult := wo.waitForPhaseCompletion(session, PhaseDeliberation)
    
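    // Phase 2: RESEARCH (if knowledge gaps identified)
    // Declared outside the if block so it is still in scope when building the result;
    // the *ResearchResult type name is assumed here.
    var researchResult *ResearchResult
    if len(deliberationResult.KnowledgeGaps) > 0 {
        researchResult, err = wo.executeResearchPhase(session, deliberationResult.KnowledgeGaps)
        if err != nil {
            return nil, fmt.Errorf("research phase failed: %w", err)
        }
        session.UpdateResearchFindings(researchResult)
    }

    // Phase 4: ACTION (the Phase 3 PLAN step is omitted from this snippet)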
    actionResult, err := wo.executeActionPhase(session)
    if err != nil {
        return nil, fmt.Errorf("action phase failed: %w", err)
    }
    
    return &WorkflowResult{
        Deliberation: deliberationResult,
        Research:     researchResult, 
        Action:       actionResult,
        Artifacts:    session.GetArtifacts(),
        State:        session.GetFinalState(),
    }, nil
}

Success Patterns

Quality Gates

# Pre-artifact checklist implementation
quality_gates:
  pre_artifact:
    - size_exceeds_threshold: true
    - type_supported: true
    - dependencies_imported: true
    - error_handling_included: true
    - integration_points_defined: true
    
  post_iteration:
    - state_manifest_updated: true
    - iteration_count_incremented: true
    - known_issues_documented: true
    - next_steps_identified: true
    - checkpoint_saved: true # every 10 iterations
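
The checklist above can be evaluated mechanically: an artifact proceeds only when every gate is true, and a checkpoint is due on every 10th iteration. The sketch below shows one way to do that. `GateChecklist`, `failedGates`, and `checkpointDue` are illustrative names, not part of the documented system.

```typescript
// Hypothetical evaluator for the quality-gate checklist; function names are assumptions.
type GateChecklist = Record<string, boolean>;

// Returns the names of gates that are not satisfied.
function failedGates(checklist: GateChecklist): string[] {
  return Object.keys(checklist).filter((gate) => !checklist[gate]);
}

// Checkpoints are due on every 10th iteration (10, 20, 30, ...).
function checkpointDue(iteration: number): boolean {
  return iteration > 0 && iteration % 10 === 0;
}

const preArtifact: GateChecklist = {
  size_exceeds_threshold: true,
  type_supported: true,
  dependencies_imported: true,
  error_handling_included: false, // simulate a missing requirement
  integration_points_defined: true,
};

console.log(failedGates(preArtifact).join(", ")); // error_handling_included
console.log(checkpointDue(10), checkpointDue(11)); // true false
```

Returning the failing gate names, rather than a single boolean, lets the state manifest record exactly which requirement blocked the artifact.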

Performance Characteristics

// Current Claude Code performance metrics
type PerformanceMetrics struct {
    DeliberationTime   time.Duration // ~2-5 minutes
    ResearchTime       time.Duration // ~5-15 minutes
    PlanningTime       time.Duration // ~3-10 minutes  
    ActionTime         time.Duration // ~10-60 minutes per artifact
    TokensPerIteration int           // ~2000-8000 tokens
    IterationsPerHour  int           // ~6-12 iterations
    SuccessRate        float64       // ~85-95% for well-defined tasks
}

This baseline implementation provides the foundation for autonomous development workflows that can be adapted to other LLMs while maintaining the core RESEARCH → PLAN → CODE methodology and human oversight capabilities.
