ADR-001: Async TaskExecutor Refactoring - CODITECT Core Framework
Document: ADR-001-async-task-executor-refactoring
Version: 1.0.0
Purpose: Make TaskExecutor.execute() async to align with ProjectOrchestrator and enable Phase 1 autonomous agents
Audience: Engineering teams, architects, framework developers
Date Created: 2025-11-23
Status: ACCEPTED
Related ADRs: None (first ADR for coditect-core)
Related Documents:
- PROJECT-PLAN-EXECUTOR-REFACTORING.md
- ASYNC-EXECUTOR-STRATEGIC-PLAN.md
- docs/05-project-planning/PROJECT-PLAN.md (Phase 1 Message Bus)
Table of Contents
- Executive Summary
- Context and Problem Statement
- Decision Drivers
- Considered Options
- Decision Outcome
- Consequences
- Implementation Details
- Validation and Compliance
Executive Summary
Decision: Make TaskExecutor.execute() an async method and convert all 7 execution-related methods to the async/await pattern, eliminating the async/sync boundary that currently blocks true async orchestration.
User Feedback Integration:
"I agree with your analysis. Please explicitly detail making TaskExecutor.execute async to align seamlessly with ProjectOrchestrator.execute_task's async nature."
Why This Matters: This change is foundational for Phase 1 Message Bus autonomous agents. Without end-to-end async flow, agents cannot coordinate asynchronously, parallel task execution is impossible, and the system remains fundamentally human-in-the-loop.
Key Principle: Eliminate the asyncio.run() wrapper that creates new event loops for each task, enabling true async orchestration from ProjectOrchestrator → TaskExecutor → LlmFactory → BaseLlm providers.
Context and Problem Statement
The Challenge
The TaskExecutor refactoring project (PROJECT-PLAN-EXECUTOR-REFACTORING.md) is moving from subprocess-based execute_*.py scripts to a direct LLM abstraction layer. However, the original plan left TaskExecutor.execute() as a sync method with an asyncio.run() wrapper:
Current Architecture (Problematic)
# ProjectOrchestrator.execute_task() - SYNC
def execute_task(self, task_id: str, agent: Optional[str] = None) -> ExecutionResult:
    task = self.get_task(task_id)
    result = self.executor.execute(task, agent=agent)  # SYNC call
    return result

# TaskExecutor.execute() - SYNC
def execute(self, task: AgentTask, agent: Optional[str] = None) -> ExecutionResult:
    result = self._execute_via_llm(task, agent_config, result)  # SYNC
    return result

# TaskExecutor._execute_via_llm() - SYNC with asyncio.run() wrapper
def _execute_via_llm(self, task, agent_config, result):
    # A sync method cannot use `await`, so the async LLM call is wrapped in
    # asyncio.run(). That starts a new event loop on every call and raises
    # RuntimeError if an event loop is already running in this thread.
    output = asyncio.run(provider.generate_content_async(messages))  # SYNC → ASYNC
    return result
Problems with This Architecture
- Event Loop Overhead: asyncio.run() creates a new event loop for each task (~10-20ms overhead)
- Blocks Async Orchestration: Cannot use await in ProjectOrchestrator.execute_task(), forcing sequential execution
- Prevents Parallel Execution: 3 tasks take 6 seconds (sequential) instead of 2 seconds (parallel)
- Phase 1 Blocker: Message Bus autonomous agents require async coordination
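The boundary problem can be reproduced with the standard library alone. The sketch below is illustrative only: `fake_generate` is a hypothetical stand-in for `provider.generate_content_async()`, and `sync_wrapper` mimics a sync method built on `asyncio.run()`. It works when called from plain sync code, but raises `RuntimeError` when called from inside a running event loop, which is the position an async orchestrator would be in.

```python
import asyncio


async def fake_generate(prompt: str) -> str:
    """Hypothetical stand-in for provider.generate_content_async()."""
    await asyncio.sleep(0)  # yield to the event loop, as a network call would
    return f"echo: {prompt}"


def sync_wrapper(prompt: str) -> str:
    """Sync facade that starts a fresh event loop on every call."""
    coro = fake_generate(prompt)
    try:
        return asyncio.run(coro)
    except RuntimeError:
        coro.close()  # avoid a "coroutine was never awaited" warning
        raise


async def async_caller() -> str:
    """Simulates an async orchestrator calling the sync facade."""
    try:
        return sync_wrapper("hello")
    except RuntimeError as exc:
        return f"failed: {exc}"


from_sync = sync_wrapper("hello")         # no loop running yet: succeeds
from_async = asyncio.run(async_caller())  # loop already running: fails
print(from_sync)
print(from_async)
```

The second call fails because `asyncio.run()` refuses to start a new loop while another one is running in the same thread, so a sync executor cannot simply be called from async orchestration code.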
Business Context
Phase 1 Message Bus Requirements:
- Async agent-to-agent task delegation
- Concurrent task queue processing
- Non-blocking orchestration (orchestrator remains responsive)
- Event-driven agent coordination
Performance Goals:
- Single task: 30-50% improvement (subprocess → direct LLM) ✅ Already achieved
- Parallel tasks: 3x improvement (sequential → concurrent) ⏸️ Blocked by sync executor
Strategic Alignment:
- User explicitly requested async alignment with ProjectOrchestrator
- Foundation for $100K+ Phase 1 autonomous agent implementation
- Enables future async workflows (batch processing, pipeline orchestration)
Risk of Not Changing:
- Phase 1 Message Bus becomes architecturally impossible
- Performance gains limited to single-task scenarios
- Technical debt accumulates (async/sync boundaries multiply)
Decision Drivers
Mandatory Requirements (Must-Have)
- User Alignment - User explicitly requested async executor to align with ProjectOrchestrator
- Phase 1 Foundation - Message Bus autonomous agents require end-to-end async
- Performance - Enable parallel task execution (3x speedup for concurrent tasks)
- Architecture Purity - Eliminate async/sync boundaries and event loop overhead
- Future-Proofing - Enable async workflows without future refactoring
Important Goals (Should-Have)
- Backward Compatibility - Maintain dual-mode execution (direct LLM + script fallback)
- Test Coverage - 90%+ coverage with pytest-asyncio
- Budget Efficiency - Minimize cost increase ($750 for 6 hours additional work)
- Timeline Impact - Keep project within 4-5 weeks
- Developer Experience - Clear async/await patterns throughout
Nice-to-Have
- Real-time Responsiveness - Orchestrator remains responsive during long-running tasks
- Observability - Easy to trace async execution flows
- Error Handling - Async exceptions properly propagated
Considered Options
Option 1: Keep Sync Executor with asyncio.run() Wrapper (REJECTED ❌)
Architecture:
def execute(self, task):
    # Sync method
    result = asyncio.run(provider.generate_content_async(messages))
    return result
Pros:
- ✅ No signature changes (backward compatible)
- ✅ Minimal code changes (6-8 hours work)
- ✅ Simpler for sync callers
Cons:
- ❌ Creates new event loop per task (~10-20ms overhead)
- ❌ Blocks async orchestration (cannot use await)
- ❌ Prevents parallel task execution (3 tasks = 6s sequential)
- ❌ Makes Phase 1 Message Bus architecturally impossible
- ❌ User explicitly disagreed with this approach
Decision: REJECTED - User feedback and Phase 1 requirements make this non-viable.
Option 2: Hybrid Approach (Dual Sync/Async Methods) (REJECTED ❌)
Architecture:
def execute(self, task):
    # Sync method for backward compatibility
    return asyncio.run(self.execute_async(task))

async def execute_async(self, task):
    # Async method for new callers
    result = await self._execute_via_llm(task, agent_config, result)
    return result
Pros:
- ✅ Backward compatible (sync method still exists)
- ✅ Enables async callers to use execute_async()
- ✅ Gradual migration path
Cons:
- ❌ API confusion (two methods doing same thing)
- ❌ Sync method still creates event loop overhead
- ❌ Doubles maintenance burden (two code paths)
- ❌ Doesn't fully solve async/sync boundary problem
Decision: REJECTED - Adds complexity without solving core problem.
Option 3: Make TaskExecutor.execute() Async (SELECTED ✅)
Architecture:
async def execute(self, task):
    # Fully async method
    result = await self._execute_via_llm(task, agent_config, result)
    return result

async def _execute_via_llm(self, task, agent_config, result):
    # No asyncio.run() wrapper needed
    output = await provider.generate_content_async(messages)
    return result

# All callers updated
async def execute_task(self, task_id):
    result = await self.executor.execute(task)  # Async call
    return result
Pros:
- ✅ Eliminates async/sync boundary (no event loop overhead)
- ✅ Enables parallel execution (3 tasks = 2s concurrent vs 6s sequential)
- ✅ Foundation for Phase 1 (Message Bus can coordinate async agents)
- ✅ User-approved (explicit alignment with ProjectOrchestrator)
- ✅ Architecture purity (async all the way through)
- ✅ Future-proof (enables all async workflows)
Cons:
- ❌ Breaking change (all callers must use await executor.execute())
- ❌ Test migration required (pytest-asyncio for all tests)
- ❌ +6 hours work (+$750 budget increase)
- ❌ +3 days timeline (4.5 weeks vs 4 weeks)
Decision: SELECTED ✅ - Pros massively outweigh cons. Strategic alignment with user feedback and Phase 1 requirements.
Decision Outcome
Chosen Option
Option 3: Make TaskExecutor.execute() Async ✅
Rationale:
- User Alignment: User explicitly requested async executor to align with ProjectOrchestrator
- Phase 1 Critical: Message Bus cannot be built without async executor
- Performance: 3x speedup for parallel tasks (6s → 2s for 3 concurrent tasks)
- ROI: $750 investment enables $100K+ Phase 1 implementation (10x+ ROI)
- Architecture: Eliminates technical debt before it accumulates
Method Signature Changes
7 Methods Converted to Async
- TaskExecutor.execute() - Main entry point
- TaskExecutor._execute_via_llm() - Direct LLM execution
- TaskExecutor._execute_api() - API execution mode
- ProjectOrchestrator.execute_task() - Orchestrator integration
- ProjectOrchestrator.execute_tasks_parallel() - Batch execution (new)
- ProjectOrchestrator._execute_task_internal() - Internal helper
- Test helper methods - All async with @pytest.mark.asyncio
Before (Sync)
def execute(
    self,
    task: AgentTask,
    agent: Optional[str] = None,
    mode: Optional[str] = None
) -> ExecutionResult:
    """Execute a single task using specified agent."""
    ...
After (Async)
async def execute(
    self,
    task: AgentTask,
    agent: Optional[str] = None,
    mode: Optional[str] = None
) -> ExecutionResult:
    """
    Execute a single task using specified agent.

    Note:
        This method is async to enable concurrent task execution
        and seamless integration with async LLM providers.
    """
    ...
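The new batch method, ProjectOrchestrator.execute_tasks_parallel(), is listed above but not shown. One way it could be built on the async execute() is sketched below. This is a minimal illustration under stated assumptions, not the project's implementation: `StubExecutor`, `StubOrchestrator`, the simplified `ExecutionResult` dataclass, and the choice of `return_exceptions=True` are all hypothetical.

```python
import asyncio
from dataclasses import dataclass
from typing import List


@dataclass
class ExecutionResult:
    """Simplified stand-in for the real ExecutionResult (assumption)."""
    task_id: str
    success: bool
    output: str = ""


class StubExecutor:
    """Hypothetical executor whose execute() simulates an async LLM call."""

    async def execute(self, task_id: str) -> ExecutionResult:
        await asyncio.sleep(0.01)  # simulated provider latency
        if task_id == "bad":
            raise ValueError("simulated provider failure")
        return ExecutionResult(task_id, True, f"done {task_id}")


class StubOrchestrator:
    def __init__(self, executor: StubExecutor) -> None:
        self.executor = executor

    async def execute_task(self, task_id: str) -> ExecutionResult:
        return await self.executor.execute(task_id)

    async def execute_tasks_parallel(self, task_ids: List[str]) -> List[ExecutionResult]:
        # return_exceptions=True keeps one failure from discarding the other results
        outcomes = await asyncio.gather(
            *(self.execute_task(t) for t in task_ids), return_exceptions=True
        )
        results = []
        for task_id, outcome in zip(task_ids, outcomes):
            if isinstance(outcome, Exception):
                results.append(ExecutionResult(task_id, False, str(outcome)))
            else:
                results.append(outcome)
        return results


# asyncio.run() appears only here, at the program entry point
results = asyncio.run(
    StubOrchestrator(StubExecutor()).execute_tasks_parallel(["a", "bad", "c"])
)
print([(r.task_id, r.success) for r in results])
```

`asyncio.gather()` returns results in the order the awaitables were passed, so each outcome can be paired with its task id regardless of completion order.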
Consequences
Positive Consequences
- End-to-End Async Flow ✅
  - No asyncio.run() wrappers
  - Single event loop throughout execution
  - True async orchestration enabled
- Parallel Task Execution ✅
      # 3 tasks execute concurrently
      tasks = [task1, task2, task3]
      results = await asyncio.gather(*[executor.execute(t) for t in tasks])
      # Time: 2s (concurrent) vs 6s (sequential) = 3x speedup
- Phase 1 Foundation ✅
  - Message Bus can coordinate async agents
  - Agent-to-agent communication works
  - Non-blocking orchestration enabled
- Performance Gains ✅
  - Single task: 30-50% improvement (already achieved)
  - Parallel tasks: 3x improvement (new capability)
  - No event loop overhead (~10-20ms saved per task)
- Architecture Purity ✅
  - Clean async/await pattern throughout
  - No sync/async boundaries
  - Future async workflows easy to add
Negative Consequences
- Breaking Change ⚠️
  - All callers must migrate to await executor.execute()
  - Synchronous callers must wrap in asyncio.run() externally
  - Mitigation: Comprehensive migration guide (docs/EXECUTOR-MIGRATION-GUIDE.md)
- Test Migration ⚠️
  - All tests need @pytest.mark.asyncio
  - Test fixtures need async support
  - Mitigation: Automated test migration script + 6 hours allocated
- Budget Increase ⚠️
  - Original: $10,000 (80 hours)
  - Updated: $10,750 (86 hours)
  - Delta: +$750 (+7.5%)
  - Mitigation: ROI is 10x+ (enables $100K+ Phase 1)
- Timeline Impact ⚠️
  - Original: 4 weeks
  - Updated: 4.5 weeks
  - Delta: +3 days
  - Mitigation: Concentrated in Phase 1B, no impact to Weeks 2-4
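For the breaking change, a synchronous caller such as a CLI script might migrate as sketched below. The names `StubExecutor`, `run_all`, and `main` are hypothetical and only illustrate the pattern: the async body awaits the executor, and `asyncio.run()` is called once at the outermost sync entry point rather than once per task.

```python
import asyncio


class StubExecutor:
    """Hypothetical stand-in for the async TaskExecutor."""

    async def execute(self, task: str) -> str:
        await asyncio.sleep(0)  # placeholder for the awaited LLM call
        return f"result for {task}"


async def run_all(executor: StubExecutor, tasks: list) -> list:
    """Async body: every executor call is awaited on the same event loop."""
    return [await executor.execute(t) for t in tasks]


def main() -> list:
    """Sync entry point: a single asyncio.run() for the whole program."""
    return asyncio.run(run_all(StubExecutor(), ["task-1", "task-2"]))


outputs = main()
print(outputs)
```

Keeping `asyncio.run()` at the entry point means library code stays free of event loop management, which matches the goal of having no `asyncio.run()` wrappers inside the executor.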
Risk Mitigation
| Risk | Mitigation Strategy |
|---|---|
| Async/await bugs | Comprehensive test suite + mypy type checking |
| pytest-asyncio issues | Use latest version (0.23+), test early in Phase 1B |
| Event loop conflicts | Use asyncio_mode=auto in pytest.ini |
| Parallel execution race conditions | Add asyncio.Lock() for shared state, thorough testing |
| Caller migration errors | Migration guide + example code + rollback plan |
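The race-condition mitigation in the table can be illustrated with a small sketch. `ExecutionStats` is a hypothetical piece of shared state, not a class from the codebase. Its update contains a suspension point between the read and the write, so concurrent tasks could overwrite each other's increments without the lock. With `asyncio.Lock()`, each read-modify-write completes before the next begins.

```python
import asyncio


class ExecutionStats:
    """Hypothetical shared state updated by concurrently running tasks."""

    def __init__(self) -> None:
        self.completed = 0
        self._lock = asyncio.Lock()

    async def record_completion(self) -> None:
        async with self._lock:
            current = self.completed
            await asyncio.sleep(0)  # suspension point inside the read-modify-write
            self.completed = current + 1


async def run_tasks(n: int) -> int:
    stats = ExecutionStats()  # created inside the running loop
    await asyncio.gather(*(stats.record_completion() for _ in range(n)))
    return stats.completed


total = asyncio.run(run_tasks(50))
print(total)
```

A lock is only needed where an `await` sits between reading and writing shared state. Code with no suspension point in between runs without interleaving on a single event loop.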
Rollback Plan: If async refactoring causes critical issues:
- Revert TaskExecutor.execute() to sync
- Re-add asyncio.run() wrapper in _execute_via_llm()
- Keep script execution as fallback
- Defer async refactoring to Phase 2
Implementation Details
Phase 1B Updates (22 hours, up from 16 hours)
Original Phase 1B Tasks
- Add _execute_via_llm() method - 8 hours
- Add async/sync compatibility layer - 4 hours
- Write unit tests - 4 hours
Updated Phase 1B Tasks
- Make _execute_via_llm() fully async - 2 hours (remove asyncio.run())
- Make TaskExecutor.execute() async - 4 hours (signature change + caller updates)
- Make TaskExecutor._execute_api() async - 2 hours (await calls)
- Make ProjectOrchestrator.execute_task() async - 2 hours (signature change)
- Update internal method calls with await - 4 hours (orchestrator methods)
- Write async unit tests with pytest-asyncio - 6 hours (test migration)
- Integration tests for async flow - 2 hours (end-to-end async validation)
Testing Strategy
Add to requirements-dev.txt
pytest-asyncio==0.23.0
pytest.ini Configuration
[pytest]
asyncio_mode = auto
Test Migration Example
Before (Sync)
def test_execute_task():
    result = executor.execute(task)
    assert result.status == ExecutionStatus.SUCCESS
After (Async)
@pytest.mark.asyncio  # optional when asyncio_mode = auto is set in pytest.ini
async def test_execute_task():
    result = await executor.execute(task)
    assert result.status == ExecutionStatus.SUCCESS
Parallel Execution Test
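import asyncio
import time

import pytest

@pytest.mark.asyncio
async def test_parallel_execution():
    # `executor`, AgentTask, and ExecutionStatus come from the project under test
    tasks = [AgentTask(title=f"Task {i}") for i in range(3)]
    start = time.perf_counter()  # monotonic clock, suited to measuring durations
    results = await asyncio.gather(*(executor.execute(task) for task in tasks))
    elapsed = time.perf_counter() - start
    assert all(r.status == ExecutionStatus.SUCCESS for r in results)
    assert elapsed < 3.0  # Verify parallel execution (3 tasks in ~2s)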
Performance Benchmarks
Baseline (Current Sync + asyncio.run())
Task 1: |---LLM API (2s)---|
Task 2:                    |---LLM API (2s)---|
Task 3:                                       |---LLM API (2s)---|
Total: 6s (sequential)
Target (Fully Async)
Task 1: |---LLM API (2s)---|
Task 2: |---LLM API (2s)---|
Task 3: |---LLM API (2s)---|
Total: 2s (concurrent) = 3x speedup
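The two timelines can be approximated with a scaled-down simulation. This is a sketch, not the project's benchmark: `fake_llm_call` is a hypothetical stand-in that sleeps for `LATENCY` seconds in place of the ~2s LLM call, and the measured times include event loop startup.

```python
import asyncio
import time

LATENCY = 0.05  # seconds; scaled-down stand-in for the ~2s LLM call


async def fake_llm_call(name: str) -> str:
    await asyncio.sleep(LATENCY)
    return name


async def run_sequential(names):
    # Each call finishes before the next one starts
    return [await fake_llm_call(n) for n in names]


async def run_concurrent(names):
    # All calls are in flight at the same time
    return list(await asyncio.gather(*(fake_llm_call(n) for n in names)))


def timed(coro):
    start = time.perf_counter()
    result = asyncio.run(coro)
    return result, time.perf_counter() - start


names = ["task-1", "task-2", "task-3"]
seq_result, seq_time = timed(run_sequential(names))
con_result, con_time = timed(run_concurrent(names))
print(f"sequential: {seq_time:.3f}s, concurrent: {con_time:.3f}s")
```

With three I/O-bound calls of equal latency, the sequential run takes roughly three times the latency and the concurrent run roughly one, which is the shape of the 6s versus 2s target above. Real speedups depend on provider rate limits and response-time variance.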
Success Criteria
- ✅ Single task: 30%+ improvement (already validated)
- ✅ Parallel tasks (3): <3s total (3x vs sequential) ← NEW
- ✅ No asyncio.run() wrappers in production code ← NEW
- ✅ pytest-asyncio test suite operational ← NEW
Budget & Timeline Impact
Budget
- Original: $10,000 (80 hours)
- Updated: $10,750 (86 hours)
- Delta: +$750 (+7.5%)
- Justification: Enables $100K+ Phase 1 implementation (10x+ ROI)
Timeline
- Original: 4 weeks
- Updated: 4.5 weeks
- Delta: +3 days (concentrated in Phase 1B)
Week 1 (Updated)
- Days 1-2: Phase 1A (Foundation) - 16 hours
- Days 3-5.5: Phase 1B (Async Executor) - 22 hours (+6 hours)
Weeks 2-4: No changes (providers already async-compatible)
Validation and Compliance
Validation Checklist
Technical Validation
- All 7 methods converted to async
- All callers updated with await
- pytest-asyncio installed and configured
- All tests migrated to async
- Parallel execution test passes (<3s for 3 tasks)
- No asyncio.run() in production code
- Type hints updated (async def signatures)
Performance Validation
- Single task: ≥30% improvement (vs subprocess)
- Parallel tasks: ≥3x improvement (vs sequential)
- Event loop overhead eliminated
- Benchmarks documented in test-results/
Quality Validation
- Test coverage ≥90%
- Integration tests pass (all providers)
- No regression in backward compatibility (dual-mode still works)
- Documentation updated (migration guide, API docs)
Phase 1 Validation
- ProjectOrchestrator.execute_task() is async
- Message Bus can call executor asynchronously
- Agent-to-agent coordination works
- Foundation report created (docs/PHASE-1-FOUNDATION-VALIDATION.md)
Compliance Requirements
Backward Compatibility
- Dual-mode execution maintained (direct LLM + script fallback)
- Feature flag use_direct_llm still controls execution path
- Script execution still works (deprecated but functional)
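Dual-mode execution can stay intact under an async signature. The sketch below shows one possible shape, under assumptions: `run_via_llm`, `run_via_script`, and the inline `-c` script are hypothetical placeholders, and only the `use_direct_llm` flag name comes from this document. The script fallback uses `asyncio.create_subprocess_exec()` so that waiting on the child process does not block the event loop.

```python
import asyncio
import sys


async def run_via_llm(prompt: str) -> str:
    """Hypothetical direct-LLM path; a real one would await the provider."""
    await asyncio.sleep(0)
    return f"llm:{prompt}"


async def run_via_script(prompt: str) -> str:
    """Script fallback, run as a subprocess without blocking the event loop."""
    proc = await asyncio.create_subprocess_exec(
        sys.executable, "-c", "import sys; print('script:' + sys.argv[1])", prompt,
        stdout=asyncio.subprocess.PIPE,
    )
    stdout, _ = await proc.communicate()
    return stdout.decode().strip()


async def execute(prompt: str, use_direct_llm: bool = True) -> str:
    """Both modes share one async signature; the flag selects the path."""
    if use_direct_llm:
        return await run_via_llm(prompt)
    return await run_via_script(prompt)


async def demo():
    # Both paths can run concurrently on the same event loop
    return await asyncio.gather(
        execute("hello", use_direct_llm=True),
        execute("hello", use_direct_llm=False),
    )


direct, fallback = asyncio.run(demo())
print(direct, fallback)
```

Because both paths are awaitable, callers do not need to know which mode is active, and the deprecated script path can be removed later without another signature change.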
Migration Support
- Migration guide published: docs/EXECUTOR-MIGRATION-GUIDE.md
- Example code for async callers
- Rollback procedure documented
Documentation
- ADR created (this document)
- ASYNC-EXECUTOR-STRATEGIC-PLAN.md comprehensive analysis
- PROJECT-PLAN-EXECUTOR-REFACTORING.md updated with async section
- API documentation reflects async signatures
Links
Related Documents
Project Planning
- PROJECT-PLAN-EXECUTOR-REFACTORING.md - Complete refactoring plan
- ASYNC-EXECUTOR-STRATEGIC-PLAN.md - 62KB comprehensive async analysis
- ASYNC-EXECUTOR-INTEGRATION-SECTION.md - Integration instructions
Phase 1 Context
- docs/05-project-planning/PROJECT-PLAN.md - Overall CODITECT roadmap
- Phase 1 Message Bus requirements (autonomous agents)
Code Locations
- orchestration/executor.py - TaskExecutor implementation
- orchestration/orchestrator.py - ProjectOrchestrator implementation
- llm_abstractions/ - LLM provider implementations
Testing
- tests/test_executor_dual_mode.py - Async executor tests
- tests/integration/test_executor_anthropic.py - Integration tests
- benchmarks/executor_performance.py - Performance benchmarks
External References
Best Practices
- Python asyncio documentation
- pytest-asyncio documentation
- FastAPI async patterns - Similar async architecture
Multi-Agent Orchestration
- docs/MULTI-AGENT-ARCHITECTURE-BEST-PRACTICES.md - Research on async patterns
Status History
| Date | Status | Author | Notes |
|---|---|---|---|
| 2025-11-23 | PROPOSED | Claude Analysis | Initial draft based on user feedback |
| 2025-11-23 | ACCEPTED | Hal Casteel | User approved async executor approach |
Approval
Decision Maker: Hal Casteel, CEO/CTO, AZ1.AI INC.
Decision Status: ✅ ACCEPTED
Date Approved: 2025-11-23
Conditions for Approval
- ✅ User explicitly requested async alignment
- ✅ Budget increase approved ($750 for 6 hours)
- ✅ Timeline impact acceptable (+3 days)
- ✅ Phase 1 foundation critical path
- ✅ ROI validated (10x+ return)
Implementation Authorization
- Proceed with Phase 1B async refactoring (22 hours)
- Budget: $10,750 (up from $10,000)
- Timeline: 4.5 weeks (up from 4 weeks)
- Next checkpoint: Phase 1B completion (end of Week 1)
Document Status: ✅ APPROVED
Last Updated: 2025-11-23
Next Review: After Phase 1B completion
Owner: Hal Casteel, CEO/CTO, AZ1.AI INC.