ETL Pipeline Creation
Design and implement an Extract-Transform-Load (ETL) pipeline for batch data processing, with error handling, incremental loading, idempotency, and monitoring.
Complexity: Complex | Duration: 30m+ | Category: DevOps
Tags: data-engineering etl pipeline batch-processing
Steps
Step 1: Source analysis
Agent: data-engineer - Identify data sources (databases, APIs, files)
Step 2: Extract logic
Agent: data-engineer - Implement extraction with pagination and rate limiting
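A minimal sketch of paginated, rate-limited extraction, assuming an offset/limit style source. The `fetch_page` callable, the `page_size` default, and the in-memory `fake_fetch` source are hypothetical stand-ins for a real database query or API client.

```python
import time
from typing import Any, Callable, Dict, Iterator, List

Record = Dict[str, Any]


def extract_all(
    fetch_page: Callable[[int, int], List[Record]],
    page_size: int = 100,
    min_interval_s: float = 0.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[Record]:
    """Yield every record from a paginated source.

    fetch_page(offset, limit) is a hypothetical callable that returns at most
    `limit` records starting at `offset`. Calls are spaced at least
    `min_interval_s` seconds apart as a simple client-side rate limit.
    """
    offset = 0
    last_call = None
    while True:
        if last_call is not None:
            wait = min_interval_s - (time.monotonic() - last_call)
            if wait > 0:
                sleep(wait)
        last_call = time.monotonic()

        page = fetch_page(offset, page_size)
        yield from page

        # A short (or empty) page means the source is exhausted.
        if len(page) < page_size:
            return
        offset += len(page)


# Hypothetical in-memory source used only for illustration.
_SOURCE = [{"id": i} for i in range(7)]


def fake_fetch(offset: int, limit: int) -> List[Record]:
    return _SOURCE[offset:offset + limit]
```

Calling `list(extract_all(fake_fetch, page_size=3))` walks the fake source in three pages. A real source may paginate by cursor or token instead of offset, in which case only the loop state changes.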
Step 3: Transform logic
Agent: data-engineer - Data cleaning, type conversion, business rules
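One way to structure a transform step is a pure function per row that cleans, converts types, and then applies business rules, raising on bad input. The field names (`email`, `amount`, `order_date`) and the non-negative-amount rule below are assumptions made for illustration only.

```python
from datetime import date
from decimal import Decimal, InvalidOperation
from typing import Any, Dict


def transform_row(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Clean, convert, and validate one raw record.

    Raises ValueError so the caller can route bad rows to error handling.
    """
    # Cleaning: trim whitespace and normalize case.
    email = str(raw.get("email") or "").strip().lower()
    if "@" not in email:
        raise ValueError(f"invalid email: {email!r}")

    # Type conversion: amounts as Decimal, dates as date objects.
    try:
        amount = Decimal(str(raw["amount"]).strip())
    except (KeyError, InvalidOperation) as exc:
        raise ValueError("missing or invalid amount") from exc

    try:
        order_date = date.fromisoformat(str(raw["order_date"]).strip())
    except KeyError as exc:
        raise ValueError("missing order_date") from exc

    # Business rule (hypothetical): amounts must be finite and non-negative.
    if not amount.is_finite() or amount < 0:
        raise ValueError(f"amount out of range: {amount}")

    return {"email": email, "amount": amount, "order_date": order_date}
```

Keeping the function free of I/O makes it straightforward to unit test and to rerun on the same input.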
Step 4: Load strategy
Agent: backend-architect - Choose full/incremental, upsert/replace
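As an illustration of the incremental-plus-upsert option, the sketch below loads only rows newer than a stored watermark and upserts them by primary key. It uses the standard-library `sqlite3` module and assumes SQLite 3.24 or later for `ON CONFLICT ... DO UPDATE`; the `customers` table, its columns, and the ISO-8601 string timestamps are hypothetical.

```python
import sqlite3
from typing import Any, Dict, List


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory target table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, updated_at TEXT)"
    )
    return conn


def load_incremental(
    conn: sqlite3.Connection, source_rows: List[Dict[str, Any]], watermark: str
) -> str:
    """Upsert rows changed after `watermark` and return the new watermark.

    Timestamps are ISO-8601 strings, so plain string comparison orders them.
    """
    new_rows = [r for r in source_rows if r["updated_at"] > watermark]
    with conn:  # one transaction: commit on success, roll back on error
        conn.executemany(
            "INSERT INTO customers (id, name, updated_at) "
            "VALUES (:id, :name, :updated_at) "
            "ON CONFLICT(id) DO UPDATE SET "
            "name = excluded.name, updated_at = excluded.updated_at",
            new_rows,
        )
    return max((r["updated_at"] for r in new_rows), default=watermark)
```

A full replace is simpler (truncate, then insert) but rewrites every row on each run; the watermark approach trades that simplicity for less work per run and depends on the source exposing a reliable change timestamp.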
Step 5: Error handling
Agent: data-engineer - Implement retry logic, dead letter queue
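A minimal sketch of retry with exponential backoff plus a dead letter queue, here modeled as an in-memory list. The `handler` callable, the attempt limit, and the delay values are assumptions; a production pipeline would usually persist dead letters to a table, file, or message queue and retry only on errors known to be transient.

```python
import time
from typing import Any, Callable, Dict, Iterable, List, Tuple


def process_with_retry(
    records: Iterable[Any],
    handler: Callable[[Any], Any],
    max_attempts: int = 3,
    base_delay_s: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[List[Any], List[Dict[str, Any]]]:
    """Run `handler` on each record, retrying failures with backoff.

    Records that still fail after `max_attempts` go to the dead letter
    list together with the last error, instead of aborting the batch.
    """
    succeeded: List[Any] = []
    dead_letters: List[Dict[str, Any]] = []
    for record in records:
        for attempt in range(1, max_attempts + 1):
            try:
                succeeded.append(handler(record))
                break
            except Exception as exc:  # sketch only: narrow this in real code
                if attempt == max_attempts:
                    dead_letters.append(
                        {"record": record, "error": repr(exc), "attempts": attempt}
                    )
                else:
                    # Delays grow as base, 2*base, 4*base, ...
                    sleep(base_delay_s * 2 ** (attempt - 1))
    return succeeded, dead_letters
```

Passing `sleep` as a parameter keeps tests fast, since a no-op can be injected in place of `time.sleep`.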
Step 6: Idempotency
Agent: data-engineer - Ensure the pipeline can be rerun without creating duplicates
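One common way to make an append-style load rerunnable is a batch ledger: record each batch identifier in the same transaction as its rows, and skip any batch already recorded. The sketch below uses the standard-library `sqlite3` module; the `events` and `etl_batches` tables and the `batch_id` format are hypothetical.

```python
import sqlite3
from typing import List


def make_ledger_db() -> sqlite3.Connection:
    """Create an in-memory target table and batch ledger (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (payload TEXT)")
    conn.execute("CREATE TABLE etl_batches (batch_id TEXT PRIMARY KEY)")
    return conn


def load_batch_once(conn: sqlite3.Connection, batch_id: str, rows: List[str]) -> bool:
    """Insert `rows` unless `batch_id` was already loaded.

    Returns True if the batch was loaded, False if it was skipped.
    The data rows and the ledger entry commit together, so a crash
    midway leaves neither behind and the batch can be retried safely.
    """
    with conn:
        seen = conn.execute(
            "SELECT 1 FROM etl_batches WHERE batch_id = ?", (batch_id,)
        ).fetchone()
        if seen:
            return False
        conn.executemany("INSERT INTO events (payload) VALUES (?)", [(r,) for r in rows])
        conn.execute("INSERT INTO etl_batches (batch_id) VALUES (?)", (batch_id,))
    return True
```

Where rows have a natural key, an upsert on that key achieves the same effect without a ledger; the ledger is mainly useful for append-only targets.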
Step 7: Testing
Agent: testing-specialist - Unit tests for each stage, integration tests
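To illustrate the split between stage-level and end-to-end tests, here is a small sketch using the standard-library `unittest` module. The `clean_name` stage and `run_pipeline` function are hypothetical placeholders for real pipeline code.

```python
import unittest
from typing import List


def clean_name(raw: str) -> str:
    """Hypothetical transform stage: collapse whitespace and title-case."""
    return " ".join(raw.split()).title()


def run_pipeline(rows: List[str]) -> List[str]:
    """Hypothetical pipeline: drop blank rows, then clean the rest."""
    return [clean_name(r) for r in rows if r.strip()]


class CleanNameTest(unittest.TestCase):
    """Unit test: one stage in isolation."""

    def test_collapses_whitespace_and_title_cases(self):
        self.assertEqual(clean_name("  ada   lovelace "), "Ada Lovelace")


class PipelineIntegrationTest(unittest.TestCase):
    """Integration test: stages wired together on a small fixture."""

    def test_blank_rows_are_skipped(self):
        rows = ["ada lovelace", "   ", "alan turing"]
        self.assertEqual(run_pipeline(rows), ["Ada Lovelace", "Alan Turing"])
```

Saved as a module, these tests can be run with `python -m unittest`. Integration tests against a real database or API would typically use a disposable test instance rather than in-memory fixtures.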
Step 8: Monitoring
Agent: data-engineer - Log progress, errors, row counts, duration
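A minimal sketch of per-stage monitoring using the standard-library `logging` module: a context manager that logs start, failure, row counts, and duration for each stage. The `StageStats` fields, the `etl` logger name, and the key=value log format are assumptions, not a required convention.

```python
import logging
import time
from contextlib import contextmanager
from dataclasses import dataclass
from typing import Iterator

logger = logging.getLogger("etl")  # hypothetical logger name


@dataclass
class StageStats:
    name: str
    rows_ok: int = 0
    rows_failed: int = 0
    duration_s: float = 0.0


@contextmanager
def monitored_stage(name: str) -> Iterator[StageStats]:
    """Log the start, outcome, row counts, and duration of one stage."""
    stats = StageStats(name)
    start = time.perf_counter()
    logger.info("stage=%s status=started", name)
    try:
        yield stats
    except Exception:
        # Log the traceback, then let the caller decide how to react.
        logger.exception("stage=%s status=failed", name)
        raise
    finally:
        stats.duration_s = time.perf_counter() - start
        logger.info(
            "stage=%s rows_ok=%d rows_failed=%d duration_s=%.3f",
            name, stats.rows_ok, stats.rows_failed, stats.duration_s,
        )
```

A stage would wrap its loop in `with monitored_stage("transform") as stats:` and increment `stats.rows_ok` or `stats.rows_failed` per row. The same counters can be forwarded to a metrics system if one is available.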
Usage
To execute this workflow:
/workflow devops/etl-pipeline-creation.workflow
Related Workflows
See other workflows in this category for related automation patterns.