ETL Pipeline Creation

Design and implement an Extract-Transform-Load (ETL) pipeline with error handling, incremental loading, idempotency, and monitoring for batch data processing.

Complexity: Complex | Duration: 30m+ | Category: DevOps

Tags: data-engineering etl pipeline batch-processing

Steps

Step 1: Source analysis

Agent: data-engineer

Identify data sources (databases, APIs, files).
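One way to record the outcome of source analysis is a small inventory that later steps can read. The sketch below is illustrative only: the `DataSource` type, the source names, the locations, and the `incremental_key` field are hypothetical and do not come from the workflow itself.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class SourceKind(Enum):
    DATABASE = "database"
    API = "api"
    FILE = "file"


@dataclass(frozen=True)
class DataSource:
    name: str
    kind: SourceKind
    location: str  # connection string, base URL, or path (placeholder values below)
    incremental_key: Optional[str] = None  # column or parameter usable as a change marker


# Hypothetical inventory; every name and location here is an example value.
SOURCES = [
    DataSource("orders_db", SourceKind.DATABASE, "postgresql://example-host/orders", "updated_at"),
    DataSource("billing_api", SourceKind.API, "https://api.example.com/v1/invoices", "modified_since"),
    DataSource("legacy_export", SourceKind.FILE, "/data/exports/customers.csv"),
]


def full_reload_sources(sources: List[DataSource]) -> List[str]:
    """Names of sources with no change marker, which can only be loaded in full."""
    return [s.name for s in sources if s.incremental_key is None]
```

Listing the sources without a change marker early makes the full versus incremental decision in Step 4 easier to reason about.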

Step 2: Extract logic

Agent: data-engineer

Implement extraction with pagination and rate limiting.
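A minimal sketch of paginated extraction with a fixed delay between requests. It assumes a cursor-paginated source whose pages look like `{"items": [...], "next_cursor": ...}`. That page shape, the `fetch_page` callable, and the `min_interval_s` parameter are assumptions made for illustration, not part of the workflow.

```python
import time
from typing import Callable, Iterator, Optional


def extract_all(
    fetch_page: Callable[[Optional[str]], dict],
    min_interval_s: float = 0.2,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[dict]:
    """Yield every record from a cursor-paginated source, pausing between requests."""
    cursor: Optional[str] = None
    while True:
        page = fetch_page(cursor)  # hypothetical callable wrapping the real API or query
        yield from page["items"]
        cursor = page.get("next_cursor")
        if cursor is None:
            break  # no further pages
        sleep(min_interval_s)  # simple fixed-delay rate limit between page requests
```

Passing `sleep` as a parameter keeps the function testable without real waiting. A source that publishes rate limit headers would probably call for an adaptive delay instead of a fixed one.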

Step 3: Transform logic

Agent: data-engineer

Apply data cleaning, type conversion, and business rules.
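The three concerns in this step can be sketched as one function per record. The field names (`email`, `amount`, `order_date`) and the rule that amounts must be non-negative are hypothetical examples, chosen only to show cleaning, type conversion, and a business rule side by side.

```python
from datetime import date
from decimal import Decimal, InvalidOperation
from typing import Optional


def transform(raw: dict) -> Optional[dict]:
    """Clean and convert one raw record; return None if it fails validation."""
    # Cleaning: trim whitespace and normalize case.
    email = (raw.get("email") or "").strip().lower()
    if "@" not in email:
        return None

    # Type conversion: strings to Decimal and date.
    try:
        amount = Decimal(str(raw["amount"]))
        order_date = date.fromisoformat(raw["order_date"])
    except (KeyError, InvalidOperation, ValueError, TypeError):
        return None

    # Business rule (hypothetical): amounts must be finite and non-negative.
    if not amount.is_finite() or amount < 0:
        return None

    return {"email": email, "amount": amount, "order_date": order_date}
```

Returning `None` for rejected records keeps the function pure. A real pipeline would likely also record why each record was rejected so it can be routed to the error handling in Step 5.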

Step 4: Load strategy

Agent: backend-architect

Choose between full and incremental loads, and between upsert and replace writes.
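As an illustration of the incremental plus upsert combination, the sketch below uses SQLite from the Python standard library. The `customers` table, its columns, and the use of `updated_at` as a watermark are assumptions for the example. The `ON CONFLICT ... DO UPDATE` clause requires SQLite 3.24 or newer.

```python
import sqlite3
from typing import Iterable

UPSERT_SQL = """
INSERT INTO customers (id, name, updated_at)
VALUES (:id, :name, :updated_at)
ON CONFLICT(id) DO UPDATE SET
    name = excluded.name,
    updated_at = excluded.updated_at
"""


def create_tables(conn: sqlite3.Connection) -> None:
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customers "
        "(id INTEGER PRIMARY KEY, name TEXT, updated_at TEXT)"
    )


def get_watermark(conn: sqlite3.Connection) -> str:
    """Highest updated_at already loaded, or '' when the target is empty."""
    row = conn.execute("SELECT COALESCE(MAX(updated_at), '') FROM customers").fetchone()
    return row[0]


def load_incremental(conn: sqlite3.Connection, source_rows: Iterable[dict]) -> int:
    """Upsert only rows newer than the watermark; return how many were written."""
    watermark = get_watermark(conn)
    # ISO 8601 timestamps in one format compare correctly as strings.
    changed = [r for r in source_rows if r["updated_at"] > watermark]
    with conn:  # commit on success, roll back on error
        conn.executemany(UPSERT_SQL, changed)
    return len(changed)
```

A full replace load would instead truncate and reinsert the table. That is simpler, but it rewrites every row on each run, which is usually acceptable only for small sources or sources with no reliable change marker.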

Step 5: Error handling

Agent: data-engineer

Implement retry logic and a dead letter queue.
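A minimal sketch of per-record retries with exponential backoff, where records that still fail after the last attempt go to a dead letter list instead of stopping the batch. The function name, the `handler` callable, and the in-memory list standing in for a real queue are assumptions made for illustration.

```python
import time
from typing import Any, Callable, Iterable, List, Tuple


def process_with_retry(
    records: Iterable[Any],
    handler: Callable[[Any], None],
    max_attempts: int = 3,
    base_delay_s: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, List[dict]]:
    """Run handler on each record; return (success count, dead letters)."""
    succeeded = 0
    dead_letters: List[dict] = []  # stand-in for a real dead letter queue or table
    for record in records:
        for attempt in range(1, max_attempts + 1):
            try:
                handler(record)
                succeeded += 1
                break
            except Exception as exc:  # broad on purpose: any failure counts as an attempt
                if attempt == max_attempts:
                    dead_letters.append(
                        {"record": record, "error": repr(exc), "attempts": attempt}
                    )
                else:
                    # Exponential backoff: base, 2x base, 4x base, ...
                    sleep(base_delay_s * 2 ** (attempt - 1))
    return succeeded, dead_letters
```

In practice it is usually worth retrying only errors that are plausibly transient (timeouts, throttling) and sending validation failures straight to the dead letter queue, since retrying them cannot succeed.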

Step 6: Idempotency

Agent: data-engineer

Ensure the pipeline can be rerun without creating duplicates.
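One common way to make reruns safe is a batch ledger: each batch identifier is recorded in the same transaction as its rows, so a rerun of a batch that already committed is skipped. The sketch below uses SQLite, and the table names `etl_batches` and `events` and the date-style batch identifier are hypothetical. Keyed upserts are another option when rows have a stable primary key.

```python
import sqlite3
from typing import Iterable, Tuple


def create_ledger(conn: sqlite3.Connection) -> None:
    conn.execute("CREATE TABLE IF NOT EXISTS etl_batches (batch_id TEXT PRIMARY KEY)")
    conn.execute("CREATE TABLE IF NOT EXISTS events (event_id TEXT, payload TEXT)")


def load_batch(
    conn: sqlite3.Connection, batch_id: str, rows: Iterable[Tuple[str, str]]
) -> bool:
    """Load a batch once. Return True if loaded, False if it was already loaded."""
    try:
        with conn:  # ledger entry and data rows commit or roll back together
            conn.execute("INSERT INTO etl_batches (batch_id) VALUES (?)", (batch_id,))
            conn.executemany(
                "INSERT INTO events (event_id, payload) VALUES (?, ?)", rows
            )
    except sqlite3.IntegrityError:
        # The primary key on batch_id rejected a repeat of the same batch.
        return False
    return True
```

Because the ledger insert and the data insert share a transaction, a crash in the middle of a batch leaves neither behind, and the rerun loads the batch in full.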

Step 7: Testing

Agent: testing-specialist

Write unit tests for each stage, plus integration tests.
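A stage-level unit test might look like the sketch below, written with the standard `unittest` module. The function under test, `normalize_country`, is a hypothetical transform invented for the example. An integration test would instead run extract, transform, and load together against a disposable database.

```python
import unittest


def normalize_country(value: str) -> str:
    """Hypothetical transform under test: trim and upper-case a two-letter code."""
    code = value.strip().upper()
    if len(code) != 2 or not code.isalpha():
        raise ValueError(f"invalid country code: {value!r}")
    return code


class NormalizeCountryTest(unittest.TestCase):
    def test_trims_and_uppercases(self):
        self.assertEqual(normalize_country(" de "), "DE")

    def test_rejects_malformed_codes(self):
        for bad in ["", "DEU", "1A"]:
            with self.subTest(bad=bad):
                with self.assertRaises(ValueError):
                    normalize_country(bad)
```

Saved in a test module, this can be run with `python -m unittest`.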

Step 8: Monitoring

Agent: data-engineer

Log progress, errors, row counts, and duration.
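The four signals named in this step can be captured with a small context manager around each stage. This is a sketch using only the standard `logging` module. The `monitored_stage` helper, the `etl` logger name, and the key=value log format are assumptions, not something the workflow prescribes.

```python
import logging
import time
from contextlib import contextmanager
from typing import Dict, Iterator

logger = logging.getLogger("etl")


@contextmanager
def monitored_stage(name: str, metrics: Dict[str, dict]) -> Iterator[dict]:
    """Log start, failure, and a summary line for one stage; store its stats in metrics."""
    stats = {"rows": 0, "errors": 0}
    start = time.perf_counter()
    logger.info("stage=%s status=started", name)
    try:
        yield stats  # the stage body updates stats["rows"] and stats["errors"]
    except Exception:
        logger.exception("stage=%s status=failed", name)
        raise
    finally:
        # Runs on success and on failure, so duration is always recorded.
        stats["duration_s"] = time.perf_counter() - start
        metrics[name] = stats
        logger.info(
            "stage=%s rows=%d errors=%d duration_s=%.3f",
            name, stats["rows"], stats["errors"], stats["duration_s"],
        )
```

A stage would then be wrapped as `with monitored_stage("extract", metrics) as stats:` and increment `stats["rows"]` as records are processed. The collected `metrics` dictionary can be compared across stages, for example to check that the rows loaded match the rows extracted minus the rows rejected.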

Usage

To execute this workflow:

/workflow devops/etl-pipeline-creation.workflow

See other workflows in this category for related automation patterns.