API Data Integration
Integrate an external API as a data source including authentication, pagination, rate limiting, error handling, and incremental sync.
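The pagination, retry, and backoff behavior above can be sketched as a small generator. This is a minimal sketch, not a prescribed client: `fetch_page` is a hypothetical callable standing in for whatever HTTP call the API needs, and the `{"items": ..., "has_more": ...}` response shape is an assumption.

```python
import time
from typing import Callable, Iterator

def fetch_all(fetch_page: Callable[[int], dict], max_retries: int = 3,
              backoff: float = 0.5) -> Iterator[dict]:
    """Yield every record from a page-numbered API.

    fetch_page(page) is assumed to return {"items": [...], "has_more": bool}.
    Transient ConnectionErrors are retried with exponential backoff.
    """
    page = 0
    while True:
        for attempt in range(max_retries):
            try:
                resp = fetch_page(page)
                break
            except ConnectionError:
                if attempt == max_retries - 1:
                    raise
                time.sleep(backoff * 2 ** attempt)  # exponential backoff between retries
        yield from resp["items"]
        if not resp["has_more"]:
            return
        page += 1
```

For incremental sync, the same loop would pass a "since" watermark into `fetch_page` so each run only pulls records changed after the previous run.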
Implement automated backup and disaster recovery including full/incremental backups, point-in-time recovery, backup testing, and restoration procedures.
Version: 1.0.0
50 production-ready workflows across AI/ML Development, Data Engineering, Automation & Integration, Analytics & Reporting, and Infrastructure & DevOps.
Design and implement automated data collection pipelines for research and analytics workflows
Implement data governance framework including data catalog, lineage tracking, access control, PII detection, and compliance policies.
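Column-level PII detection can be sketched with regex sampling: flag a column when most of its values match a known pattern. The patterns and the 50% threshold here are illustrative assumptions; a production catalog would use a dedicated scanner with many more detectors and validation (e.g. checksum rules), but the column-tagging shape is the same.

```python
import re

# Hypothetical pattern set for illustration only.
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

def detect_pii(rows, threshold=0.5):
    """Return {column: pii_type} for columns where at least `threshold`
    of the non-null sampled values match a PII pattern."""
    flagged = {}
    if not rows:
        return flagged
    for col in rows[0]:
        values = [str(r[col]) for r in rows if r.get(col) is not None]
        for name, pattern in PII_PATTERNS.items():
            hits = sum(1 for v in values if pattern.search(v))
            if values and hits / len(values) >= threshold:
                flagged[col] = name  # tag column with first matching type
                break
    return flagged
```

The resulting tags would feed the catalog and access-control policies, e.g. masking any column tagged `email` for non-privileged roles.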
Migrate data between systems/databases including extraction, transformation, validation, incremental sync, and cutover planning.
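Post-migration validation can be sketched as an order-independent table checksum: XOR the per-row digests so source and target match regardless of row order. This is a minimal sketch of the validation step only; the canonicalization via `repr(sorted(...))` is an assumption that works for flat rows with comparable keys.

```python
import hashlib

def table_checksum(rows):
    """Order-independent checksum of a table of dict rows.

    Each row is canonicalized (sorted key/value pairs), hashed with MD5,
    and the digests are XOR-combined, so row order does not matter.
    """
    acc = 0
    for row in rows:
        canon = repr(sorted(row.items())).encode()
        acc ^= int(hashlib.md5(canon).hexdigest(), 16)
    return acc

def migration_valid(source_rows, target_rows):
    """Cheap cutover check: row counts and checksums must both match."""
    return (len(source_rows) == len(target_rows)
            and table_checksum(source_rows) == table_checksum(target_rows))
```

In practice this runs per table (or per partition, to localize mismatches) before the cutover is declared complete.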
Optimize data pipeline and query performance including indexing, partitioning, caching, query tuning, and infrastructure scaling.
You are a **Data Engineering & Pipeline Specialist** responsible for designing, building, and optimizing data pipelines and warehouse architectures using the modern data stack.
Implement comprehensive data quality validation including schema validation, null checks, range checks, uniqueness constraints, and referential integrity.
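The listed check types can be sketched as a single rule-driven validator. The schema dictionary format here (`type`, `required`, `min`/`max`, `unique`) is an assumed convention for illustration; frameworks like Great Expectations provide richer equivalents, but the mechanics are the same.

```python
def validate(rows, schema):
    """Run null, type, range, and uniqueness checks over dict rows.

    schema: {column: {"type": type, "required": bool,
                      "min": x, "max": y, "unique": bool}}
    Returns a list of (row_index, column, violation) tuples.
    """
    errors = []
    seen = {c: set() for c, rules in schema.items() if rules.get("unique")}
    for i, row in enumerate(rows):
        for col, rules in schema.items():
            val = row.get(col)
            if val is None:
                if rules.get("required"):
                    errors.append((i, col, "null"))
                continue
            if not isinstance(val, rules.get("type", object)):
                errors.append((i, col, "type"))
                continue  # skip range/uniqueness for wrongly typed values
            if "min" in rules and val < rules["min"]:
                errors.append((i, col, "range"))
            if "max" in rules and val > rules["max"]:
                errors.append((i, col, "range"))
            if col in seen:
                if val in seen[col]:
                    errors.append((i, col, "duplicate"))
                seen[col].add(val)
    return errors
```

Referential integrity would be an additional pass that checks each foreign-key value against the set of keys in the referenced table.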
You are a **Data Quality Specialist** responsible for ensuring data reliability, consistency, and trustworthiness across the data platform through automated validation and monitoring.
Manage data warehouse including star/snowflake schema design, fact/dimension tables, SCD handling, and OLAP optimization.
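SCD handling is the least obvious item above, so here is a minimal in-memory sketch of Type-2 logic: close the current row and append a new version when attributes change. The row shape (`key`, `attrs`, `valid_from`, `valid_to`) is an assumed convention; in a warehouse this would be a `MERGE` against the dimension table.

```python
from datetime import date

def scd2_apply(dimension, updates, today=None):
    """Apply Type-2 slowly-changing-dimension updates.

    dimension: list of rows {"key", "attrs", "valid_from", "valid_to"},
    where valid_to is None for the current version of each key.
    Changed keys get their current row closed and a new row appended.
    """
    today = today or date.today().isoformat()
    current = {r["key"]: r for r in dimension if r["valid_to"] is None}
    for key, attrs in updates.items():
        row = current.get(key)
        if row and row["attrs"] == attrs:
            continue  # unchanged: keep current version open
        if row:
            row["valid_to"] = today  # close out the old version
        dimension.append({"key": key, "attrs": attrs,
                          "valid_from": today, "valid_to": None})
    return dimension
```

Fact tables then join on the surrogate key of whichever version was current at the fact's event date, preserving history.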
Design and implement an Extract-Transform-Load (ETL) pipeline with error handling, incremental loading, idempotency, and monitoring for batch data processing.
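The incremental-loading and idempotency pieces combine into a watermark pattern, sketched below under stated assumptions: `extract`, `transform`, and `load` are hypothetical callables, records carry an `updated_at` field, and `load` upserts by key so replaying a batch after a failure is harmless.

```python
def run_etl(extract, transform, load, state):
    """Run one idempotent incremental batch.

    extract(since) -> records newer than the watermark (all, if since is None).
    load must upsert by key, so a retried batch cannot double-insert.
    The high-water mark advances only after a successful load, so a crash
    mid-batch just replays the same records next run.
    """
    since = state.get("watermark")
    records = extract(since)
    if not records:
        return state  # nothing new; watermark unchanged
    rows = [transform(r) for r in records]
    load(rows)
    state["watermark"] = max(r["updated_at"] for r in records)
    return state
```

Monitoring would hang off the same loop: record count, batch duration, and watermark lag are the usual first metrics.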
Synchronize data across multiple systems with conflict resolution, eventual consistency, change detection, and sync monitoring.
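One common conflict-resolution policy, last-write-wins, can be sketched in a few lines. The replica format `{key: (value, timestamp)}` and the tie-break in favor of the remote copy are assumptions for illustration; vector clocks or CRDTs would be needed where clock skew makes timestamps unreliable.

```python
def merge_lww(local, remote):
    """Merge two replicas of {key: (value, timestamp)} using
    last-write-wins; timestamp ties go to the remote copy."""
    merged = dict(local)
    for key, (value, ts) in remote.items():
        if key not in merged or ts >= merged[key][1]:
            merged[key] = (value, ts)  # remote entry is newer (or tied)
    return merged
```

Running the merge in both directions drives the replicas toward the same state, which is the eventual-consistency guarantee the description refers to.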
Build a real-time data streaming pipeline using Kafka/Kinesis including producers, consumers, stream processing, and exactly-once semantics.
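On the consumer side, exactly-once processing is usually built from at-least-once delivery plus deduplication, sketched here without a broker client: messages are `(offset, payload)` pairs (an assumed shape), and `processed_offsets` stands in for durable state that a real system would commit in the same transaction as the side effect.

```python
def consume(messages, processed_offsets, apply):
    """At-least-once delivery plus offset dedup ≈ effective exactly-once.

    Redelivered offsets are skipped, so apply() runs once per message even
    if the broker delivers duplicates after a crash or rebalance. In
    production the offset set and apply() must commit atomically.
    """
    for offset, payload in messages:
        if offset in processed_offsets:
            continue  # duplicate delivery: already processed
        apply(payload)
        processed_offsets.add(offset)
    return processed_offsets
```

Kafka additionally offers transactional producers and `read_committed` consumers, which push the same guarantee into the broker itself.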
Database schema version control, migration generation, rollback capability, and schema documentation for evolving data models.
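The migrate/rollback mechanics can be sketched as a tiny runner over ordered `(version, up, down)` steps, a simplified stand-in for tools like Alembic or Flyway. The tuple format and in-memory `applied` list are assumptions; real runners persist applied versions in a schema table and run each step in a transaction where the database allows it.

```python
def migrate(applied, migrations, target=None):
    """Move the schema to `target` (default: latest version).

    migrations: ordered list of (version, up, down) callables.
    applied: versions already run, mutated in place. Rolling back calls
    each `down` in reverse order until `target` is reached.
    """
    versions = [v for v, _, _ in migrations]
    target = target if target is not None else versions[-1]
    idx = {v: i for i, v in enumerate(versions)}
    for v, up, _ in migrations:          # roll forward, oldest first
        if idx[v] <= idx[target] and v not in applied:
            up()
            applied.append(v)
    for v, _, down in reversed(migrations):  # roll back, newest first
        if idx[v] > idx[target] and v in applied:
            down()
            applied.remove(v)
    return applied
```

Keeping every migration paired with a tested `down` step is what makes rollback a routine operation instead of an emergency.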