Agent Skills Framework Extension
MLOps Patterns Skill
When to Use This Skill
Use this skill when implementing MLOps patterns in your codebase.
How to Use This Skill
- Review the patterns and examples below
- Apply the relevant patterns to your implementation
- Follow the best practices outlined in this skill
Production ML operations with automated pipelines, serving, and monitoring.
Core Capabilities
- Training Pipelines - Automated, reproducible model training
- Model Serving - High-performance inference endpoints
- Feature Stores - Centralized feature management
- Model Monitoring - Drift detection, performance tracking
- Experiment Tracking - Versioning, metrics, artifacts
Training Pipeline
```python
# mlops/training_pipeline.py
from dataclasses import dataclass

import mlflow
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import train_test_split


@dataclass
class TrainingConfig:
    experiment_name: str
    model_name: str
    n_estimators: int = 100
    max_depth: int = 10
    test_size: float = 0.2
    random_state: int = 42


class MLTrainingPipeline:
    def __init__(self, config: TrainingConfig):
        self.config = config
        mlflow.set_experiment(config.experiment_name)

    def run(self, data: pd.DataFrame, target_col: str) -> str:
        with mlflow.start_run() as run:
            # Log parameters
            mlflow.log_params({
                'n_estimators': self.config.n_estimators,
                'max_depth': self.config.max_depth,
                'test_size': self.config.test_size
            })

            # Split data
            X = data.drop(columns=[target_col])
            y = data[target_col]
            X_train, X_test, y_train, y_test = train_test_split(
                X, y,
                test_size=self.config.test_size,
                random_state=self.config.random_state
            )

            # Train model
            model = RandomForestClassifier(
                n_estimators=self.config.n_estimators,
                max_depth=self.config.max_depth,
                random_state=self.config.random_state
            )
            model.fit(X_train, y_train)

            # Evaluate
            y_pred = model.predict(X_test)
            metrics = {
                'accuracy': accuracy_score(y_test, y_pred),
                'precision': precision_score(y_test, y_pred, average='weighted'),
                'recall': recall_score(y_test, y_pred, average='weighted')
            }
            mlflow.log_metrics(metrics)

            # Log and register the model
            mlflow.sklearn.log_model(
                model,
                "model",
                registered_model_name=self.config.model_name
            )

            return run.info.run_id
```
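A minimal way to invoke the pipeline (the module path, CSV path, and `churned` target column below are hypothetical placeholders; with no `MLFLOW_TRACKING_URI` set, MLflow falls back to a local `./mlruns` file store):

```python
# Hypothetical usage of MLTrainingPipeline; paths and names are placeholders.
import pandas as pd

from mlops.training_pipeline import MLTrainingPipeline, TrainingConfig

config = TrainingConfig(experiment_name="churn-experiments", model_name="churn-rf")
data = pd.read_csv("data/training_data.csv")  # must contain the target column
run_id = MLTrainingPipeline(config).run(data, target_col="churned")
print(f"MLflow run: {run_id}")
```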
Model Serving
```python
# mlops/model_serving.py
from typing import List

import mlflow.pyfunc
import numpy as np
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

app = FastAPI()


class PredictionRequest(BaseModel):
    features: List[float]


class PredictionResponse(BaseModel):
    prediction: float
    model_version: str


class ModelServer:
    def __init__(self, model_uri: str):
        self.model = mlflow.pyfunc.load_model(model_uri)
        self.model_version = model_uri.split('/')[-1]

    async def predict(self, features: List[float]) -> PredictionResponse:
        input_data = np.array([features])
        prediction = self.model.predict(input_data)[0]
        return PredictionResponse(
            prediction=float(prediction),
            model_version=self.model_version
        )


# Load the model once at startup from the registry;
# "models:/my-model/1" is a placeholder URI of the form
# "models:/<registered_model_name>/<version>"
model_server = ModelServer("models:/my-model/1")


@app.post("/predict", response_model=PredictionResponse)
async def predict(request: PredictionRequest):
    try:
        return await model_server.predict(request.features)
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))
```
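A client can then call the endpoint over HTTP. The port, URL, and feature vector below are illustrative assumptions, and the sketch assumes the app is running locally (e.g. via `uvicorn mlops.model_serving:app --port 8000`):

```python
# Hypothetical client call; assumes the FastAPI app above is serving locally.
import requests

resp = requests.post(
    "http://localhost:8000/predict",
    json={"features": [5.1, 3.5, 1.4, 0.2]},  # example feature vector
    timeout=5,
)
resp.raise_for_status()
print(resp.json())  # {"prediction": ..., "model_version": ...}
```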
Feature Store
```python
# mlops/feature_store.py
from datetime import timedelta

from feast import Entity, FeatureStore, FeatureView, Field
from feast.types import Float32, Int64

# Define entities
user_entity = Entity(
    name="user_id",
    join_keys=["user_id"],
    description="User identifier"
)

# Define feature view
user_features = FeatureView(
    name="user_activity_features",
    entities=[user_entity],
    ttl=timedelta(days=1),
    schema=[
        Field(name="total_purchases", dtype=Int64),
        Field(name="avg_purchase_amount", dtype=Float32),
        Field(name="days_since_last_purchase", dtype=Int64)
    ],
    online=True,
    source=...  # Data source configuration
)

# Initialize feature store
fs = FeatureStore(repo_path=".")

# Get features for inference
features = fs.get_online_features(
    features=[
        "user_activity_features:total_purchases",
        "user_activity_features:avg_purchase_amount"
    ],
    entity_rows=[{"user_id": "user_123"}]
).to_dict()
```
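The same feature definitions can back training-set construction from the offline store, which is what prevents training/serving skew. A sketch, assuming the Feast repo above is initialized and the entity IDs and timestamps are illustrative:

```python
# Sketch: point-in-time-correct training data from the offline store.
import pandas as pd
from feast import FeatureStore

fs = FeatureStore(repo_path=".")

# Entity rows with event timestamps (placeholder IDs and dates)
entity_df = pd.DataFrame({
    "user_id": ["user_123", "user_456"],
    "event_timestamp": pd.to_datetime(["2024-01-01", "2024-01-02"]),
})

training_df = fs.get_historical_features(
    entity_df=entity_df,
    features=[
        "user_activity_features:total_purchases",
        "user_activity_features:avg_purchase_amount",
    ],
).to_df()
```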
Model Monitoring
```python
# mlops/model_monitoring.py
from dataclasses import dataclass
from typing import List

import numpy as np
from scipy.stats import ks_2samp


@dataclass
class DriftReport:
    feature: str
    drift_detected: bool
    p_value: float
    threshold: float


class ModelMonitor:
    def __init__(self, reference_data: np.ndarray):
        self.reference_data = reference_data

    def detect_drift(
        self,
        current_data: np.ndarray,
        threshold: float = 0.05
    ) -> List[DriftReport]:
        reports = []
        for i in range(current_data.shape[1]):
            reference_feature = self.reference_data[:, i]
            current_feature = current_data[:, i]

            # Two-sample Kolmogorov-Smirnov test per feature
            statistic, p_value = ks_2samp(reference_feature, current_feature)
            drift_detected = p_value < threshold

            reports.append(DriftReport(
                feature=f"feature_{i}",
                drift_detected=drift_detected,
                p_value=p_value,
                threshold=threshold
            ))

        return reports
```
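A quick, self-contained sanity check of the KS-based approach on synthetic data (the sample sizes, seed, and 0.05 threshold are illustrative): a mean-shifted feature should be flagged, while an unchanged one generally should not.

```python
# Sanity check for KS-based drift detection on synthetic data.
# Mirrors ModelMonitor.detect_drift: one two-sample KS test per feature.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(42)  # illustrative seed
reference = rng.normal(0.0, 1.0, size=(5000, 2))  # baseline window

current = np.column_stack([
    rng.normal(0.5, 1.0, size=5000),  # feature 0: mean has drifted
    rng.normal(0.0, 1.0, size=5000),  # feature 1: unchanged distribution
])

p_values = [ks_2samp(reference[:, i], current[:, i]).pvalue for i in range(2)]
drift_flags = [p < 0.05 for p in p_values]
print(drift_flags)  # feature 0 should be flagged
```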
Usage Examples
- Training Pipeline - Apply the mlops-patterns skill to create an automated ML training pipeline with experiment tracking
- Model Serving - Apply the mlops-patterns skill to deploy a model serving endpoint with versioning
- Drift Detection - Apply the mlops-patterns skill to implement model monitoring with drift detection
Success Output
When successful, this skill MUST output:
✅ SKILL COMPLETE: mlops-patterns
Completed:
- [x] ML training pipeline implemented with MLflow tracking
- [x] Model serving endpoint deployed with versioning
- [x] Feature store configured with online/offline access
- [x] Model monitoring enabled with drift detection
- [x] Experiment tracking configured with metrics and artifacts
Outputs:
- mlops/training_pipeline.py (complete pipeline implementation)
- mlops/model_serving.py (FastAPI serving endpoint)
- mlops/feature_store.py (Feast feature definitions)
- mlops/model_monitoring.py (drift detection system)
- MLflow experiments visible at http://localhost:5000
Completion Checklist
Before marking this skill as complete, verify:
- Training pipeline runs successfully and logs to MLflow
- Model serving endpoint returns predictions correctly
- Feature store serves features with <100ms latency
- Model monitoring detects drift when data distribution changes
- All Python dependencies installed (mlflow, scikit-learn, feast, fastapi)
- Environment variables configured (MLFLOW_TRACKING_URI, etc.)
- Integration tests pass for end-to-end pipeline
- Documentation includes setup instructions
Failure Indicators
This skill has FAILED if:
- ❌ MLflow tracking server not accessible or experiments not logged
- ❌ Model serving returns HTTP 500 or incorrect predictions
- ❌ Feature store queries fail or timeout
- ❌ Model monitoring does not detect intentional drift in test data
- ❌ Training pipeline fails with unhandled exceptions
- ❌ Dependencies missing or version conflicts prevent execution
- ❌ No model artifacts saved to MLflow artifact store
When NOT to Use
Do NOT use mlops-patterns when:
- Building simple model prototypes without deployment requirements (use basic scikit-learn directly)
- Working with non-ML data pipelines (use data-engineering-patterns instead)
- Implementing batch inference only (simpler patterns may suffice)
- Project lacks resources for MLflow/Feast infrastructure (consider lightweight alternatives)
- Model training is infrequent (<1x per month) and manual deployment is acceptable
- The team is unfamiliar with MLOps concepts (start with the ai-ml-fundamentals skill first)
Anti-Patterns (Avoid)
| Anti-Pattern | Problem | Solution |
|---|---|---|
| Training without experiment tracking | Loss of reproducibility, can't compare models | Always use MLflow tracking for all experiments |
| Hardcoded model paths | Breaks versioning, deployment issues | Use MLflow model registry with semantic versioning |
| Feature engineering in serving code | Training/serving skew, bugs | Centralize features in feature store (Feast) |
| No model monitoring | Silent model degradation | Implement drift detection with baseline data |
| Manual model deployment | Slow, error-prone | Use automated CI/CD with model validation gates |
| Large model files in Git | Repository bloat, slow clones | Store artifacts in MLflow artifact store (S3/GCS) |
| Ignoring data versioning | Can't reproduce results | Version datasets with DVC or Feast |
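The "manual model deployment" row above can be made concrete with a minimal validation gate run in CI/CD before promotion. The metric names, baselines, and tolerance below are illustrative assumptions, not a prescribed standard:

```python
# Minimal model validation gate: promote a candidate model only if it does
# not regress more than `tolerance` on any tracked production metric.
def passes_validation_gate(candidate: dict, production: dict,
                           tolerance: float = 0.02) -> bool:
    return all(candidate.get(m, 0.0) >= v - tolerance
               for m, v in production.items())

production = {"accuracy": 0.91, "recall": 0.88}     # illustrative baselines
candidate_ok = {"accuracy": 0.92, "recall": 0.87}   # small recall dip: allowed
candidate_bad = {"accuracy": 0.85, "recall": 0.88}  # accuracy regression: blocked

print(passes_validation_gate(candidate_ok, production))   # True
print(passes_validation_gate(candidate_bad, production))  # False
```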
Principles
This skill embodies:
- #5 Eliminate Ambiguity - Clear separation of training, serving, and monitoring concerns
- #6 Clear, Understandable, Explainable - Experiment tracking makes all decisions auditable
- #8 No Assumptions - Explicit model versioning and validation before deployment
- #9 Quality Over Speed - Proper MLOps infrastructure prevents production failures
- #12 Separation of Concerns - Feature store decouples feature engineering from serving logic
Full Principles: CODITECT-STANDARD-AUTOMATION.md
Integration Points
- cicd-automation-patterns - ML pipeline automation
- data-engineering-patterns - Feature engineering
- cloud-infrastructure-patterns - Scalable serving