# Deployment Archeology Skill

## How to Use This Skill
- Review the patterns and examples below
- Apply the relevant patterns to your implementation
- Follow the best practices outlined in this skill
Purpose: Find and restore previous successful deployment configurations by analyzing git history, Cloud Build logs, and Kubernetes deployments.
## When to Use
Use this skill when:
- Current deployment failing and need to find what worked before
- Need to understand how a service was originally deployed
- Investigating deployment regressions
- Recovering from accidental configuration changes
## Process

### Step 1: Identify Current Deployment Date

```bash
# Get deployment creation timestamp from Kubernetes
kubectl get deployment <DEPLOYMENT_NAME> -n <NAMESPACE> -o jsonpath='{.metadata.creationTimestamp}'
```
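The timestamp comes back in ISO 8601 form; for the build search in the next step it helps to normalize it to a plain date. A minimal sketch, assuming GNU `date` (use `gdate` on macOS) and an illustrative timestamp value:

```shell
# Hypothetical timestamp as returned by the kubectl command above
DEPLOY_DATE="2025-10-13T09:58:29Z"

# Normalize to YYYY-MM-DD for the Cloud Build search
SEARCH_DATE=$(date -u -d "$DEPLOY_DATE" '+%Y-%m-%d')
echo "$SEARCH_DATE"
```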
### Step 2: Search Cloud Build History

```bash
# List builds around the deployment date
gcloud builds list \
  --filter="createTime>='YYYY-MM-DDT00:00:00Z' AND createTime<='YYYY-MM-DDT23:59:59Z'" \
  --format="table(id,status,createTime)" \
  --limit=50
```
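Builds sometimes land the day before or after the deployment, so a ±1 day window is safer than a single date. A sketch of computing the filter bounds, assuming GNU `date` and an illustrative deployment date:

```shell
SEARCH_DATE="2025-10-13"   # illustrative deployment date

# Widen the search window by one day on each side
START=$(date -u -d "$SEARCH_DATE - 1 day" '+%Y-%m-%d')
END=$(date -u -d "$SEARCH_DATE + 1 day" '+%Y-%m-%d')

FILTER="createTime>='${START}T00:00:00Z' AND createTime<='${END}T23:59:59Z'"
echo "$FILTER"
```

The resulting `$FILTER` string can be passed directly to `gcloud builds list --filter="$FILTER"`.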
### Step 3: Analyze Successful Build

```bash
# Get build configuration from the successful build closest to the deployment time
gcloud builds describe <BUILD_ID> --format="yaml(steps,substitutions,options)"
```

Key things to extract:

- Dockerfile name (check `args` for the `-f` flag)
- Machine type (`options.machineType`)
- Environment variables (`options.env`)
- Build steps and deployment method
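With the build description saved as JSON (`gcloud builds describe <BUILD_ID> --format=json > build.json`), these fields can be pulled out with `jq`, assuming it is installed. The payload below is fabricated for illustration; real build descriptions contain many more fields:

```shell
# Fabricated sample of a Cloud Build description
cat > build.json <<'EOF'
{
  "steps": [
    {"name": "gcr.io/cloud-builders/docker",
     "args": ["build", "-f", "Dockerfile.local-test", "-t", "my-image", "."]}
  ],
  "options": {
    "machineType": "E2_HIGHCPU_32",
    "env": ["NODE_OPTIONS=--max_old_space_size=8192"]
  }
}
EOF

# Dockerfile passed via the -f flag (the array element right after "-f")
jq -r '.steps[0].args | .[index("-f") + 1]' build.json

# Machine type and build-time environment
jq -r '.options.machineType, .options.env[]' build.json
```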
### Step 4: Search Git History

```bash
# Find commits around the deployment date
git log --since="YYYY-MM-DD" --until="YYYY-MM-DD" --oneline --all

# Check if files were archived or moved
git log --all --full-history -- <FILENAME>
```
### Step 5: Restore Configuration

```bash
# Find archived files
find . -name "<FILENAME>" -type f

# Check git history for the file content at a specific commit
git show <COMMIT>:<PATH>

# Restore from the archive directory if needed
cp docs/99-archive/deployment-obsolete/<FILE> ./<FILE>
```
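The archive search above can be exercised end-to-end in a scratch directory. The paths mirror this repo's layout but are created here purely for illustration:

```shell
# Build a throwaway directory tree that mimics the repo layout
WORK=$(mktemp -d) && cd "$WORK"
mkdir -p docs/99-archive/deployment-obsolete
touch docs/99-archive/deployment-obsolete/Dockerfile.local-test Dockerfile.combined

# List only the archived Dockerfile variants
find . -name 'Dockerfile*' -type f | grep archive
```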
## Example: Combined Service Recovery (Oct 18, 2025)

Problem: `Dockerfile.combined` failing to build

Investigation:

- Found deployment created: `2025-10-13T09:58:29Z`
- Found successful build: `6e95a4d9-2f19-456c-bba8-5a1ed7a8fdf7` at `09:50:07Z`
- Build used: `Dockerfile.local-test` (not `Dockerfile.combined`!)
- Machine: `E2_HIGHCPU_32` with `NODE_OPTIONS=--max_old_space_size=8192`
- File archived in commit `04ef4b4` to `docs/99-archive/deployment-obsolete/`
Recovery:

```bash
# Restore the working Dockerfile
cp docs/99-archive/deployment-obsolete/Dockerfile.local-test ./

# Update the cloudbuild config:
#   Change: Dockerfile.combined -> Dockerfile.local-test
#   Change: N1_HIGHCPU_8 -> E2_HIGHCPU_32
#   Add: NODE_OPTIONS=--max_old_space_size=8192

# Rebuild with the proven config
gcloud builds submit --config cloudbuild-combined.yaml .
```
## Automation Script

```bash
#!/bin/bash
# deployment-archeology.sh - Find previous successful build config
set -euo pipefail

DEPLOYMENT=$1
NAMESPACE=${2:-default}

echo "=== Deployment Archeology ==="

# Step 1: Get deployment date
DEPLOY_DATE=$(kubectl get deployment "$DEPLOYMENT" -n "$NAMESPACE" -o jsonpath='{.metadata.creationTimestamp}')
SEARCH_DATE=$(date -u -d "$DEPLOY_DATE" '+%Y-%m-%d')   # GNU date; use gdate on macOS
echo "Deployment created: $DEPLOY_DATE"
echo "Searching builds on: $SEARCH_DATE"

# Step 2: Find builds on that date
echo ""
echo "=== Cloud Build History ==="
gcloud builds list \
  --filter="createTime>='${SEARCH_DATE}T00:00:00Z' AND createTime<='${SEARCH_DATE}T23:59:59Z'" \
  --format="table(id,status,createTime)" \
  --limit=20

# Step 3: Show git commits around that date
echo ""
echo "=== Git History ==="
git log --since="$SEARCH_DATE" --until="$(date -u -d "$SEARCH_DATE + 1 day" '+%Y-%m-%d')" --oneline --all | head -20

echo ""
echo "Next steps:"
echo "1. Identify successful build ID (STATUS=SUCCESS)"
echo "2. Run: gcloud builds describe <BUILD_ID> --format='yaml(steps,options)'"
echo "3. Check for archived files: find . -name 'Dockerfile*' | grep archive"
echo "4. Compare current config vs successful build config"
```
## Tips

- BUILD_ID vs SHORT_SHA: Manual builds use `$BUILD_ID`, git triggers use `$SHORT_SHA`
- Check machine type: Theia builds need high CPU (`E2_HIGHCPU_32`)
- Node memory: Webpack builds often need 8GB+ heap (`NODE_OPTIONS=--max_old_space_size=8192`)
- Archive directories: Check `docs/99-archive/` and `archive/` for old configs
- Git submodules: May contain reference implementations
## Common Gotchas

- ❌ Assuming current files match the deployed version
- ❌ Not checking environment variables in Cloud Build options
- ❌ Forgetting to check for archived/moved files
- ❌ Using the wrong Dockerfile (there may be multiple variants)
- ❌ Missing build prerequisites (like a pre-built `dist/` directory)
## Integration with Other Skills
- codebase-locator: Find all Dockerfile variants
- thoughts-locator: Find deployment session exports
- web-search-researcher: Research Cloud Build error messages
## Multi-Context Window Support
This skill supports long-running deployment investigation tasks across multiple context windows using Claude 4.5's enhanced state management capabilities.
### State Tracking

Checkpoint State (JSON):

```json
{
  "investigation_id": "archeology_20251129_150000",
  "target_deployment": "coditect-combined",
  "phase": "build_history_analyzed",
  "deployment_date": "2025-10-13T09:58:29Z",
  "successful_builds_found": [
    {"build_id": "6e95a4d9-2f19-456c-bba8-5a1ed7a8fdf7", "date": "2025-10-13T09:50:07Z"},
    {"build_id": "abc123-def456-789", "date": "2025-10-12T15:30:00Z"}
  ],
  "config_differences": [],
  "archived_files_located": ["docs/99-archive/deployment-obsolete/Dockerfile.local-test"],
  "recovery_plan_created": false,
  "token_usage": 8200,
  "created_at": "2025-11-29T15:00:00Z"
}
```
Progress Notes (Markdown):

```markdown
# Deployment Archeology Progress - 2025-11-29

## Completed
- ✅ Found deployment timestamp: 2025-10-13T09:58:29Z
- ✅ Searched Cloud Build history around that date
- ✅ Found 2 successful builds (6e95a4d9, abc123)
- ✅ Located archived Dockerfile.local-test

## In Progress
- Analyzing build configuration differences
- Creating recovery plan

## Key Findings
- Successful build used Dockerfile.local-test (not Dockerfile.combined)
- Machine type: E2_HIGHCPU_32 (not N1_HIGHCPU_8)
- NODE_OPTIONS=--max_old_space_size=8192

## Next Actions
- Compare current config vs successful build
- Create step-by-step recovery plan
- Test recovery with dry-run build
```
### Session Recovery

When starting a fresh context window after a deployment investigation:

- Load Checkpoint State: Read `.coditect/checkpoints/deployment-archeology-latest.json`
- Review Progress Notes: Check `archeology-progress.md` for findings
- Verify Build History: Confirm successful builds identified
- Resume Analysis: Continue with config comparison or recovery plan
- Apply Recovery: Execute recovery plan if ready
Recovery Commands:

```bash
# 1. Check the latest checkpoint
jq '.successful_builds_found' .coditect/checkpoints/deployment-archeology-latest.json

# 2. Review progress
tail -30 archeology-progress.md

# 3. Verify successful build details
gcloud builds describe 6e95a4d9-2f19-456c-bba8-5a1ed7a8fdf7 --format=yaml

# 4. Check archived files
ls -la docs/99-archive/deployment-obsolete/

# 5. Compare configs
diff current-config.yaml successful-build-config.yaml
```
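The final comparison can be sketched with two toy configs; the contents below are illustrative stand-ins, not the real build files:

```shell
# Fabricated "current" and "known-good" configs for illustration
printf 'machineType: N1_HIGHCPU_8\ndockerfile: Dockerfile.combined\n' > current-config.yaml
printf 'machineType: E2_HIGHCPU_32\ndockerfile: Dockerfile.local-test\n' > successful-build-config.yaml

# diff exits non-zero when the files differ, so tolerate that in scripts
diff current-config.yaml successful-build-config.yaml || true
```

Each `<`/`>` pair in the output is one configuration difference to record in the checkpoint's `config_differences` list.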
### State Management Best Practices

Checkpoint Files (JSON Schema):

- Store in `.coditect/checkpoints/deployment-archeology-{target}.json`
- Track successful builds found with timestamps
- Record config differences for comparison
- Include archived file locations for quick access

Progress Tracking (Markdown Narrative):

- Maintain `archeology-progress.md` with the investigation timeline
- Document key findings and insights
- Note configuration differences discovered
- List recovery steps with commands

Git Integration:

- Save the recovery plan to `.coditect/reports/recovery-plan-{target}-{date}.md`
- Tag investigations: `git tag archeology-{target}-{date}`
### Progress Checkpoints
Natural Breaking Points:
- After deployment timestamp identified
- After Cloud Build history searched
- After successful build configurations analyzed
- After recovery plan created
- After recovery plan tested/executed
Checkpoint Creation Pattern:

```python
import json
from pathlib import Path

def create_checkpoint(state: dict) -> None:
    # Persist checkpoint state as JSON under .coditect/checkpoints/
    path = Path(".coditect/checkpoints") / f"deployment-archeology-{state['target_deployment']}.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(state, indent=2))

# Automatic checkpoint creation after each phase
# (phase, deployment_name, etc. come from the surrounding investigation loop)
if phase in ["deployment_found", "builds_analyzed", "recovery_plan_created"]:
    create_checkpoint({
        "target_deployment": deployment_name,
        "phase": phase,
        "successful_builds_found": builds_list,
        "config_differences": diffs_list,
        "tokens": current_token_usage,
    })
```
### Example: Multi-Context Investigation

Context Window 1: Discovery Phase

```json
{
  "checkpoint_id": "ckpt_archeology_part1",
  "phase": "builds_found",
  "deployment_date": "2025-10-13T09:58:29Z",
  "successful_builds_found": 2,
  "archived_files_located": 1,
  "next_action": "Analyze build configurations",
  "token_usage": 8200
}
```
Context Window 2: Analysis & Recovery Plan

```bash
# Resume from checkpoint
cat .coditect/checkpoints/ckpt_archeology_part1.json

# Continue with config analysis
# (Context restored in 2 minutes vs 12 minutes from scratch)
```

```json
{
  "checkpoint_id": "ckpt_archeology_complete",
  "phase": "recovery_plan_ready",
  "config_differences_identified": 3,
  "recovery_plan_created": true,
  "plan_path": ".coditect/reports/recovery-plan-combined-2025-11-29.md",
  "token_usage": 6500
}
```
Token Savings: 8200 (first context) + 6500 (second context) = 14700 total vs. 25000 without checkpoint = 41% reduction
Reference: See docs/CLAUDE-4.5-BEST-PRACTICES.md for complete multi-context window workflow guidance.
## Success Output

When this skill completes successfully, output:

```text
✅ SKILL COMPLETE: deployment-archeology

Completed:
- [x] Deployment timestamp identified from Kubernetes
- [x] Cloud Build history searched for successful builds
- [x] Successful build configuration analyzed
- [x] Git history searched for archived files
- [x] Configuration files restored
- [x] Recovery plan created and validated

Outputs:
- Deployment date: 2025-10-13T09:58:29Z
- Successful build ID: 6e95a4d9-2f19-456c-bba8-5a1ed7a8fdf7
- Restored files: Dockerfile.local-test, cloudbuild config
- Recovery plan: .coditect/reports/recovery-plan-combined-2025-11-29.md
- Configuration differences documented
```
## Completion Checklist

Before marking this skill as complete, verify:

- [ ] Deployment creation timestamp retrieved from Kubernetes
- [ ] Cloud Build history searched around deployment date
- [ ] At least one successful build identified
- [ ] Build configuration extracted (Dockerfile, machine type, env vars)
- [ ] Git history searched for commits around deployment date
- [ ] Archived files located (check `docs/99-archive/`)
- [ ] Configuration differences documented
- [ ] Recovery plan created with step-by-step commands
- [ ] Checkpoint saved to `.coditect/checkpoints/`
## Failure Indicators
This skill has FAILED if:
- ❌ Cannot retrieve deployment timestamp from Kubernetes
- ❌ No Cloud Build history found for deployment date range
- ❌ No successful builds found (all builds failed or not found)
- ❌ Build configuration incomplete (missing Dockerfile, machine type, etc.)
- ❌ Git history search returns no relevant commits
- ❌ Cannot locate archived configuration files
- ❌ Configuration differences not identified
- ❌ Recovery plan missing or incomplete
## When NOT to Use

Do NOT use this skill when:

- Deployment is currently working (no investigation needed)
- Issue is not deployment-related (use `error-debugging-patterns` instead)
- You already have the exact working configuration
- Problem is with application code, not deployment config
- Building a new service from scratch (use `deployment-automation` instead)
- Only need to check current deployment status (use `kubectl get deployment`)
- Working with a local development environment (not production/staging)
Use alternatives:

- `error-debugging-patterns` - For application errors/bugs
- `deployment-automation` - For new deployments
- `kubernetes-troubleshooting` - For current deployment issues
- `git-workflow-automation` - For general git history searches
## Anti-Patterns (Avoid)
| Anti-Pattern | Problem | Solution |
|---|---|---|
| Assuming current files match deployed | Files may have been archived/modified | Always check Cloud Build logs for actual config used |
| Ignoring environment variables | Missing critical config (NODE_OPTIONS, etc.) | Check Cloud Build options.env field |
| Not checking for archived files | Miss working configurations | Search docs/99-archive/, git log --all --full-history |
| Using wrong Dockerfile | Multiple variants exist | Verify from Cloud Build args -f flag |
| Forgetting build prerequisites | Deployment may need pre-built artifacts | Check if dist/ or other artifacts required |
| Single date search | Build may be day before/after | Search ±1 day from deployment date |
| Ignoring machine type | Low-resource machine causes failures | Check Cloud Build options.machineType |
| Not documenting recovery plan | Knowledge lost if context cleared | Always create recovery-plan-{target}-{date}.md |
## Principles
This skill embodies these CODITECT principles:
- #1 Search Before Create - Find what worked before instead of rebuilding from scratch
- #2 Evidence-Based Recovery - Use actual Cloud Build logs, not assumptions
- #3 Historical Context Matters - Git history and deployment dates provide critical clues
- #5 Eliminate Ambiguity - Document exact configuration differences discovered
- #6 Clear, Understandable - Recovery plan with step-by-step commands
- #7 Systematic Investigation - Follow 5-step process (timestamp → builds → analyze → git → restore)
- #8 No Assumptions - Verify every configuration detail from actual logs
Reference: CODITECT-STANDARD-AUTOMATION.md