Stale Feature Flag Detector
Purpose
- Scans all active feature flags weekly to identify staleness conditions
- Detects flags past their configured TTL (time-to-live)
- Identifies flags at 100% rollout for more than 14 days (should be promoted to a permanent feature or removed)
- Identifies flags at 0% rollout for more than 30 days (legacy flags, likely forgotten)
- Detects flags with missing owner assignment (orphaned flags)
- Auto-creates cleanup tickets with recommendations for flag disposition
- Maintains flag hygiene and prevents technical debt accumulation
Trigger
Event Type: Scheduled Cron
Schedule: Weekly, Monday 9:00 AM UTC
Blocking: No (non-blocking background job)
Timeout: 60 seconds
Frequency: Once per week (configurable to daily if needed)
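The weekly schedule above can be illustrated with a small helper that computes the next run time. This is only a sketch: the function name `next_run` is hypothetical, and a real deployment would normally leave this calculation to the scheduler itself.

```python
from datetime import datetime, timedelta, timezone


def next_run(now: datetime) -> datetime:
    """Return the next Monday 09:00 UTC strictly after `now`.

    `now` is expected to be timezone-aware; it is converted to UTC first.
    """
    now = now.astimezone(timezone.utc)
    candidate = now.replace(hour=9, minute=0, second=0, microsecond=0)
    # Monday is weekday 0; move forward to the Monday of this or next week.
    candidate += timedelta(days=(0 - candidate.weekday()) % 7)
    if candidate <= now:
        # Already at or past this week's slot, so use next week's.
        candidate += timedelta(days=7)
    return candidate
```

A run that starts exactly at Monday 09:00 UTC gets the following Monday as its next slot, which matches the once-per-week frequency.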
Behavior
When Triggered
- Connects to feature flag management system (Unleash, LaunchDarkly, or custom)
- Fetches all flags with metadata (creation date, last modified, TTL, current rollout %, owner)
- Evaluates each flag against staleness criteria:
  - Flag age > TTL (if TTL is set)
  - Flag at 100% rollout for > 14 days
  - Flag at 0% rollout for > 30 days
  - Flag missing owner field or owner is inactive
- For each stale flag detected:
  - Calculates staleness severity (1-3: low, medium, high)
  - Generates recommendation (promote to permanent feature, remove, or assign owner)
  - Creates cleanup ticket with stale flag metadata
- Batches tickets by recommendation type
- Sends summary report to #flag-hygiene Slack channel with counts by category
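The evaluation steps above can be sketched roughly as follows. The `Flag` dataclass, its field names (in particular `rollout_changed_at`), the reason keys, the severity rule (one point per matched criterion, capped at 3), and the precedence among recommendations are all assumptions made for illustration; the source specifies only the criteria, the 1-3 severity scale, and the three recommendation types.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Flag:
    # Hypothetical shape; a real provider's metadata will differ.
    name: str
    created_at: datetime
    rollout_percent: int
    rollout_changed_at: datetime  # assumed: when the rollout % last changed
    ttl_days: Optional[int] = None
    owner: Optional[str] = None
    owner_active: bool = True


def staleness_reasons(flag: Flag, now: datetime,
                      full_days: int = 14, zero_days: int = 30) -> List[str]:
    """Return the staleness criteria that the flag currently matches."""
    reasons = []
    if flag.ttl_days is not None and now - flag.created_at > timedelta(days=flag.ttl_days):
        reasons.append("ttl_exceeded")
    held = now - flag.rollout_changed_at
    if flag.rollout_percent == 100 and held > timedelta(days=full_days):
        reasons.append("full_rollout")
    if flag.rollout_percent == 0 and held > timedelta(days=zero_days):
        reasons.append("zero_rollout")
    if not flag.owner or not flag.owner_active:
        reasons.append("orphaned")
    return reasons


def severity(reasons: List[str]) -> int:
    """Assumed rule: one point per matched criterion, capped at 3 (0 = not stale)."""
    return min(3, len(reasons))


def recommend(reasons: List[str]) -> Optional[str]:
    """Assumed precedence: find an owner first, then promote, then remove."""
    if not reasons:
        return None
    if "orphaned" in reasons:
        return "orphaned-flag"
    if "full_rollout" in reasons:
        return "promote-to-permanent"
    return "remove-flag"
```

Keeping the criteria in one pure function makes the thresholds easy to unit test and to override per environment.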
Configuration
# .coditect/config/stale-flag-detector.json
{
  "enabled": true,
  "schedule": "0 9 ? * MON",
  "timeout_seconds": 60,
  "feature_flag_provider": "unleash",
  "feature_flag_api_url": "https://flags.api.internal/",
  "staleness_criteria": {
    "ttl_exceeded": true,
    "max_rollout_days": 14,
    "min_rollout_days": 30,
    "require_owner": true,
    "owner_inactivity_threshold_days": 30
  },
  "ticket_creation": {
    "enabled": true,
    "project_key": "INFRA",
    "label": "flag-cleanup",
    "priority": "LOW",
    "assignee_team": "platform"
  },
  "notifications": {
    "slack_channel": "#flag-hygiene",
    "include_details": true,
    "include_metrics": true
  },
  "exclusions": {
    "flag_name_patterns": [
      "internal_*",
      "experiment_*"
    ]
  }
}
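As an illustration of how the `exclusions.flag_name_patterns` entries might be applied, the sketch below treats them as shell-style globs and matches them with Python's standard `fnmatch` module. The glob interpretation and the helper name `is_excluded` are assumptions; the configuration itself does not say which pattern syntax is used.

```python
from fnmatch import fnmatchcase
from typing import Iterable

# Patterns copied from the "exclusions" section of the config above.
EXCLUDED_PATTERNS = ["internal_*", "experiment_*"]


def is_excluded(flag_name: str, patterns: Iterable[str] = EXCLUDED_PATTERNS) -> bool:
    """Return True if the flag name matches any exclusion glob (case-sensitive)."""
    return any(fnmatchcase(flag_name, pattern) for pattern in patterns)
```

Under this interpretation `internal_billing` is skipped, while `checkout_v2` and `my_internal_flag` are still scanned, because the pattern has to match the whole name.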
Integration
Integrates with:
- Feature flag management platform (Unleash, LaunchDarkly, custom API)
- Issue tracking system (Jira, GitHub Issues) for ticket creation
- Slack API for notifications
- Internal database for historical flag metadata
Output
Ticket Fields:
- Title: [FLAG CLEANUP] {flag_name} - {recommendation_type}
- Description: Markdown report with:
  - Flag name, creation date, last modified date
  - Current rollout percentage and audience
  - TTL status (if applicable)
  - Owner name and last activity
  - Recommendation with disposal options
  - Estimated cleanup time
- Labels: flag-cleanup, {recommendation_type} (e.g., promote-to-permanent, remove-flag, orphaned-flag)
- Priority: LOW (MEDIUM if S1/S2 system affected)
- Assignee: Flag owner or Platform team
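A minimal sketch of how these ticket fields could be assembled is shown below. The function `build_ticket`, its parameters, and the returned dictionary shape are hypothetical; this is not the payload format of Jira or GitHub Issues, and the Markdown description is omitted for brevity.

```python
from typing import Dict, Optional


def build_ticket(flag_name: str, recommendation_type: str,
                 owner: Optional[str] = None,
                 affects_s1_s2: bool = False) -> Dict[str, object]:
    """Assemble the ticket fields listed above into a plain dictionary."""
    return {
        "title": f"[FLAG CLEANUP] {flag_name} - {recommendation_type}",
        "labels": ["flag-cleanup", recommendation_type],
        # LOW by default, MEDIUM when an S1/S2 system is affected.
        "priority": "MEDIUM" if affects_s1_s2 else "LOW",
        # Fall back to the platform team when the flag has no owner.
        "assignee": owner or "platform",
    }
```

For example, `build_ticket("checkout_v2", "promote-to-permanent", owner="alice")` yields a LOW priority ticket assigned to `alice`.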
Slack Notification Format:
Weekly Flag Hygiene Report (Mon 9am)
Total flags scanned: 487
Stale flags detected: 23
- Past TTL: 8
- At 100% > 14d: 7
- At 0% > 30d: 5
- Missing owner: 3
Tickets created: 23
Action required: Review and prioritize cleanup tasks
Top candidates for removal: {flag_names}
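The report text above could be rendered along these lines. This is a sketch under assumptions: each stale flag is counted under one primary reason (so the category counts add up to the total, as in the sample), and the reason keys, `CATEGORY_LABELS`, and `format_summary` are illustrative names. Posting the text to Slack is left out.

```python
from collections import Counter
from typing import Dict, List

# Assumed reason keys mapped to the category labels used in the report.
CATEGORY_LABELS = {
    "ttl_exceeded": "Past TTL",
    "full_rollout": "At 100% > 14d",
    "zero_rollout": "At 0% > 30d",
    "orphaned": "Missing owner",
}


def format_summary(total_scanned: int, primary_reason: Dict[str, str],
                   tickets_created: int, top_candidates: List[str]) -> str:
    """Render the weekly report; `primary_reason` maps flag name to reason key."""
    counts = Counter(primary_reason.values())
    lines = [
        "Weekly Flag Hygiene Report (Mon 9am)",
        f"Total flags scanned: {total_scanned}",
        f"Stale flags detected: {len(primary_reason)}",
    ]
    for key, label in CATEGORY_LABELS.items():
        lines.append(f"- {label}: {counts[key]}")  # a missing key counts as 0
    lines.append(f"Tickets created: {tickets_created}")
    if primary_reason:
        lines.append("Action required: Review and prioritize cleanup tasks")
        lines.append("Top candidates for removal: " + ", ".join(top_candidates))
    return "\n".join(lines)
```

When no stale flags are found, the action lines are dropped and every category reports 0.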
Failure Handling
| Failure Scenario | Handling |
|---|---|
| Feature flag API unreachable | Retry with exponential backoff (3 attempts), skip run if all fail, alert #platform-oncall |
| Issue tracker connection fails | Log error, defer ticket creation, send alert to team lead |
| Slack notification fails | Log error, continue (non-fatal) |
| No stale flags detected | Send success notification with "0 issues found" |
| Configuration invalid | Fail fast and alert #platform-oncall with error details |
Retry Logic: 3 retries with exponential backoff (2s, 5s, 10s)
Alert Channel: #platform-oncall for failures
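The retry behaviour could look roughly like the sketch below. It reads "3 retries" as one initial call followed by up to three retries with the listed delays; the table's "3 attempts" wording could also be read as three calls in total, so treat that as an assumption. The helper name `call_with_retries` and the choice of `ConnectionError` as the retryable error are likewise illustrative.

```python
import time
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def call_with_retries(operation: Callable[[], T],
                      delays: Sequence[float] = (2, 5, 10),
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call `operation`, retrying on ConnectionError after each delay in `delays`."""
    for attempt in range(len(delays) + 1):
        try:
            return operation()
        except ConnectionError:
            if attempt == len(delays):
                # Retries exhausted: let the caller skip the run and alert.
                raise
            sleep(delays[attempt])
    raise RuntimeError("unreachable")  # the loop always returns or raises
```

Injecting `sleep` keeps the helper testable without real waiting; the caller would catch the final `ConnectionError`, skip the run, and alert #platform-oncall.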
Related Hooks
| Hook | Relationship | Purpose |
|---|---|---|
| feature-flag-deployment-validator | Complementary | Validates flags during deployment |
| flag-rollout-monitor | Upstream | Monitors active flag rollouts in real-time |
| feature-flag-cleanup-scheduler | Downstream | Executes actual cleanup based on detector findings |
| incident-flag-correlation | Related | Correlates flags with incident patterns |
Principles
- Flag Hygiene First: Proactive detection prevents debt from accumulating
- Actionable Output: Every detection includes clear recommendation for remediation
- Minimal False Positives: Strict criteria and exclusion patterns to avoid noise
- Non-Blocking: Background job never blocks critical deployments or services
- Transparent Reporting: Summary reports visible to entire team for accountability
- Configuration-Driven: All criteria and thresholds configurable per environment
- Gradual Enforcement: Starts with detection and notifications before moving to automated enforcement
- Audit Trail: All flag state changes logged for compliance and root cause analysis