Secrets Detection Skill
How to Use This Skill
- Review the patterns and examples below
- Apply the relevant patterns to your implementation
- Follow the best practices outlined in this skill
Production-ready secrets detection skill for identifying and preventing credential exposure in source code, configuration files, and git history. Implements defense-in-depth with multiple detection engines.
When to Use This Skill
Use secrets-detection when:
- Setting up pre-commit hooks for new repositories
- Auditing existing codebases for exposed credentials
- Investigating potential security incidents
- Preparing for compliance audits (SOC 2, PCI-DSS)
- Onboarding third-party code or acquired repositories
- Validating .gitignore and secrets management
Don't use secrets-detection when:
- Full OWASP security audit needed (use the security-audit skill)
- Dependency vulnerability scanning (use the dependency-security skill)
- Runtime secrets management (use cloud secret managers)
- Encryption key generation or rotation
Detection Coverage
Secret Types Detected
| Category | Examples | Detection Rate |
|---|---|---|
| API Keys | AWS, GCP, Azure, Stripe, SendGrid | 98% |
| OAuth Tokens | GitHub, GitLab, Slack, Discord | 96% |
| Database Credentials | PostgreSQL, MySQL, MongoDB URIs | 95% |
| Private Keys | RSA, SSH, PGP, SSL certificates | 99% |
| Cloud Credentials | Service accounts, access keys | 97% |
| JWT Tokens | Bearer tokens, refresh tokens | 92% |
| Passwords | Hardcoded, config files, .env | 90% |
| Webhooks | Slack, Discord, custom endpoints | 94% |
File Types Scanned
- Source code: .py, .js, .ts, .go, .rs, .java
- Configuration: .yml, .yaml, .json, .toml, .ini
- Environment: .env, .env.*, docker-compose.yml
- Infrastructure: .tf, *.tfvars, cloudformation.yml
- Documentation: .md (for accidentally pasted secrets)
Instructions
Phase 1: Tool Setup
Objective: Install and configure detection tools.
- Install Gitleaks (primary scanner):
# macOS
brew install gitleaks
# Linux (release assets carry the version in their name, so pin one; v8.18.0 matches the pre-commit hook in Phase 4)
wget https://github.com/gitleaks/gitleaks/releases/download/v8.18.0/gitleaks_8.18.0_linux_x64.tar.gz
tar -xzf gitleaks_8.18.0_linux_x64.tar.gz
sudo mv gitleaks /usr/local/bin/
# Verify
gitleaks version
- Install TruffleHog (git history scanner):
# macOS
brew install trufflehog
# Docker (recommended for CI)
docker pull trufflesecurity/trufflehog:latest
# Note: pip install trufflehog installs the legacy v2 tool, which does not support the "trufflehog git ..." commands used in this skill
# Verify
trufflehog --version
- Install detect-secrets (baseline management):
pip install detect-secrets
# Verify
detect-secrets --version
Phase 2: Baseline Scan
Objective: Establish current state and identify existing secrets.
- Run full repository scan:
# Gitleaks scan (walks git history by default; add --no-git to scan only the working tree)
gitleaks detect --source . --report-format json --report-path gitleaks-report.json
# TruffleHog scan (including git history; writes one JSON object per line)
trufflehog git file://. --json > trufflehog-report.json
# detect-secrets baseline
detect-secrets scan --all-files > .secrets.baseline
- Review findings:
# Count findings by rule
jq 'group_by(.RuleID) | map({rule: .[0].RuleID, count: length})' gitleaks-report.json
- Classify findings:
  - True Positives: Actual secrets requiring rotation
  - False Positives: Test data, placeholders, examples
  - Already Rotated: Historical secrets; verify that rotation actually happened
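The review and classification steps above can be sketched in Python. This is a minimal illustration, assuming the Gitleaks report is a JSON array of objects with a `RuleID` field (as in the finding shown in Example 1); the label strings and function names are hypothetical and not part of any tool.

```python
from collections import Counter

# Hypothetical labels mirroring the three classes listed above.
TRUE_POSITIVE = "true_positive"
FALSE_POSITIVE = "false_positive"
ALREADY_ROTATED = "already_rotated"


def count_by_rule(findings):
    """Count findings per RuleID, most frequent rule first."""
    counts = Counter(f.get("RuleID", "unknown") for f in findings)
    return dict(counts.most_common())


def false_positive_rate(labels):
    """Fraction of reviewed findings that were labelled false positives."""
    if not labels:
        return 0.0
    return sum(1 for label in labels if label == FALSE_POSITIVE) / len(labels)


# Inline sample data; in practice the list would come from
# json.load(open("gitleaks-report.json")).
sample = [
    {"RuleID": "aws-access-key-id"},
    {"RuleID": "generic-api-key"},
    {"RuleID": "aws-access-key-id"},
]
print(count_by_rule(sample))
```

Tracking the false positive share per review round gives a concrete number to compare against the tuning targets stated later in this skill.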
Phase 3: Remediation
Objective: Remove secrets and prevent future exposure.
- For active secrets:
# Step 1: Rotate the secret (in cloud console/service)
# Step 2: Update application to use secret manager
# Step 3: Remove from code
# Step 4: Consider git history cleanup (optional)
- Update .gitignore:
# Secrets and credentials
.env
.env.*
!.env.example
*.pem
*.key
*credentials*.json
*secret*.json
# Do not ignore .secrets.baseline: the detect-secrets pre-commit hook reads it, so it must be committed
- Git history cleanup (if required):
# WARNING: Rewrites history, coordinate with team
# Use BFG Repo-Cleaner for large repos
bfg --delete-files "*.pem"
bfg --replace-text passwords.txt
# Or git filter-branch for specific files (the Git documentation now recommends git filter-repo instead)
git filter-branch --force --index-filter \
  'git rm --cached --ignore-unmatch path/to/secret.json' \
  --prune-empty --tag-name-filter cat -- --all
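The `passwords.txt` file passed to `bfg --replace-text` lists one literal per line. One way to build it is from the `Secret` values in the Gitleaks report; the sketch below assumes that report shape, and the function name is hypothetical. Multi-line secrets (such as private keys) and values containing `==>` (which BFG treats as a replacement separator) are skipped, because a line-based file cannot represent them safely.

```python
def bfg_replacement_lines(findings):
    """Return sorted, de-duplicated secret literals suitable for a BFG replace-text file."""
    secrets = set()
    for finding in findings:
        secret = finding.get("Secret")
        if not secret:
            continue
        # Skip values a line-based file cannot hold unambiguously.
        if "\n" in secret or "==>" in secret:
            continue
        secrets.add(secret)
    return sorted(secrets)


sample = [
    {"Secret": "AKIAIOSFODNN7EXAMPLE"},
    {"Secret": "AKIAIOSFODNN7EXAMPLE"},          # duplicate from another commit
    {"Secret": "-----BEGIN KEY-----\nabc\n"},    # multi-line, skipped
    {"RuleID": "no-secret-field"},
]
print("\n".join(bfg_replacement_lines(sample)))
```

The resulting file contains live or formerly live secrets, so it should stay outside the repository and be deleted once the history rewrite is finished.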
Phase 4: Prevention Setup
Objective: Implement pre-commit hooks and CI integration.
- Configure pre-commit hook:
# .pre-commit-config.yaml
repos:
  - repo: https://github.com/gitleaks/gitleaks
    rev: v8.18.0
    hooks:
      - id: gitleaks
  - repo: https://github.com/Yelp/detect-secrets
    rev: v1.4.0
    hooks:
      - id: detect-secrets
        args: ['--baseline', '.secrets.baseline']
- Install hooks:
pre-commit install
pre-commit run --all-files # Test on existing code
- CI/CD integration:
# GitHub Actions (job definition; place it under jobs: in a workflow file)
secrets-scan:
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
      with:
        fetch-depth: 0 # Full history for TruffleHog
    - name: Gitleaks Scan
      uses: gitleaks/gitleaks-action@v2
      env:
        GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
    - name: TruffleHog Scan
      uses: trufflesecurity/trufflehog@main
      with:
        extra_args: --only-verified
Examples
Example 1: AWS Credentials Detection
Context: Scanning Python project for AWS access keys
Scan Command:
gitleaks detect --source . -v
Finding:
{
  "Description": "AWS Access Key",
  "StartLine": 15,
  "EndLine": 15,
  "StartColumn": 1,
  "EndColumn": 40,
  "Match": "AKIA...",
  "Secret": "AKIAIOSFODNN7EXAMPLE",
  "File": "config/settings.py",
  "Commit": "abc123...",
  "Entropy": 3.5,
  "RuleID": "aws-access-key-id"
}
Remediation:
# BEFORE (insecure)
AWS_ACCESS_KEY = "AKIAIOSFODNN7EXAMPLE"
AWS_SECRET_KEY = "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
# AFTER (secure)
import boto3
# Use IAM roles (preferred) or environment variables.
# boto3 resolves credentials automatically, checking these sources (among others) in order:
# 1. Environment variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY)
# 2. Shared credentials file (~/.aws/credentials)
# 3. IAM role for EC2/ECS/Lambda
client = boto3.client('s3')  # No credentials in code
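To make the detection side of this example concrete, here is a rough Python sketch of pattern-based matching for AWS access key IDs. It is an illustration only, not the rule Gitleaks ships: the regex assumes the common 20-character format with an `AKIA` (long-term) or `ASIA` (temporary) prefix, and the function name is hypothetical.

```python
import re

# Simplified pattern: 4-character prefix followed by 16 uppercase letters or digits.
AWS_ACCESS_KEY_ID = re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b")


def find_aws_key_ids(text):
    """Return (line_number, key_id) pairs for every match, with 1-based line numbers."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in AWS_ACCESS_KEY_ID.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits


source = 'import os\nAWS_ACCESS_KEY = "AKIAIOSFODNN7EXAMPLE"\n'
print(find_aws_key_ids(source))
```

A pattern like this flags the key ID but not the paired secret key, which has no fixed prefix; dedicated scanners typically combine keyword context and entropy checks for that half.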
Example 2: Database URI with Embedded Password
Context: Docker Compose configuration
Finding:
# docker-compose.yml (INSECURE)
services:
  app:
    environment:
      - DATABASE_URL=postgresql://admin:SuperSecret123!@db:5432/myapp
Remediation:
# docker-compose.yml (SECURE)
services:
  app:
    environment:
      - DATABASE_URL=postgresql://${DB_USER}:${DB_PASSWORD}@db:5432/myapp
    env_file:
      - .env # Not committed to git
# .env (in .gitignore)
DB_USER=admin
DB_PASSWORD=SuperSecret123!
# .env.example (committed, for documentation)
DB_USER=your_username
DB_PASSWORD=your_password
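An alternative to interpolating the URI in Compose is to pass `DB_USER` and `DB_PASSWORD` to the container and assemble the URI in application code. The sketch below assumes those two variable names from the example above; the function name and the default host, port, and database are illustrative. Percent-encoding matters because characters such as `@`, `:` or `/` in a password would otherwise corrupt the URI.

```python
import os
from urllib.parse import quote


def database_url(env=os.environ, host="db", port=5432, name="myapp"):
    """Build a PostgreSQL URI from environment values, percent-encoding the credentials."""
    user = quote(env["DB_USER"], safe="")          # KeyError if the variable is missing
    password = quote(env["DB_PASSWORD"], safe="")
    return f"postgresql://{user}:{password}@{host}:{port}/{name}"


print(database_url({"DB_USER": "admin", "DB_PASSWORD": "p@ss:word/1"}))
```

Failing fast with a `KeyError` when a variable is absent is a deliberate choice here: a missing credential surfaces at startup instead of as a confusing connection error.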
Example 3: Git History Secret Exposure
Context: API key committed 6 months ago, since removed from code
Scan Command:
trufflehog git file://. --since-commit=HEAD~100
Finding:
{
  "SourceMetadata": {
    "Data": {
      "Git": {
        "commit": "a1b2c3d4...",
        "file": "src/api/client.ts",
        "line": 12,
        "timestamp": "2024-06-15T10:30:00Z"
      }
    }
  },
  "DetectorName": "Stripe",
  "Verified": true,
  "Raw": "sk_live_51..."
}
Remediation:
- Immediate: Rotate the Stripe API key in dashboard
- Short-term: Update the application to load the Stripe key from a secret manager instead of source code
- Optional: Clean git history with BFG if required by compliance
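When triaging TruffleHog output, verified findings deserve attention first. The sketch below assumes the `--json` output is one JSON object per line with the fields shown in the finding above (`Verified`, `DetectorName`, `SourceMetadata.Data.Git`); the function name is hypothetical. It deliberately leaves out the `Raw` value so the summary can be shared without repeating the secret.

```python
import json


def verified_findings(jsonl_text):
    """Extract detector and git location for each verified finding in JSON-lines output."""
    results = []
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # Ignore anything that is not a JSON record
        if not isinstance(record, dict) or not record.get("Verified"):
            continue
        git = record.get("SourceMetadata", {}).get("Data", {}).get("Git", {})
        results.append({
            "detector": record.get("DetectorName"),
            "commit": git.get("commit"),
            "file": git.get("file"),
            "line": git.get("line"),
        })
    return results


sample = "\n".join([
    json.dumps({"DetectorName": "Stripe", "Verified": True,
                "SourceMetadata": {"Data": {"Git": {
                    "commit": "a1b2c3d4", "file": "src/api/client.ts", "line": 12}}}}),
    json.dumps({"DetectorName": "Generic", "Verified": False}),
    "not json",
])
print(verified_findings(sample))
```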
Gitleaks Configuration
Custom Rules (.gitleaks.toml)
title = "Custom Gitleaks Config"
[allowlist]
description = "Global allowlist"
paths = [
'''(^|/)\.secrets\.baseline$''',
'''(^|/)test[s]?/''',
'''(^|/)__mocks__/''',
'''\.example$''',
]
[[rules]]
id = "custom-internal-api-key"
description = "Internal API Key Pattern"
regex = '''INTERNAL_API_KEY[=:]\s*['"]?([A-Za-z0-9_-]{32,})['"]?'''
entropy = 3.5
secretGroup = 1
tags = ["internal", "api-key"]
[[rules]]
id = "custom-jwt-token"
description = "JWT Token Detection"
regex = '''eyJ[A-Za-z0-9_-]*\.eyJ[A-Za-z0-9_-]*\.[A-Za-z0-9_-]*'''
tags = ["jwt", "token"]
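Custom rules are easier to tune when they can be exercised outside the scanner. The following Python sketch approximates how the `custom-internal-api-key` rule behaves: it applies the same regex, takes capture group 1 as the secret, and discards matches whose Shannon entropy falls below the threshold. It is an approximation for experimentation, not Gitleaks' implementation, and the function names are hypothetical.

```python
import math
import re
from collections import Counter

# Same expression as the custom-internal-api-key rule above.
INTERNAL_KEY = re.compile(r"""INTERNAL_API_KEY[=:]\s*['"]?([A-Za-z0-9_-]{32,})['"]?""")


def shannon_entropy(value):
    """Shannon entropy of a string in bits per character."""
    if not value:
        return 0.0
    total = len(value)
    counts = Counter(value)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def match_internal_key(line, min_entropy=3.5):
    """Return the captured secret if the line matches and is random-looking enough."""
    match = INTERNAL_KEY.search(line)
    if match is None:
        return None
    secret = match.group(1)
    return secret if shannon_entropy(secret) >= min_entropy else None


key = "0123456789abcdefghijklmnopqrstuv"  # 32 distinct characters, clearly fake
print(match_internal_key(f'INTERNAL_API_KEY="{key}"'))
print(match_internal_key('INTERNAL_API_KEY="' + "a" * 32 + '"'))
```

One thing this makes visible: the expression requires `=` or `:` immediately after the name, so an assignment written as `INTERNAL_API_KEY = "..."` (with a space before the equals sign) does not match. Adding `\s*` before `[=:]` would cover that style if it occurs in the codebase.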
Allowlist Management
# Per-rule allowlist: place this table directly beneath the [[rules]] entry it applies to.
# The Gitleaks v8.18 configuration format (the version pinned in Phase 4) documents a single [rules.allowlist] table per rule.
[rules.allowlist]
# Allowlist specific files
paths = [
  '''tests/fixtures/''',
  '''docs/examples/''',
]
# Allowlist specific patterns (false positives)
regexes = [
  '''EXAMPLE_KEY''',
  '''PLACEHOLDER''',
  '''sk_test_''', # Stripe test keys are safe
]
# Allowlist specific commits (already remediated)
commits = [
  "abc123def456...",
]
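The effect of an allowlist can be previewed against an existing report before changing the scanner configuration. This sketch assumes findings shaped like the Gitleaks JSON in Example 1 (`File`, `Secret`, `Commit`) and assumes that allowlist regexes are tested against the secret value; the function name is hypothetical, and the real matching semantics may differ in detail.

```python
import re


def apply_allowlist(findings, path_patterns=(), secret_patterns=(), commits=()):
    """Return the findings that survive path, secret-pattern, and commit allowlists."""
    path_res = [re.compile(p) for p in path_patterns]
    secret_res = [re.compile(p) for p in secret_patterns]
    commit_set = set(commits)
    kept = []
    for finding in findings:
        if finding.get("Commit") in commit_set:
            continue
        if any(r.search(finding.get("File", "")) for r in path_res):
            continue
        if any(r.search(finding.get("Secret", "")) for r in secret_res):
            continue
        kept.append(finding)
    return kept


findings = [
    {"File": "tests/fixtures/keys.py", "Secret": "abc", "Commit": "c1"},
    {"File": "src/pay.py", "Secret": "sk_test_123", "Commit": "c2"},
    {"File": "src/old.py", "Secret": "xyz", "Commit": "abc123def456"},
    {"File": "src/live.py", "Secret": "sk_live_999", "Commit": "c4"},
]
remaining = apply_allowlist(
    findings,
    path_patterns=[r"tests/fixtures/"],
    secret_patterns=[r"sk_test_"],
    commits=["abc123def456"],
)
print(remaining)
```

Reviewing what an allowlist would suppress, as well as what it keeps, helps avoid hiding real secrets behind an overly broad pattern.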
Integration
Related Components
- Skill: security-audit - Full OWASP security scanning
- Skill: dependency-security - Vulnerable package detection
- Hook: pre-commit-hook - Git pre-commit integration
- Agent: security-specialist - Security review orchestration
Secret Manager Integration
# AWS Secrets Manager
import boto3

def get_secret(secret_name: str) -> str:
    client = boto3.client('secretsmanager')
    response = client.get_secret_value(SecretId=secret_name)
    return response['SecretString']

# GCP Secret Manager
from google.cloud import secretmanager

def get_gcp_secret(project_id: str, secret_id: str) -> str:
    client = secretmanager.SecretManagerServiceClient()
    name = f"projects/{project_id}/secrets/{secret_id}/versions/latest"
    response = client.access_secret_version(request={"name": name})
    return response.payload.data.decode("UTF-8")

# HashiCorp Vault (hvac falls back to the VAULT_TOKEN environment variable when no token is passed)
import hvac

def get_vault_secret(path: str) -> dict:
    client = hvac.Client(url='https://vault.example.com')
    secret = client.secrets.kv.v2.read_secret_version(path=path)
    return secret['data']['data']
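Secrets created as key/value pairs in AWS Secrets Manager come back as a JSON document in `SecretString`, while plain-text secrets come back as a bare string. A small helper can normalise both cases. This is a sketch under that assumption; the function name and the `"value"` key used for plain strings are hypothetical.

```python
import json


def parse_secret_string(secret_string):
    """Return a dict for JSON-object secrets, or wrap a plain string under 'value'."""
    try:
        parsed = json.loads(secret_string)
    except json.JSONDecodeError:
        return {"value": secret_string}
    # Only JSON objects are treated as structured secrets.
    if isinstance(parsed, dict):
        return parsed
    return {"value": secret_string}


print(sorted(parse_secret_string('{"username": "admin", "password": "example"}')))
print(parse_secret_string("plain-token-value"))
```

A caller could pass the return value of a function such as `get_secret` straight into this helper and then read individual fields by key.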
Troubleshooting
| Issue | Solution |
|---|---|
| High false positive rate | Tune .gitleaks.toml allowlist |
| Scan too slow | Use --no-git for file-only scan |
| Git history scan fails | Ensure fetch-depth: 0 in CI |
| detect-secrets baseline outdated | detect-secrets scan --baseline .secrets.baseline |
| Pre-commit hook bypassed | Git keeps no record of --no-verify usage, so treat the CI scan as the enforcement point |
Performance Guidelines
- Incremental scans: < 10 seconds for PR changes
- Full repo scan: < 2 minutes for 100K LOC
- History scan: < 10 minutes for 10K commits
- Memory usage: < 500MB for typical repository
Compliance Mapping
| Requirement | Control |
|---|---|
| SOC 2 CC6.1 | Pre-commit hooks prevent credential commits |
| PCI-DSS 6.5.3 | Secrets scanning in CI/CD pipeline |
| HIPAA 164.312 | Audit logs for detected secrets |
| GDPR Art. 32 | Encryption key management verification |
References
- Gitleaks Documentation
- TruffleHog Documentation
- detect-secrets Documentation
- OWASP Secrets Management Cheat Sheet
Success Output
When this skill completes successfully, you should see:
✅ SKILL COMPLETE: secrets-detection
Completed:
- [x] Detection tools installed and verified (gitleaks, trufflehog, detect-secrets)
- [x] Baseline scan completed with findings classified
- [x] Pre-commit hooks configured and tested
- [x] CI/CD integration deployed
- [x] All true positive secrets rotated
Outputs:
- gitleaks-report.json - Detailed findings report
- .secrets.baseline - False positive whitelist
- .pre-commit-config.yaml - Pre-commit hook configuration
- .gitleaks.toml - Custom detection rules
- context-storage/audit-logs/secrets-scan-YYYY-MM-DD.json
Detection Coverage: 95%+ for common secret types
False Positive Rate: <5% with tuned configuration
Completion Checklist
Before marking this skill as complete, verify:
- All detection tools installed: gitleaks version && trufflehog --version && detect-secrets --version
- Baseline scan completed: ls -lh gitleaks-report.json trufflehog-report.json .secrets.baseline
- Findings reviewed and classified (true/false positives documented)
- All true positive secrets rotated and verified in cloud consoles
- .gitignore updated with secrets patterns
- Pre-commit hooks installed: pre-commit run --all-files (passes)
- CI/CD pipeline includes secrets scanning step
- Team notified of new pre-commit requirements
- Custom .gitleaks.toml rules configured if needed
- Detection coverage validated on test secrets
Failure Indicators
This skill has FAILED if:
- ❌ Detection tools fail to install or show version errors
- ❌ Baseline scan produces no output or crashes
- ❌ High false positive rate (>20%) without tuning
- ❌ True positive secrets not rotated before remediation
- ❌ Commits made with --no-verify reach the remote without being caught by the CI scan
- ❌ CI/CD pipeline doesn't fail on secret detection
- ❌ Git history cleanup attempted without team coordination
- ❌ .gitignore missing critical secret file patterns
When NOT to Use
Do NOT use secrets-detection when:
- Full security audit needed - Use the security-audit skill for comprehensive OWASP scanning
- Dependency vulnerabilities - Use the dependency-security skill for package scanning
- Runtime secrets management - Use cloud secret managers (AWS Secrets Manager, GCP Secret Manager, Vault)
- Encryption key generation - Use dedicated cryptographic tools
- Simple credential lookup - Use manual search if scope is very limited
- No git repository - Secrets detection relies on git history and file scanning
- Immediate deployment urgency - Setup requires 1-2 hours minimum for proper configuration
Alternative Approaches:
- Simple grep for one-time searches: grep -r "AKIA" --include="*.py" .
- Cloud-native scanning: AWS Macie, GCP DLP API for production data
- Manual code review for small changes (<10 files)
Anti-Patterns (Avoid)
| Anti-Pattern | Problem | Solution |
|---|---|---|
| Running scan without classification | All findings treated equally, wasted remediation effort | Always classify true/false positives first |
| Removing secrets without rotating them | The secret stays valid and recoverable from git history | Rotate FIRST, then remove from code and (if required) history |
| Skipping .gitignore update | Secrets reintroduced after cleanup | Add comprehensive ignore patterns |
| Bypassing pre-commit hooks | git commit --no-verify becomes habit | Enforce hooks, audit bypass usage |
| Git history cleanup without coordination | Team force-pull issues, lost work | Coordinate with team, use protected branches |
| Over-tuning allowlist | Real secrets missed | Only allowlist verified false positives |
| Ignoring high false positives | Tool becomes noise, ignored by team | Tune configuration to <5% false positive rate |
| No CI/CD integration | Commits that skip local hooks are never scanned | Always add scanning to CI pipeline |
| Using only one tool | Single tool blind spots | Multi-engine approach (gitleaks + trufflehog + detect-secrets) |
| Forgetting test files | Test secrets exposed | Scan test directories, use clearly fake values |
Principles
This skill embodies these CODITECT principles:
- Defense in Depth - Multiple detection engines (gitleaks, trufflehog, detect-secrets) prevent single-tool blind spots
- Shift Left Security - Pre-commit hooks catch secrets before they reach remote
- Automation First - Automated scanning eliminates human error in detection
- No Assumptions - Classify every finding, never assume false positive without verification
- Clear, Explainable - Detection rules documented in .gitleaks.toml with rationale
- Separation of Concerns - Detection (this skill) separate from secret management (cloud providers)
- Recycle → Extend → Re-Use - Leverage industry-standard tools, extend with custom rules
- Continuous Improvement - Tune allowlists based on false positive analysis
Full Principles: CODITECT-STANDARD-AUTOMATION.md
Status: Production-ready
Detection Rate: 95%+ for common secret types
False Positive Rate: <5% with tuned configuration
Integration: Pre-commit hooks, CI/CD pipelines