Secrets Detection Skill

How to Use This Skill

  1. Review the patterns and examples below
  2. Apply the relevant patterns to your implementation
  3. Follow the best practices outlined in this skill

Production-ready secrets detection skill for identifying and preventing credential exposure in source code, configuration files, and git history. Implements defense-in-depth with multiple detection engines.

When to Use This Skill

Use secrets-detection when:

  • Setting up pre-commit hooks for new repositories
  • Auditing existing codebases for exposed credentials
  • Investigating potential security incidents
  • Preparing for compliance audits (SOC 2, PCI-DSS)
  • Onboarding third-party code or acquired repositories
  • Validating .gitignore and secrets management

Don't use secrets-detection when:

  • Full OWASP security audit needed (use security-audit skill)
  • Dependency vulnerability scanning (use dependency-security skill)
  • Runtime secrets management (use cloud secret managers)
  • Encryption key generation or rotation

Detection Coverage

Secret Types Detected

| Category             | Examples                            | Detection Rate |
|----------------------|-------------------------------------|----------------|
| API Keys             | AWS, GCP, Azure, Stripe, SendGrid   | 98%            |
| OAuth Tokens         | GitHub, GitLab, Slack, Discord      | 96%            |
| Database Credentials | PostgreSQL, MySQL, MongoDB URIs     | 95%            |
| Private Keys         | RSA, SSH, PGP, SSL certificates     | 99%            |
| Cloud Credentials    | Service accounts, access keys       | 97%            |
| JWT Tokens           | Bearer tokens, refresh tokens       | 92%            |
| Passwords            | Hardcoded, config files, .env       | 90%            |
| Webhooks             | Slack, Discord, custom endpoints    | 94%            |

File Types Scanned

  • Source code: .py, .js, .ts, .go, .rs, .java
  • Configuration: .yml, .yaml, .json, .toml, .ini
  • Environment: .env, .env.*, docker-compose.yml
  • Infrastructure: .tf, *.tfvars, cloudformation.yml
  • Documentation: .md (for accidentally pasted secrets)

Instructions

Phase 1: Tool Setup

Objective: Install and configure detection tools.

  1. Install Gitleaks (primary scanner):

    # macOS
    brew install gitleaks

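    # Linux (release assets carry the version in their name; set VERSION to the release you want)
    VERSION=8.18.0
    wget https://github.com/gitleaks/gitleaks/releases/download/v${VERSION}/gitleaks_${VERSION}_linux_x64.tar.gz
    tar -xzf gitleaks_${VERSION}_linux_x64.tar.gz
    sudo mv gitleaks /usr/local/bin/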

    # Verify
    gitleaks version
  2. Install TruffleHog (git history scanner):

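    # macOS (the PyPI "trufflehog" package is the legacy v2 tool and does not
    # support the v3 commands used in this skill)
    brew install trufflehog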

    # Docker (recommended for CI)
    docker pull trufflesecurity/trufflehog:latest

    # Verify
    trufflehog --version
  3. Install detect-secrets (baseline management):

    pip install detect-secrets

    # Verify
    detect-secrets --version

Phase 2: Baseline Scan

Objective: Establish current state and identify existing secrets.

  1. Run full repository scan:

    # Gitleaks scan (walks the git history by default; add --no-git to scan only the working tree)
    gitleaks detect --source . --report-format json --report-path gitleaks-report.json

    # TruffleHog scan (including git history)
    trufflehog git file://. --json > trufflehog-report.json

    # detect-secrets baseline
    detect-secrets scan --all-files > .secrets.baseline
  2. Review findings:

    # Count findings by rule
    jq 'group_by(.RuleID) | map({rule: .[0].RuleID, count: length})' gitleaks-report.json
  3. Classify findings:

    • True Positives: Actual secrets requiring rotation
    • False Positives: Test data, placeholders, examples
    • Already Rotated: Historical secrets, verify rotation
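
Grouping the report before review can make classification faster. The following is a minimal sketch, assuming gitleaks-report.json is a JSON array whose entries carry the RuleID and File fields that appear in Gitleaks findings; summarize_findings and load_report are hypothetical helper names, not part of any tool.

```python
import json
from collections import Counter
from pathlib import Path


def summarize_findings(findings: list[dict]) -> dict[str, Counter]:
    """Count findings per rule and per file to guide classification."""
    return {
        "by_rule": Counter(f.get("RuleID", "unknown") for f in findings),
        "by_file": Counter(f.get("File", "unknown") for f in findings),
    }


def load_report(path: str) -> list[dict]:
    """Load a Gitleaks JSON report; a missing or empty file yields no findings."""
    report = Path(path)
    if not report.exists() or report.stat().st_size == 0:
        return []
    return json.loads(report.read_text()) or []


if __name__ == "__main__":
    summary = summarize_findings(load_report("gitleaks-report.json"))
    for rule, count in summary["by_rule"].most_common():
        print(f"{count:4d}  {rule}")
```

Files with many findings under a single rule are often fixtures or generated data, so they are a reasonable place to start looking for false positives. Each finding still needs individual verification.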

Phase 3: Remediation

Objective: Remove secrets and prevent future exposure.

  1. For active secrets:

    # Step 1: Rotate the secret (in cloud console/service)
    # Step 2: Update application to use secret manager
    # Step 3: Remove from code
    # Step 4: Consider git history cleanup (optional)
  2. Update .gitignore:

    # Secrets and credentials
    .env
    .env.*
    !.env.example
    *.pem
    *.key
    *credentials*.json
    *secret*.json
    # Keep .secrets.baseline tracked: the detect-secrets pre-commit hook reads it
  3. Git history cleanup (if required):

    # WARNING: Rewrites history, coordinate with team
    # Use BFG Repo-Cleaner for large repos
    bfg --delete-files "*.pem"
    bfg --replace-text passwords.txt

    # Or git filter-branch for specific files (the Git docs now recommend git filter-repo instead)
    git filter-branch --force --index-filter \
      'git rm --cached --ignore-unmatch path/to/secret.json' \
      --prune-empty --tag-name-filter cat -- --all
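
The passwords.txt file passed to bfg --replace-text can be derived from the scan results rather than typed by hand. This is a rough sketch under two assumptions: that BFG treats each line of the file as a literal string to replace, and that each Gitleaks finding exposes the raw value in a Secret field. build_replace_text is a hypothetical helper.

```python
def build_replace_text(findings: list[dict]) -> str:
    """Return one line per unique secret, in first-seen order.

    Multi-line values (for example private keys) are skipped because the
    file format is line based; remove those files with --delete-files instead.
    """
    seen: dict[str, None] = {}
    for finding in findings:
        secret = finding.get("Secret", "")
        if secret and "\n" not in secret:
            seen.setdefault(secret, None)
    return "".join(f"{secret}\n" for secret in seen)
```

The resulting text contains live secrets, so write it outside the repository and delete it once the cleanup is finished.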

Phase 4: Prevention Setup

Objective: Implement pre-commit hooks and CI integration.

  1. Configure pre-commit hook:

    # .pre-commit-config.yaml
    repos:
      - repo: https://github.com/gitleaks/gitleaks
        rev: v8.18.0
        hooks:
          - id: gitleaks

      - repo: https://github.com/Yelp/detect-secrets
        rev: v1.4.0
        hooks:
          - id: detect-secrets
            args: ['--baseline', '.secrets.baseline']
  2. Install hooks:

    pre-commit install
    pre-commit run --all-files # Test on existing code
  3. CI/CD integration:

    # GitHub Actions (job definition; place it under jobs: in a workflow file)
    secrets-scan:
      runs-on: ubuntu-latest
      steps:
        - uses: actions/checkout@v4
          with:
            fetch-depth: 0 # Full history for TruffleHog

        - name: Gitleaks Scan
          uses: gitleaks/gitleaks-action@v2
          env:
            GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

        - name: TruffleHog Scan
          uses: trufflesecurity/trufflehog@main
          with:
            extra_args: --only-verified

Examples

Example 1: AWS Credentials Detection

Context: Scanning Python project for AWS access keys

Scan Command:

gitleaks detect --source . -v

Finding:

{
  "Description": "AWS Access Key",
  "StartLine": 15,
  "EndLine": 15,
  "StartColumn": 1,
  "EndColumn": 40,
  "Match": "AKIA...",
  "Secret": "AKIAIOSFODNN7EXAMPLE",
  "File": "config/settings.py",
  "Commit": "abc123...",
  "Entropy": 3.5,
  "RuleID": "aws-access-key-id"
}

Remediation:

# BEFORE (insecure)
AWS_ACCESS_KEY = "AKIAIOSFODNN7EXAMPLE"
AWS_SECRET_KEY = "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"

# AFTER (secure)
import boto3

# Use IAM roles (preferred) or environment variables.
# boto3 resolves credentials automatically from, among other sources:
# 1. Environment variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY)
# 2. Shared credentials file (~/.aws/credentials)
# 3. IAM role for EC2/ECS/Lambda

client = boto3.client('s3')  # No credentials in code
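
To illustrate how a pattern-based rule flags the BEFORE snippet, here is a simplified sketch. The regex covers only the AKIA prefix followed by 16 uppercase letters or digits; the rule shipped with Gitleaks is broader, and find_aws_key_ids is a hypothetical helper written for this example.

```python
import re

# Simplified pattern: "AKIA" followed by 16 uppercase letters or digits.
AWS_ACCESS_KEY_ID = re.compile(r"\b(AKIA[0-9A-Z]{16})\b")


def find_aws_key_ids(text: str) -> list[tuple[int, str]]:
    """Return (line number, key ID) pairs for every match in the text."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in AWS_ACCESS_KEY_ID.finditer(line):
            hits.append((lineno, match.group(1)))
    return hits
```

A quick grep-style check like this suits one-off searches. It does not replace the scanners, which also handle git history, entropy checks, and many other credential formats.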

Example 2: Database URI with Embedded Password

Context: Docker Compose configuration

Finding:

# docker-compose.yml (INSECURE)
services:
  app:
    environment:
      - DATABASE_URL=postgresql://admin:SuperSecret123!@db:5432/myapp

Remediation:

# docker-compose.yml (SECURE)
services:
  app:
    environment:
      - DATABASE_URL=postgresql://${DB_USER}:${DB_PASSWORD}@db:5432/myapp
    env_file:
      - .env # Not committed to git

# .env (in .gitignore)
DB_USER=admin
DB_PASSWORD=SuperSecret123!

# .env.example (committed, for documentation)
DB_USER=your_username
DB_PASSWORD=your_password
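
On the application side, the same idea can be applied by assembling the URI from environment variables at startup. This is a sketch only: DB_USER and DB_PASSWORD come from the example above, while DB_HOST, DB_PORT, DB_NAME, their defaults, and the database_url helper are assumptions made for illustration. Percent-encoding the credentials keeps characters such as @ or / in a password from breaking the URI.

```python
import os
from collections.abc import Mapping
from urllib.parse import quote


def database_url(env: Mapping[str, str] = os.environ) -> str:
    """Build a PostgreSQL URI from environment variables.

    Raises KeyError if DB_USER or DB_PASSWORD is missing, so a
    misconfigured deployment fails at startup instead of at first query.
    """
    user = quote(env["DB_USER"], safe="")
    password = quote(env["DB_PASSWORD"], safe="")
    host = env.get("DB_HOST", "db")        # hypothetical variable
    port = env.get("DB_PORT", "5432")      # hypothetical variable
    name = env.get("DB_NAME", "myapp")     # hypothetical variable
    return f"postgresql://{user}:{password}@{host}:{port}/{name}"
```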

Example 3: Git History Secret Exposure

Context: API key committed 6 months ago, since removed from code

Scan Command:

trufflehog git file://. --since-commit=HEAD~100

Finding:

{
  "SourceMetadata": {
    "Data": {
      "Git": {
        "commit": "a1b2c3d4...",
        "file": "src/api/client.ts",
        "line": 12,
        "timestamp": "2024-06-15T10:30:00Z"
      }
    }
  },
  "DetectorName": "Stripe",
  "Verified": true,
  "Raw": "sk_live_51..."
}

Remediation:

  1. Immediate: Rotate the Stripe API key in dashboard
  2. Short-term: Update the application to load the Stripe key from a secret manager
  3. Optional: Clean git history with BFG if required by compliance

Gitleaks Configuration

Custom Rules (.gitleaks.toml)

title = "Custom Gitleaks Config"

[allowlist]
description = "Global allowlist"
paths = [
'''(^|/)\.secrets\.baseline$''',
'''(^|/)test[s]?/''',
'''(^|/)__mocks__/''',
'''\.example$''',
]

[[rules]]
id = "custom-internal-api-key"
description = "Internal API Key Pattern"
regex = '''INTERNAL_API_KEY[=:]\s*['"]?([A-Za-z0-9_-]{32,})['"]?'''
entropy = 3.5
secretGroup = 1
tags = ["internal", "api-key"]

[[rules]]
id = "custom-jwt-token"
description = "JWT Token Detection"
regex = '''eyJ[A-Za-z0-9_-]*\.eyJ[A-Za-z0-9_-]*\.[A-Za-z0-9_-]*'''
tags = ["jwt", "token"]
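
The entropy setting on the first rule filters out low-randomness matches such as repeated characters. As an illustration of the idea, the sketch below computes Shannon entropy in bits per character and applies the same regex and threshold in Python. It approximates the behaviour and is not Gitleaks' own code; the exact comparison Gitleaks performs may differ, and matches_rule is a hypothetical helper.

```python
import math
import re
from collections import Counter

# Same pattern and threshold as the custom-internal-api-key rule.
INTERNAL_API_KEY = re.compile(
    r"""INTERNAL_API_KEY[=:]\s*['"]?([A-Za-z0-9_-]{32,})['"]?"""
)
ENTROPY_THRESHOLD = 3.5


def shannon_entropy(value: str) -> float:
    """Shannon entropy of the string in bits per character."""
    if not value:
        return 0.0
    total = len(value)
    counts = Counter(value)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def matches_rule(line: str) -> bool:
    """True if the line matches the regex and the captured group is random enough."""
    match = INTERNAL_API_KEY.search(line)
    if match is None:
        return False
    return shannon_entropy(match.group(1)) > ENTROPY_THRESHOLD
```

A value of 32 identical characters matches the regex but has zero entropy, so it is dropped. A value drawn from 32 distinct characters scores 5 bits per character and is reported.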

Allowlist Management

# Per-rule allowlist: place it directly after the [[rules]] entry it applies to.
# Gitleaks v8.18 (the version pinned in the pre-commit config) accepts one
# [rules.allowlist] table per rule.
[rules.allowlist]
# Allowlist specific files
paths = [
  '''tests/fixtures/''',
  '''docs/examples/''',
]
# Allowlist specific patterns (false positives)
regexes = [
  '''EXAMPLE_KEY''',
  '''PLACEHOLDER''',
  '''sk_test_''', # Stripe test keys are safe
]
# Allowlist specific commits (already remediated)
commits = [
  "abc123def456...",
]
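
Before committing allowlist changes, it can help to dry-run the patterns against known findings to confirm they suppress only what was intended. The sketch below reuses the path and regex entries from the allowlist above and assumes the regexes are tested against the detected secret, which I believe is the Gitleaks default but should be checked against the version in use. is_allowlisted is a hypothetical helper.

```python
import re

ALLOWLIST_PATHS = [re.compile(p) for p in (r"tests/fixtures/", r"docs/examples/")]
ALLOWLIST_REGEXES = [re.compile(p) for p in (r"EXAMPLE_KEY", r"PLACEHOLDER", r"sk_test_")]


def is_allowlisted(secret: str, path: str) -> bool:
    """True if the finding would be suppressed by a path or regex entry."""
    if any(pattern.search(path) for pattern in ALLOWLIST_PATHS):
        return True
    return any(pattern.search(secret) for pattern in ALLOWLIST_REGEXES)
```

Running every true positive from the baseline scan through a check like this is one way to catch an allowlist entry that is too broad.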

Integration

  • Skill: security-audit - Full OWASP security scanning
  • Skill: dependency-security - Vulnerable package detection
  • Hook: pre-commit-hook - Git pre-commit integration
  • Agent: security-specialist - Security review orchestration

Secret Manager Integration

# AWS Secrets Manager
import boto3

def get_secret(secret_name: str) -> str:
client = boto3.client('secretsmanager')
response = client.get_secret_value(SecretId=secret_name)
return response['SecretString']

# GCP Secret Manager
from google.cloud import secretmanager

def get_gcp_secret(project_id: str, secret_id: str) -> str:
client = secretmanager.SecretManagerServiceClient()
name = f"projects/{project_id}/secrets/{secret_id}/versions/latest"
response = client.access_secret_version(request={"name": name})
return response.payload.data.decode("UTF-8")

# HashiCorp Vault
import hvac

def get_vault_secret(path: str) -> dict:
client = hvac.Client(url='https://vault.example.com')
secret = client.secrets.kv.v2.read_secret_version(path=path)
return secret['data']['data']

Troubleshooting

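| Issue                            | Solution                                                                        |
|----------------------------------|---------------------------------------------------------------------------------|
| High false positive rate         | Tune .gitleaks.toml allowlist                                                   |
| Scan too slow                    | Use --no-git for file-only scan                                                 |
| Git history scan fails           | Ensure fetch-depth: 0 in CI                                                     |
| detect-secrets baseline outdated | detect-secrets scan --baseline .secrets.baseline                                |
| Pre-commit hook bypassed         | git commit --no-verify skips local hooks; rely on the CI scan to catch it       |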

Performance Guidelines

  • Incremental scans: < 10 seconds for PR changes
  • Full repo scan: < 2 minutes for 100K LOC
  • History scan: < 10 minutes for 10K commits
  • Memory usage: < 500MB for typical repository

Compliance Mapping

| Requirement   | Control                                     |
|---------------|---------------------------------------------|
| SOC 2 CC6.1   | Pre-commit hooks prevent credential commits |
| PCI-DSS 6.5.3 | Secrets scanning in CI/CD pipeline          |
| HIPAA 164.312 | Audit logs for detected secrets             |
| GDPR Art. 32  | Encryption key management verification      |

Success Output

When this skill completes successfully, you should see:

✅ SKILL COMPLETE: secrets-detection

Completed:
- [x] Detection tools installed and verified (gitleaks, trufflehog, detect-secrets)
- [x] Baseline scan completed with findings classified
- [x] Pre-commit hooks configured and tested
- [x] CI/CD integration deployed
- [x] All true positive secrets rotated

Outputs:
- gitleaks-report.json - Detailed findings report
- .secrets.baseline - Baseline of known findings for detect-secrets (audited false positives)
- .pre-commit-config.yaml - Pre-commit hook configuration
- .gitleaks.toml - Custom detection rules
- context-storage/audit-logs/secrets-scan-YYYY-MM-DD.json

Detection Coverage: 95%+ for common secret types
False Positive Rate: <5% with tuned configuration

Completion Checklist

Before marking this skill as complete, verify:

  • All detection tools installed: gitleaks version && trufflehog --version && detect-secrets --version
  • Baseline scan completed: ls -lh gitleaks-report.json trufflehog-report.json .secrets.baseline
  • Findings reviewed and classified (true/false positives documented)
  • All true positive secrets rotated and verified in cloud consoles
  • .gitignore updated with secrets patterns
  • Pre-commit hooks installed: pre-commit run --all-files (passes)
  • CI/CD pipeline includes secrets scanning step
  • Team notified of new pre-commit requirements
  • Custom .gitleaks.toml rules configured if needed
  • Detection coverage validated on test secrets

Failure Indicators

This skill has FAILED if:

  • ❌ Detection tools fail to install or show version errors
  • ❌ Baseline scan produces no output or crashes
  • ❌ High false positive rate (>20%) without tuning
  • ❌ True positive secrets not rotated before remediation
  • ❌ Commits made with --no-verify reach the remote undetected (no CI scan backs up the pre-commit hooks)
  • ❌ CI/CD pipeline doesn't fail on secret detection
  • ❌ Git history cleanup attempted without team coordination
  • ❌ .gitignore missing critical secret file patterns

When NOT to Use

Do NOT use secrets-detection when:

  • Full security audit needed - Use security-audit skill for comprehensive OWASP scanning
  • Dependency vulnerabilities - Use dependency-security skill for package scanning
  • Runtime secrets management - Use cloud secret managers (AWS Secrets Manager, GCP Secret Manager, Vault)
  • Encryption key generation - Use dedicated cryptographic tools
  • Simple credential lookup - Use manual search if scope is very limited
  • No git repository - Secrets detection relies on git history and file scanning
  • Immediate deployment urgency - Setup requires 1-2 hours minimum for proper configuration

Alternative Approaches:

  • Simple grep for one-time searches: grep -r "AKIA" --include="*.py"
  • Cloud-native scanning: AWS Macie, GCP DLP API for production data
  • Manual code review for small changes (<10 files)

Anti-Patterns (Avoid)

| Anti-Pattern                             | Problem                                                    | Solution                                                      |
|------------------------------------------|------------------------------------------------------------|---------------------------------------------------------------|
| Running scan without classification      | All findings treated equally, wasted remediation effort    | Always classify true/false positives first                    |
| Cleaning code/history before rotating    | Secret stays valid and exploitable while cleanup proceeds  | Rotate FIRST, then remove from code/history                   |
| Skipping .gitignore update               | Secrets reintroduced after cleanup                         | Add comprehensive ignore patterns                             |
| Bypassing pre-commit hooks               | git commit --no-verify becomes habit                       | Enforce hooks, audit bypass usage                             |
| Git history cleanup without coordination | Team force-pull issues, lost work                          | Coordinate with team, use protected branches                  |
| Over-tuning allowlist                    | Real secrets missed                                        | Only allowlist verified false positives                       |
| Ignoring high false positives            | Tool becomes noise, ignored by team                        | Tune configuration to <5% false positive rate                 |
| No CI/CD integration                     | Commits that skip local hooks are never scanned            | Always add scanning to CI pipeline                            |
| Using only one tool                      | Single tool blind spots                                    | Multi-engine approach (gitleaks + trufflehog + detect-secrets) |
| Forgetting test files                    | Test secrets exposed                                       | Scan test directories, use clearly fake values                |

Principles

This skill embodies these CODITECT principles:

  1. Defense in Depth - Multiple detection engines (gitleaks, trufflehog, detect-secrets) prevent single-tool blind spots
  2. Shift Left Security - Pre-commit hooks catch secrets before they reach remote
  3. Automation First - Automated scanning eliminates human error in detection
  4. No Assumptions - Classify every finding, never assume false positive without verification
  5. Clear, Explainable - Detection rules documented in .gitleaks.toml with rationale
  6. Separation of Concerns - Detection (this skill) separate from secret management (cloud providers)
  7. Recycle → Extend → Re-Use - Leverage industry-standard tools, extend with custom rules
  8. Continuous Improvement - Tune allowlists based on false positive analysis

Full Principles: CODITECT-STANDARD-AUTOMATION.md


Status: Production-ready
Detection Rate: 95%+ for common secret types
False Positive Rate: <5% with tuned configuration
Integration: Pre-commit hooks, CI/CD pipelines