Skip to main content

AI-Powered Intelligent Commit-Task Linking

๐Ÿค– Overviewโ€‹

Upgrade from TF-IDF keyword matching to AI semantic understanding using LLM APIs (Anthropic Claude, OpenAI GPT-4).

Why AI is Betterโ€‹

TF-IDF (Current):

Commit: "Add JWT auth system"
Task: "Implement user authentication with JWT"
Match: 72% (keyword overlap)

AI-Powered (Enhanced):

Commit: "Add JWT auth system"
Task: "Implement user authentication with JWT"
Match: 95% โœ…
Reasoning: "Commit implements JWT authentication mentioned in task.
Code changes show token generation and validation middleware."

Key Improvementsโ€‹

  1. Semantic Understanding

    • Understands "Add JWT" = "Implement authentication"
    • Not just keyword matching
  2. Context Awareness

    • Reads code diffs to determine intent
    • Understands multi-file changes
  3. Synonym Handling

    • "Fix bug" = "Resolve issue" = "Debug problem"
    • "Add feature" = "Implement functionality"
  4. Multi-Language

    • Works with any programming language
    • Understands technical jargon
  5. Explanation

    • AI explains WHY commit matches task
    • Helps verify correctness

๐Ÿš€ Setupโ€‹

1. Install Dependenciesโ€‹

cd docs/dashboard-2.0/poc/backend

# Install AI SDK
pip install anthropic # For Claude
# OR
pip install openai # For GPT-4

2. Set API Keyโ€‹

Option A: Environment Variable

# Anthropic Claude (recommended)
export ANTHROPIC_API_KEY="sk-ant-api03-xxx"

# OR OpenAI GPT-4
export OPENAI_API_KEY="sk-xxx"
export AI_PROVIDER="openai"

Option B: .env File

# Create .env file
cat > backend/.env << EOF
ANTHROPIC_API_KEY=sk-ant-api03-xxx
AI_PROVIDER=anthropic
AI_CONFIDENCE_THRESHOLD=0.3
EOF

# Load in Python
pip install python-dotenv

3. Update API to Use AI Linkerโ€‹

Edit backend/api_v2.py:

# Add at top of file
from ai_linker import AITaskLinker

# Replace this line (around line 600):
from linkers import CommitTaskLinker

# With:
try:
from ai_linker import AITaskLinker as CommitTaskLinker
print("โœ… AI-powered linking enabled")
except ImportError:
from linkers import CommitTaskLinker
print("โš ๏ธ Using TF-IDF fallback (no AI)")

That's it! Restart API and AI linking is active.


๐Ÿงช Test AI Linkingโ€‹

CLI Testโ€‹

cd backend

# Set API key
export ANTHROPIC_API_KEY="sk-ant-api03-xxx"

# Test commit matching
python ai_linker.py "feat: Implement JWT authentication with RS256 signing"

Output:

๐Ÿค– Analyzing commit: feat: Implement JWT authentication with RS256 signing

โœ… Found 2 matching tasks:
Task #1091: 95.0% confidence
Reasoning: Commit directly implements JWT authentication system mentioned in task title.
The RS256 signing detail indicates production-ready implementation.

Task #1093: 78.0% confidence
Reasoning: JWT authentication is part of the security infrastructure setup mentioned in this task.

API Testโ€‹

# Send commit with AI linking
curl -X POST http://localhost:5001/api/v1/git/commits \
-H "Content-Type: application/json" \
-d '{
"sha": "ai-test-001",
"message": "feat: Add JWT authentication system\n\nImplemented token generation with RS256 signing and validation middleware.",
"author": "Hal Casteel",
"timestamp": "2025-11-28T01:00:00Z",
"repo_name": "dashboard-2.0-poc"
}' | jq '.'

Response:

{
"commit_sha": "ai-test-001",
"links_created": 2,
"links": [
{
"task_id": 1091,
"task_title": "Implement user authentication system with JWT",
"confidence": 0.95,
"link_type": "ai_inferred",
"evidence": "Commit implements JWT authentication mentioned in task",
"ai_provider": "anthropic"
},
{
"task_id": 1093,
"task_title": "Configure security infrastructure",
"confidence": 0.78,
"link_type": "ai_inferred",
"evidence": "JWT is part of security setup",
"ai_provider": "anthropic"
}
]
}

๐ŸŽฏ Advanced Featuresโ€‹

1. Code Diff Analysisโ€‹

AI can read code diffs to better understand commits:

from ai_linker import AITaskLinker

linker = AITaskLinker()

# Get git diff
diff = """
diff --git a/auth.py b/auth.py
+import jwt
+def generate_token(user_id):
+ return jwt.encode({'user_id': user_id}, secret, algorithm='RS256')
"""

links = linker.link_commit_to_tasks(
commit_sha="abc123",
commit_message="feat: Add authentication",
commit_diff=diff # AI analyzes code changes!
)

AI sees actual code and understands:

  • JWT library imported
  • Token generation implemented
  • RS256 algorithm used
  • Matches task more accurately

2. AI-Generated Commit Messagesโ€‹

Let AI write commit messages for you:

from ai_linker import AITaskLinker

linker = AITaskLinker()

# Generate commit message from task
message = linker.generate_commit_message(
task_id=1091,
code_diff=git_diff # Optional: AI reads your changes
)

print(message)

Output:

feat(auth): Implement JWT authentication #1091

Added RS256 token signing and validation middleware for secure user authentication.

Features:
- Token generation with configurable expiry (default 24h)
- Validation middleware for protected routes
- Refresh token support for extended sessions
- Proper error handling for invalid/expired tokens

Closes #1091

Use in git hook:

# .git/hooks/prepare-commit-msg
TASK_ID=$(git branch --show-current | grep -oE '[0-9]+')
if [ -n "$TASK_ID" ]; then
# Let AI generate message
AI_MESSAGE=$(python backend/ai_linker.py --generate $TASK_ID)
echo "$AI_MESSAGE" > "$1"
fi

3. AI Task Suggestionsโ€‹

AI suggests creating tasks for commits without references:

suggestion = linker.suggest_task_from_commit(
commit_message="fix: Resolve database timeout issues",
commit_diff=diff
)

if suggestion['should_create_task']:
task = suggestion['suggested_task']
print(f"Suggested task: {task['title']}")
print(f"Description: {task['description']}")
print(f"Complexity: {task['complexity']}")
print(f"Priority: {task['priority']}")

Output:

{
"should_create_task": true,
"suggested_task": {
"title": "Fix database connection pool timeout",
"description": "Connection pool exhaustion causing timeouts under load. Need to increase pool size and add connection recycling.",
"complexity": "M",
"priority": "P1",
"reasoning": "This is a production bug affecting system reliability"
}
}

โš™๏ธ Configurationโ€‹

Environment Variablesโ€‹

# AI Provider (anthropic or openai)
AI_PROVIDER=anthropic

# API Keys
ANTHROPIC_API_KEY=sk-ant-api03-xxx
OPENAI_API_KEY=sk-xxx

# Confidence threshold (0.0-1.0)
# Lower = more matches, Higher = stricter
AI_CONFIDENCE_THRESHOLD=0.3

# Model selection
ANTHROPIC_MODEL=claude-3-5-sonnet-20241022
OPENAI_MODEL=gpt-4-turbo-preview

# Enable/disable diff analysis (costs more tokens)
AI_ANALYZE_DIFFS=true

# Max tasks to analyze per commit
AI_MAX_CANDIDATE_TASKS=100

Cost Optimizationโ€‹

Token Usage:

  • Without diff: 500-1000 tokens per commit ($0.003-$0.006)
  • With diff: 2000-4000 tokens per commit ($0.012-$0.024)

Best practices:

  1. Only analyze diffs for important commits (main branch)
  2. Cache AI results to avoid re-analyzing same commit
  3. Use cheaper models for simple matches (claude-3-haiku)
  4. Batch commits to reduce API calls

Example cost:

100 commits/day ร— $0.006 = $0.60/day = $18/month

Much cheaper than manual task linking!


๐Ÿ“Š Comparison: TF-IDF vs AIโ€‹

Example 1: Synonymsโ€‹

Commit: "Refactor authentication logic" Task: "Implement user login system"

MethodMatchConfidence
TF-IDFโŒ No match0% (no keyword overlap)
AIโœ… Matched82% (understands synonyms)

AI Reasoning: "Refactoring authentication logic is part of implementing the login system"

Example 2: Contextโ€‹

Commit: "Add RS256 key generation" Task: "Secure JWT implementation"

MethodMatchConfidence
TF-IDFโŒ No match15% (weak keyword match)
AIโœ… Matched88% (understands context)

AI Reasoning: "RS256 is the secure algorithm for JWT token signing mentioned in task requirements"

Example 3: Multi-Taskโ€‹

Commit: "Update authentication and add rate limiting" Tasks:

  • #1091: "Implement authentication"
  • #1095: "Add API rate limiting"
MethodMatchesConfidence
TF-IDF1 task#1091: 65%
AI2 tasks#1091: 92%, #1095: 87%

AI Reasoning: "Commit addresses both authentication (primary) and rate limiting (secondary) tasks"


๐Ÿ”ง Troubleshootingโ€‹

AI Not Workingโ€‹

Problem: API using TF-IDF fallback

Solutions:

  1. Check API key: echo $ANTHROPIC_API_KEY
  2. Verify SDK installed: pip show anthropic
  3. Check API logs for errors
  4. Test CLI: python ai_linker.py "test message"

Low Confidence Scoresโ€‹

Problem: AI giving low confidence (<0.5) for obvious matches

Solutions:

  1. Lower threshold: AI_CONFIDENCE_THRESHOLD=0.2
  2. Improve task descriptions (more detail = better matching)
  3. Include code diffs for better context
  4. Use more specific commit messages

High API Costsโ€‹

Problem: Token usage too high

Solutions:

  1. Disable diff analysis: AI_ANALYZE_DIFFS=false
  2. Use cheaper model: ANTHROPIC_MODEL=claude-3-haiku-20240307
  3. Reduce candidates: AI_MAX_CANDIDATE_TASKS=50
  4. Only use AI for main branch commits

๐Ÿš€ Next Stepsโ€‹

  1. Set API key - Get Anthropic or OpenAI key
  2. Install dependencies - pip install anthropic
  3. Update api_v2.py - Import AITaskLinker
  4. Restart API - API will use AI automatically
  5. Test it - Make a commit and watch AI link it!

๐Ÿ’ก Pro Tipsโ€‹

1. Branch-Based AI Strategyโ€‹

# .git/hooks/post-commit
BRANCH=$(git branch --show-current)

if [[ "$BRANCH" == "main" ]]; then
# Use AI for main branch (important commits)
curl -X POST "$API_URL/api/v1/git/commits?use_ai=true" ...
else
# Use TF-IDF for feature branches (cheaper)
curl -X POST "$API_URL/api/v1/git/commits?use_ai=false" ...
fi

2. Commit Message Templatesโ€‹

Train your team to write AI-friendly commits:

# Good for AI
feat(auth): Implement JWT token validation middleware

Added middleware to validate JWT tokens on protected routes.
Uses RS256 algorithm with public key verification.

# Bad for AI
fix stuff

updated files

Dashboard shows AI reasoning - review periodically:

  • Click task โ†’ see linked commits
  • Check "AI Reasoning" for each link
  • Unlink if AI was wrong
  • Improves over time with feedback

API Key: Add to .env file Cost: ~$18/month for 100 commits/day Accuracy: 85-95% vs 60-70% with TF-IDF

Ready to make your dashboard intelligently track progress! ๐Ÿง โœจ