Smart Merge Skill
When to Use This Skill
Use this skill when you need to consolidate similar or duplicate documents, resolve conflicts between versions of the same file, or merge overlapping documentation.
How to Use This Skill
- Review the patterns and examples below
- Apply the relevant patterns to your implementation
- Follow the best practices outlined in this skill
Name: smart-merge Version: 1.0.0 Category: Documentation Status: Active
Overview
Intelligent document merging patterns for handling similar files, version conflicts, and documentation consolidation with LLM-assisted analysis.
Activation
# Activate skill
python3 scripts/update-component-activation.py activate skill smart-merge --reason "Document merge workflows"
Capabilities
1. Document Similarity Analysis
Analyze two documents for structural and content similarity:
python3 scripts/smart-merge.py analyze FILE_A FILE_B [--json]
Output includes:
- Line counts and section counts
- SHA-256 content hashes
- Identical vs different sections
- Similarity score (0-100%)
- Merge recommendation
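The analysis output listed above can be approximated with a short sketch. This is not the code in scripts/smart-merge.py. The function names (`extract_sections`, `section_hash`, `compare`) are hypothetical, and the scoring formula (share of section titles whose bodies hash identically) is an assumption chosen for illustration.

```python
import hashlib
import re

# ATX-style markdown headings only ("# Title" through "###### Title").
# Headings inside fenced code blocks are not special-cased in this sketch.
HEADING_RE = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")


def extract_sections(text: str) -> dict[str, str]:
    """Split markdown into {heading title: body text}."""
    raw: dict[str, list[str]] = {"": []}  # "" holds text before the first heading
    current = ""
    for line in text.splitlines():
        match = HEADING_RE.match(line)
        if match:
            current = match.group(2)
            raw.setdefault(current, [])
        else:
            raw[current].append(line)
    sections = {}
    for title, lines in raw.items():
        body = "\n".join(lines).strip()
        if title or body:  # drop an empty preamble
            sections[title] = body
    return sections


def section_hash(body: str) -> str:
    """SHA-256 hex digest of a section body."""
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def compare(text_a: str, text_b: str) -> dict:
    """Categorize sections and compute a 0-100 similarity score (assumed formula)."""
    a, b = extract_sections(text_a), extract_sections(text_b)
    shared = a.keys() & b.keys()
    identical = sorted(t for t in shared if section_hash(a[t]) == section_hash(b[t]))
    different = sorted(t for t in shared if t not in identical)
    total = len(a.keys() | b.keys())
    return {
        "identical": identical,
        "different": different,
        "unique_a": sorted(a.keys() - b.keys()),
        "unique_b": sorted(b.keys() - a.keys()),
        "similarity": 100.0 * len(identical) / total if total else 100.0,
    }
```

Hashing each section body lets identical sections be detected without a character-by-character diff; only the sections that land in `different` need a merge strategy.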
2. Smart Merging
Merge documents with conflict resolution:
python3 scripts/smart-merge.py merge FILE_A FILE_B \
-o OUTPUT \
--strategy {smart|prefer_a|prefer_b|longer} \
[--llm] \
[--dry-run]
Strategies:
| Strategy | Behavior |
|---|---|
| smart | LLM-powered analysis for best merge |
| prefer_a | Always use first file's content |
| prefer_b | Always use second file's content |
| longer | Use longer version of conflicting sections |
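As a rough illustration of how the four strategies might resolve one conflicting section, consider the sketch below. The function `resolve_conflict` and its `llm_merge` callback are hypothetical names, not part of the script's documented interface. Two behaviors are assumptions: ties under `longer` keep the first file's content, and `smart` raises an error when no LLM callable is supplied.

```python
from typing import Callable, Optional


def resolve_conflict(
    content_a: str,
    content_b: str,
    strategy: str,
    llm_merge: Optional[Callable[[str, str], str]] = None,
) -> str:
    """Pick or build the merged text for one section that differs between files."""
    if strategy == "prefer_a":
        return content_a
    if strategy == "prefer_b":
        return content_b
    if strategy == "longer":
        # Assumption: on equal length, keep the first file's version.
        return content_a if len(content_a) >= len(content_b) else content_b
    if strategy == "smart":
        if llm_merge is None:
            # Assumption: fail loudly rather than silently falling back.
            raise RuntimeError("smart strategy requires an LLM merge callable")
        return llm_merge(content_a, content_b)
    raise ValueError(f"unknown strategy: {strategy}")
```

Passing the LLM step in as a callable keeps the deterministic strategies testable without network access.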
3. Duplicate Detection
Find similar files across a directory:
python3 scripts/smart-merge.py find DIRECTORY \
--pattern "*.md" \
--threshold 0.5 \
[--json]
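A minimal sketch of pairwise duplicate detection follows. It substitutes `difflib.SequenceMatcher` from the Python standard library for whatever scoring the script actually uses, so the ratios will not match the tool's output. The function name `find_similar`, the recursive search, and the `similarity` key are assumptions; `file_a` and `file_b` mirror the keys read by the batch cleanup pattern in this document.

```python
import itertools
from difflib import SequenceMatcher
from pathlib import Path


def find_similar(directory: str, pattern: str = "*.md", threshold: float = 0.5) -> list[dict]:
    """Return file pairs whose text similarity ratio (0.0-1.0) meets the threshold."""
    files = sorted(Path(directory).rglob(pattern))  # assumption: search recursively
    texts = {p: p.read_text(encoding="utf-8", errors="replace") for p in files}
    pairs = []
    for a, b in itertools.combinations(files, 2):
        ratio = SequenceMatcher(None, texts[a], texts[b]).ratio()
        if ratio >= threshold:
            pairs.append({"file_a": str(a), "file_b": str(b), "similarity": round(ratio, 3)})
    # Most similar pairs first, so likely duplicates are reviewed before weaker matches.
    return sorted(pairs, key=lambda p: p["similarity"], reverse=True)
```

Comparing every pair is quadratic in the number of files, which is acceptable for a documentation tree but may need pre-filtering (for example by file name or size) on large repositories.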
Merge Algorithm
┌─────────────────────────────────────────────────────────────┐
│ SMART MERGE WORKFLOW │
└─────────────────────────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────┐
│ 1. EXTRACTION │
│ - Parse markdown structure │
│ - Extract sections by heading │
│ - Hash each section's content │
│ - Extract metadata (version, date, etc) │
└──────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────┐
│ 2. COMPARISON │
│ - Match sections by title │
│ - Categorize: identical/different/unique │
│ - Calculate similarity score │
└──────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────┐
│ 3. CONFLICT RESOLUTION │
│ - Unique sections: Include all │
│ - Identical: Keep one copy │
│ - Different: Apply merge strategy │
│ - smart → LLM analysis │
│ - prefer_a/b → Keep specified │
│ - longer → Keep larger content │
└──────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────┐
│ 4. METADATA RECONCILIATION │
│ - Version: Use highest │
│ - Date: Use most recent │
│ - Status: Prefer Implemented > Accepted │
│ - Author: Merge unique authors │
└──────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────┐
│ 5. OUTPUT GENERATION │
│ - Assemble merged document │
│ - Add merge metadata footer │
│ - Write to output file │
└──────────────────────────────────────────┘
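Step 4 of the workflow above (metadata reconciliation) might look like the following sketch. The dictionary keys, the `reconcile` and `version_key` names, and the input formats (dotted numeric versions, ISO 8601 dates) are assumptions. Only the Implemented > Accepted ordering comes from the diagram; any other status is ranked lowest here.

```python
from datetime import date

# Ordering taken from the workflow diagram; unknown statuses rank 0 (assumption).
STATUS_RANK = {"Implemented": 2, "Accepted": 1}


def version_key(version: str) -> tuple[int, ...]:
    """Turn '1.10.0' into (1, 10, 0) so versions compare numerically, not as text."""
    return tuple(int(part) for part in version.split("."))


def reconcile(meta_a: dict, meta_b: dict) -> dict:
    """Combine metadata from two documents following the four rules in step 4."""
    authors: list[str] = []
    for name in meta_a.get("authors", []) + meta_b.get("authors", []):
        if name not in authors:  # keep first-seen order, drop repeats
            authors.append(name)
    newest = max(date.fromisoformat(meta_a["date"]), date.fromisoformat(meta_b["date"]))
    return {
        "version": max(meta_a["version"], meta_b["version"], key=version_key),
        "date": newest.isoformat(),
        "status": max(meta_a["status"], meta_b["status"], key=lambda s: STATUS_RANK.get(s, 0)),
        "authors": authors,
    }
```

Comparing versions as integer tuples matters: a plain string comparison would rank "1.9.0" above "1.10.0".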
LLM Integration
When the --llm flag is used with the smart strategy:
# Requires ANTHROPIC_API_KEY environment variable
export ANTHROPIC_API_KEY="sk-..."
# LLM analyzes conflicting sections and returns:
{
"merged_content": "Best merged version...",
"reasoning": "Why this merge was chosen",
"confidence": 0.95
}
Model used: claude-sonnet-4-20250514 (configurable)
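A caller that receives the JSON shape shown above would likely want to validate it before writing merged content. The sketch below makes no API call; it only parses a response string. The function name `parse_llm_merge` and the `min_confidence` cutoff of 0.8 are assumptions, not documented behavior of the script.

```python
import json


def parse_llm_merge(raw: str, min_confidence: float = 0.8) -> tuple[str, bool]:
    """Validate an LLM merge response; return (merged_content, trusted)."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = {"merged_content", "reasoning", "confidence"} - data.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    confidence = float(data["confidence"])
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence out of range: {confidence}")
    # Assumption: merges below the cutoff are flagged for human review, not rejected.
    return data["merged_content"], confidence >= min_confidence
```

Returning a `trusted` flag instead of raising on low confidence lets the caller still show the proposed merge to a reviewer.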
Best Practices
When to Merge
| Situation | Action |
|---|---|
| Files have same name in different locations | Analyze similarity first |
| Similarity > 80% | Safe to auto-merge |
| Similarity 50-80% | Review merge output |
| Similarity < 50% | May be different documents |
| Both have unique sections | Always merge to preserve content |
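The similarity thresholds in the table above translate directly into a small helper. The function name `recommend` and the returned strings are hypothetical. The sketch treats exactly 80% as "review" and exactly 50% as "review", following the table's ranges, and leaves out the unique-sections row, which depends on section analysis rather than the score.

```python
def recommend(similarity: float) -> str:
    """Map a 0-100 similarity score to the action suggested by the table."""
    if not 0 <= similarity <= 100:
        raise ValueError(f"similarity must be within 0-100, got {similarity}")
    if similarity > 80:
        return "auto-merge"
    if similarity >= 50:
        return "review merge output"
    return "may be different documents"
```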
Pre-Merge Checklist
- Backup originals (git history counts)
- Run analysis first (--dry-run)
- Review recommendation
- Choose appropriate strategy
- Verify merged output
Post-Merge Actions
- Delete redundant file (after verification)
- Update references to point to merged file
- Commit with descriptive message
- Update any indexes/registries
Patterns
Pattern 1: ADR Consolidation
# Find all ADR duplicates
python3 scripts/smart-merge.py find docs/ --pattern "ADR*.md" --threshold 0.5
# Analyze specific pair
python3 scripts/smart-merge.py analyze \
docs/draft/ADR-001-feature.md \
docs/03-architecture/adrs/ADR-001-feature.md
# Merge with LLM assistance
python3 scripts/smart-merge.py merge \
docs/draft/ADR-001-feature.md \
docs/03-architecture/adrs/ADR-001-feature.md \
-o docs/03-architecture/adrs/ADR-001-feature.md \
--strategy smart --llm
# Clean up old version (after verification)
git rm docs/draft/ADR-001-feature.md
git commit -m "docs: Consolidate ADR-001-feature into canonical location"
Pattern 2: Documentation Version Merge
# When you have v1 and v2 of same doc
python3 scripts/smart-merge.py merge \
docs/guide-v1.md \
docs/guide-v2.md \
-o docs/guide.md \
--strategy longer
# Verify and clean up
git add docs/guide.md
git rm docs/guide-v1.md docs/guide-v2.md
git commit -m "docs: Merge guide versions"
Pattern 3: Batch Duplicate Cleanup
# Find all duplicates
python3 scripts/smart-merge.py find . --pattern "*.md" --threshold 0.8 --json > duplicates.json
# Process each pair (sketch for automation; assumes duplicates.json is an array of
# objects with file_a and file_b keys). file_b is removed only if the merge succeeds.
for pair in $(jq -r '.[] | @base64' duplicates.json); do
  file_a=$(echo "$pair" | base64 -d | jq -r '.file_a')
  file_b=$(echo "$pair" | base64 -d | jq -r '.file_b')
  python3 scripts/smart-merge.py merge "$file_a" "$file_b" -o "$file_a" --strategy longer \
    && rm "$file_b"
done
Integration
With /cx /cxq
# Find documents mentioning same topic
/cxq "export dedup system" --files
# May reveal duplicates to merge
With git hooks
# Pre-commit hook can detect similar files being added
# See hooks/smart-merge.md for configuration
With document-merger agent
Task(
subagent_type="document-merger",
prompt="Find and consolidate all duplicate ADRs in docs/"
)
Troubleshooting
| Issue | Solution |
|---|---|
| "LLM not available" | Set ANTHROPIC_API_KEY or use --strategy longer |
| "No sections found" | File may not use markdown headings |
| "Binary file" | Skip - smart-merge is for text only |
| Low similarity despite same name | Check if files are actually related |
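For the "Binary file" row above, a pre-check can skip non-text inputs before they reach the merge step. This is a common heuristic, not the script's documented detection logic. The name `looks_binary` and the 8192-byte sample size are assumptions.

```python
import codecs


def looks_binary(path: str, sample_size: int = 8192) -> bool:
    """Heuristic: treat a file as binary if its first bytes contain NUL or are not UTF-8."""
    with open(path, "rb") as handle:
        chunk = handle.read(sample_size)
    if b"\x00" in chunk:
        return True
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        # final=False tolerates a multi-byte character cut off at the sample boundary.
        decoder.decode(chunk, final=False)
    except UnicodeDecodeError:
        return True
    return False
```

Files in other text encodings (for example UTF-16) would be reported as binary by this check, so it errs toward skipping rather than merging.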
Success Output
When successfully applying smart merge patterns:
✅ SKILL COMPLETE: smart-merge
Completed:
- [x] Document similarity analysis performed (SHA-256 hashes, section comparison)
- [x] Similarity score calculated (0-100%) with merge recommendation
- [x] Merge strategy selected (smart/prefer_a/prefer_b/longer)
- [x] Conflict resolution applied to different sections
- [x] Metadata reconciliation completed (version, date, status, authors)
- [x] Merged document generated with merge metadata footer
Outputs:
- Similarity analysis report (if --dry-run or analyze command)
- Merged document file at specified output path
- Merge metadata footer documenting sources and strategy used
- Duplicate detection report (if find command used)
Completion Checklist
Before marking this skill as complete, verify:
- Document similarity analysis completed (hashes, section counts, similarity score)
- Merge strategy appropriate for similarity level (>80% auto-merge, 50-80% review, <50% manual)
- Unique sections from both files included in merge
- Identical sections deduplicated (only one copy kept)
- Different sections resolved per strategy (smart/prefer_a/prefer_b/longer)
- Metadata reconciliation applied (highest version, most recent date, preferred status)
- Merge metadata footer added to output documenting sources
- Output file written to correct path
- Original files preserved (not deleted) until verification
Failure Indicators
This skill has FAILED if:
- ❌ Similarity analysis returns 0% for clearly related documents (hash mismatch)
- ❌ Merge strategy "smart" used without LLM available (ANTHROPIC_API_KEY not set)
- ❌ Unique sections from one file lost in merge
- ❌ Identical sections duplicated instead of deduplicated
- ❌ Metadata not reconciled (e.g., older version number selected)
- ❌ Output file missing merge metadata footer
- ❌ Binary file passed to smart-merge (text-only tool)
- ❌ Original files deleted before merge verification
When NOT to Use
Do NOT use this skill when:
- Files have completely different content (similarity <30%)
- Binary files (images, PDFs, compiled code)
- Files with same name but unrelated purpose
- Structural merge needed (code with git conflicts)
- Real-time collaborative editing (use operational transforms)
Use alternatives instead:
- For unrelated files → Keep separate, do not merge
- For binary files → Use binary diff tools or manual review
- For git conflicts → Use git merge with conflict resolution
- For code merging → Use semantic merge tools (e.g., SemanticMerge)
- For collaborative editing → Use CRDTs or operational transforms
Anti-Patterns (Avoid)
| Anti-Pattern | Problem | Solution |
|---|---|---|
| Auto-merging low similarity (<50%) | Loses context, creates confusion | Review merge output or keep separate |
| Using --llm without API key | Fails with error, blocks workflow | Set ANTHROPIC_API_KEY or use "longer" strategy |
| Deleting originals before verification | Can't recover if merge incorrect | Verify merged output first |
| Ignoring merge recommendation | May merge unrelated files | Follow similarity-based recommendations |
| No section heading structure | Smart-merge can't extract sections | Ensure markdown files use proper headings |
| Batch merging without review | Cascading errors across files | Review each merge, especially <80% similarity |
Principles
This skill embodies CODITECT automation principles:
- #1 Recycle → Extend → Re-Use → Create - Reuses existing document sections instead of duplicating content
- #4 Keep It Simple - Provides clear merge strategies (smart/prefer_a/prefer_b/longer)
- #5 Eliminate Ambiguity - Similarity score and recommendation eliminate guesswork
- #6 Clear, Understandable, Explainable - Merge metadata footer documents exactly what was merged and how
- #8 No Assumptions - Analyzes similarity first, recommends strategy, requires verification
- #9 Research When in Doubt - LLM integration for "smart" strategy resolves ambiguous conflicts
Full Standard: CODITECT-STANDARD-AUTOMATION.md
Related Components
- Script: scripts/smart-merge.py
- Agent: agents/document-merger.md
- Command: commands/smart-merge.md
- Hook: hooks/smart-merge.md
Author: CODITECT Core Team Created: 2025-12-11