Smart Merge Skill

When to Use This Skill

Use this skill when you need to merge similar documents, resolve version conflicts between copies of a file, or consolidate duplicated documentation.

How to Use This Skill

  1. Review the patterns and examples below
  2. Apply the relevant patterns to your implementation
  3. Follow the best practices outlined in this skill

Name: smart-merge
Version: 1.0.0
Category: Documentation
Status: Active

Overview

Intelligent document merging patterns for handling similar files, version conflicts, and documentation consolidation with LLM-assisted analysis.

Activation

# Activate skill
python3 scripts/update-component-activation.py activate skill smart-merge --reason "Document merge workflows"

Capabilities

1. Document Similarity Analysis

Analyze two documents for structural and content similarity:

python3 scripts/smart-merge.py analyze FILE_A FILE_B [--json]

Output includes:

  • Line counts and section counts
  • SHA-256 content hashes
  • Identical vs different sections
  • Similarity score (0-100%)
  • Merge recommendation
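
To make the output fields above concrete, here is a minimal sketch of section hashing and scoring. It is not the code in scripts/smart-merge.py: the helper names (`split_sections`, `analyze_texts`) are hypothetical, and the score formula (identical sections divided by all distinct section titles) is an assumption chosen for illustration.

```python
import hashlib
import re

# ATX-style markdown heading, e.g. "## Usage"
HEADING = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")


def split_sections(text: str) -> dict[str, str]:
    """Map each heading title to the body text that follows it.

    Simplification: text before the first heading is ignored.
    """
    sections: dict[str, list[str]] = {}
    current = None
    for line in text.splitlines():
        match = HEADING.match(line)
        if match:
            current = match.group(2)
            sections.setdefault(current, [])
        elif current is not None:
            sections[current].append(line)
    return {title: "\n".join(body).strip() for title, body in sections.items()}


def sha256(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def analyze_texts(text_a: str, text_b: str) -> dict:
    """Compare two documents section by section using content hashes."""
    hashes_a = {t: sha256(b) for t, b in split_sections(text_a).items()}
    hashes_b = {t: sha256(b) for t, b in split_sections(text_b).items()}
    shared = hashes_a.keys() & hashes_b.keys()
    all_titles = hashes_a.keys() | hashes_b.keys()
    identical = sorted(t for t in shared if hashes_a[t] == hashes_b[t])
    different = sorted(t for t in shared if hashes_a[t] != hashes_b[t])
    # Assumed scoring rule: share of identical sections among all titles.
    score = 100.0 * len(identical) / len(all_titles) if all_titles else 100.0
    return {
        "sections_a": len(hashes_a),
        "sections_b": len(hashes_b),
        "identical": identical,
        "different": different,
        "similarity": score,
    }


report = analyze_texts(
    "# Intro\nHello\n## Usage\nRun it\n",
    "# Intro\nHello\n## Usage\nRun it now\n## Extra\nMore\n",
)
print(report["identical"], report["different"], round(report["similarity"], 1))
```

Hashing each section body keeps the comparison cheap: two sections are identical exactly when their hashes match, and only the differing ones need a closer look.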

2. Smart Merging

Merge documents with conflict resolution:

python3 scripts/smart-merge.py merge FILE_A FILE_B \
-o OUTPUT \
--strategy {smart|prefer_a|prefer_b|longer} \
[--llm] \
[--dry-run]

Strategies:

| Strategy | Behavior |
|----------|----------|
| smart | LLM-powered analysis for best merge |
| prefer_a | Always use first file's content |
| prefer_b | Always use second file's content |
| longer | Use longer version of conflicting sections |
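
The four strategies can be pictured as one dispatch function applied to each conflicting section. The sketch below is illustrative only: `resolve_conflict` and the `llm_merge` callable are hypothetical names, and two details are assumptions: a tie under `longer` keeps the first file's content, and `smart` raises an error when no LLM is supplied.

```python
from typing import Callable, Optional


def resolve_conflict(
    content_a: str,
    content_b: str,
    strategy: str,
    llm_merge: Optional[Callable[[str, str], str]] = None,
) -> str:
    """Pick or build the merged text for one conflicting section."""
    if strategy == "prefer_a":
        return content_a
    if strategy == "prefer_b":
        return content_b
    if strategy == "longer":
        # Assumption: on equal length, keep the first file's version.
        return content_a if len(content_a) >= len(content_b) else content_b
    if strategy == "smart":
        if llm_merge is None:
            # Assumption: fail loudly rather than silently fall back.
            raise RuntimeError("smart strategy requires an LLM merge function")
        return llm_merge(content_a, content_b)
    raise ValueError(f"unknown strategy: {strategy}")


# Usage with a stand-in for the LLM call
print(resolve_conflict("short", "a longer version", "longer"))
print(resolve_conflict("x", "y", "smart", llm_merge=lambda a, b: a + "\n" + b))
```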

3. Duplicate Detection

Find similar files across a directory:

python3 scripts/smart-merge.py find DIRECTORY \
--pattern "*.md" \
--threshold 0.5 \
[--json]
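
As a rough picture of what a directory scan like this involves, the sketch below compares every pair of matching files and keeps pairs at or above the threshold. It is an assumption-laden stand-in, not the script's implementation: `find_similar` is a hypothetical name, similarity comes from the standard library's `difflib.SequenceMatcher` rather than section hashes, and the `file_a`/`file_b` keys simply mirror the ones used in Pattern 3.

```python
import difflib
import itertools
from pathlib import Path


def find_similar(directory: str, pattern: str = "*.md", threshold: float = 0.5) -> list[dict]:
    """Return file pairs whose text similarity ratio is >= threshold (0.0-1.0)."""
    files = sorted(p for p in Path(directory).rglob(pattern) if p.is_file())
    texts = {p: p.read_text(encoding="utf-8", errors="replace") for p in files}
    pairs = []
    # Pairwise comparison is O(n^2); acceptable for a docs tree, slow for huge repos.
    for file_a, file_b in itertools.combinations(files, 2):
        ratio = difflib.SequenceMatcher(None, texts[file_a], texts[file_b]).ratio()
        if ratio >= threshold:
            pairs.append(
                {"file_a": str(file_a), "file_b": str(file_b), "similarity": round(ratio, 3)}
            )
    return sorted(pairs, key=lambda pair: pair["similarity"], reverse=True)


if __name__ == "__main__":
    for pair in find_similar(".", "*.md", 0.5):
        print(pair["similarity"], pair["file_a"], pair["file_b"])
```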

Merge Algorithm

┌─────────────────────────────────────────────────────────────┐
│ SMART MERGE WORKFLOW │
└─────────────────────────────────────────────────────────────┘


┌──────────────────────────────────────────┐
│ 1. EXTRACTION │
│ - Parse markdown structure │
│ - Extract sections by heading │
│ - Hash each section's content │
│ - Extract metadata (version, date, etc) │
└──────────────────────────────────────────┘


┌──────────────────────────────────────────┐
│ 2. COMPARISON │
│ - Match sections by title │
│ - Categorize: identical/different/unique │
│ - Calculate similarity score │
└──────────────────────────────────────────┘


┌──────────────────────────────────────────┐
│ 3. CONFLICT RESOLUTION │
│ - Unique sections: Include all │
│ - Identical: Keep one copy │
│ - Different: Apply merge strategy │
│ - smart → LLM analysis │
│ - prefer_a/b → Keep specified │
│ - longer → Keep larger content │
└──────────────────────────────────────────┘


┌──────────────────────────────────────────┐
│ 4. METADATA RECONCILIATION │
│ - Version: Use highest │
│ - Date: Use most recent │
│ - Status: Prefer Implemented > Accepted │
│ - Author: Merge unique authors │
└──────────────────────────────────────────┘


┌──────────────────────────────────────────┐
│ 5. OUTPUT GENERATION │
│ - Assemble merged document │
│ - Add merge metadata footer │
│ - Write to output file │
└──────────────────────────────────────────┘
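
Step 4 of the workflow is the easiest to get subtly wrong, for example by comparing version strings alphabetically. The sketch below shows one way the four rules could be applied. The `reconcile` function, the metadata dictionary layout, dotted numeric versions, and ISO dates are all assumptions; statuses other than Implemented and Accepted are ranked lowest because the workflow gives no order for them.

```python
from datetime import date

# Only "Implemented > Accepted" is stated; any other status ranks 0 (assumption).
STATUS_RANK = {"Implemented": 2, "Accepted": 1}


def version_key(version: str) -> tuple[int, ...]:
    """Turn '1.10.0' into (1, 10, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def reconcile(meta_a: dict, meta_b: dict) -> dict:
    """Combine two metadata dicts: highest version, latest date,
    preferred status, and the union of authors in first-seen order."""
    newest = max(date.fromisoformat(meta_a["date"]), date.fromisoformat(meta_b["date"]))
    return {
        "version": max(meta_a["version"], meta_b["version"], key=version_key),
        "date": newest.isoformat(),
        "status": max(
            meta_a["status"], meta_b["status"], key=lambda s: STATUS_RANK.get(s, 0)
        ),
        "authors": list(dict.fromkeys(meta_a["authors"] + meta_b["authors"])),
    }


merged = reconcile(
    {"version": "1.2.0", "date": "2025-01-10", "status": "Accepted", "authors": ["Ann", "Bob"]},
    {"version": "1.10.0", "date": "2024-12-01", "status": "Implemented", "authors": ["Bob", "Cy"]},
)
print(merged)
```

Note that a plain string comparison would rank "1.2.0" above "1.10.0"; parsing the parts as integers avoids selecting the older version.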

LLM Integration

When the --llm flag is used with the smart strategy:

# Requires ANTHROPIC_API_KEY environment variable
export ANTHROPIC_API_KEY="sk-..."

# LLM analyzes conflicting sections and returns:
{
"merged_content": "Best merged version...",
"reasoning": "Why this merge was chosen",
"confidence": 0.95
}

Model used: claude-sonnet-4-20250514 (configurable)
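
Because the LLM reply is free-form text until it is parsed, it is worth validating before the merged content is written anywhere. The sketch below checks the three fields shown above. The `parse_llm_result` name and the 0.8 confidence cutoff are assumptions for illustration; the source does not say how the script treats low-confidence results.

```python
import json
from typing import Optional


def parse_llm_result(raw: str, min_confidence: float = 0.8) -> Optional[dict]:
    """Return the parsed result, or None if it is malformed or below the cutoff."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    if not isinstance(data.get("merged_content"), str):
        return None
    if not isinstance(data.get("reasoning"), str):
        return None
    confidence = data.get("confidence")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        return None
    if not 0.0 <= confidence <= 1.0 or confidence < min_confidence:
        return None
    return data


reply = '{"merged_content": "Best merged version...", "reasoning": "Why", "confidence": 0.95}'
result = parse_llm_result(reply)
print(result["confidence"] if result else "fall back to another strategy")
```

A caller that receives None could fall back to a deterministic strategy such as longer, or flag the section for manual review.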

Best Practices

When to Merge

| Situation | Action |
|-----------|--------|
| Files have same name in different locations | Analyze similarity first |
| Similarity > 80% | Safe to auto-merge |
| Similarity 50-80% | Review merge output |
| Similarity < 50% | May be different documents |
| Both have unique sections | Always merge to preserve content |
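
The similarity thresholds in this table reduce to a small decision function. This is a sketch with a hypothetical `recommend` helper; treating exactly 80% as "review" and exactly 50% as "review" are assumptions about how the boundaries fall.

```python
def recommend(similarity: float) -> str:
    """Map a similarity score (0-100) to the action suggested in the table."""
    if not 0 <= similarity <= 100:
        raise ValueError("similarity must be between 0 and 100")
    if similarity > 80:
        return "safe to auto-merge"
    if similarity >= 50:
        return "review merge output"
    return "may be different documents"


for score in (92.0, 65.0, 20.0):
    print(score, "->", recommend(score))
```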

Pre-Merge Checklist

  1. Backup originals (git history counts)
  2. Run analysis first (analyze command, or merge with --dry-run)
  3. Review recommendation
  4. Choose appropriate strategy
  5. Verify merged output

Post-Merge Actions

  1. Delete redundant file (after verification)
  2. Update references to point to merged file
  3. Commit with descriptive message
  4. Update any indexes/registries

Patterns

Pattern 1: ADR Consolidation

# Find all ADR duplicates
python3 scripts/smart-merge.py find docs/ --pattern "ADR*.md" --threshold 0.5

# Analyze specific pair
python3 scripts/smart-merge.py analyze \
docs/draft/ADR-001-feature.md \
docs/03-architecture/adrs/ADR-001-feature.md

# Merge with LLM assistance
python3 scripts/smart-merge.py merge \
docs/draft/ADR-001-feature.md \
docs/03-architecture/adrs/ADR-001-feature.md \
-o docs/03-architecture/adrs/ADR-001-feature.md \
--strategy smart --llm

# Clean up old version (after verification)
git rm docs/draft/ADR-001-feature.md
git commit -m "docs: Consolidate ADR-001-feature into canonical location"

Pattern 2: Documentation Version Merge

# When you have v1 and v2 of same doc
python3 scripts/smart-merge.py merge \
docs/guide-v1.md \
docs/guide-v2.md \
-o docs/guide.md \
--strategy longer

# Verify and clean up
git add docs/guide.md
git rm docs/guide-v1.md docs/guide-v2.md
git commit -m "docs: Merge guide versions"

Pattern 3: Batch Duplicate Cleanup

# Find all duplicates
python3 scripts/smart-merge.py find . --pattern "*.md" --threshold 0.8 --json > duplicates.json

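# Process each pair (automation sketch; review duplicates.json before running)
jq -r '.[] | @base64' duplicates.json | while read -r pair; do
  # Decode one base64-encoded JSON object and pull out both paths
  file_a=$(printf '%s' "$pair" | base64 --decode | jq -r '.file_a')
  file_b=$(printf '%s' "$pair" | base64 --decode | jq -r '.file_b')

  # Remove the second file only if the merge succeeded
  python3 scripts/smart-merge.py merge "$file_a" "$file_b" -o "$file_a" --strategy longer \
    && rm "$file_b"
done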

Integration

With /cx /cxq

# Find documents mentioning same topic
/cxq "export dedup system" --files

# May reveal duplicates to merge

With git hooks

# Pre-commit hook can detect similar files being added
# See hooks/smart-merge.md for configuration

With document-merger agent

Task(
subagent_type="document-merger",
prompt="Find and consolidate all duplicate ADRs in docs/"
)

Troubleshooting

| Issue | Solution |
|-------|----------|
| "LLM not available" | Set ANTHROPIC_API_KEY or use --strategy longer |
| "No sections found" | File may not use markdown headings |
| "Binary file" | Skip - smart-merge is for text only |
| Low similarity despite same name | Check if files are actually related |

Success Output

When successfully applying smart merge patterns:

✅ SKILL COMPLETE: smart-merge

Completed:
- [x] Document similarity analysis performed (SHA-256 hashes, section comparison)
- [x] Similarity score calculated (0-100%) with merge recommendation
- [x] Merge strategy selected (smart/prefer_a/prefer_b/longer)
- [x] Conflict resolution applied to different sections
- [x] Metadata reconciliation completed (version, date, status, authors)
- [x] Merged document generated with merge metadata footer

Outputs:
- Similarity analysis report (if --dry-run or analyze command)
- Merged document file at specified output path
- Merge metadata footer documenting sources and strategy used
- Duplicate detection report (if find command used)

Completion Checklist

Before marking this skill as complete, verify:

  • Document similarity analysis completed (hashes, section counts, similarity score)
  • Merge strategy appropriate for similarity level (>80% auto-merge, 50-80% review, <50% manual)
  • Unique sections from both files included in merge
  • Identical sections deduplicated (only one copy kept)
  • Different sections resolved per strategy (smart/prefer_a/prefer_b/longer)
  • Metadata reconciliation applied (highest version, most recent date, preferred status)
  • Merge metadata footer added to output documenting sources
  • Output file written to correct path
  • Original files preserved (not deleted) until verification

Failure Indicators

This skill has FAILED if:

  • ❌ Similarity analysis returns 0% for clearly related documents (hash mismatch)
  • ❌ Merge strategy "smart" used without LLM available (ANTHROPIC_API_KEY not set)
  • ❌ Unique sections from one file lost in merge
  • ❌ Identical sections duplicated instead of deduplicated
  • ❌ Metadata not reconciled (e.g., older version number selected)
  • ❌ Output file missing merge metadata footer
  • ❌ Binary file passed to smart-merge (text-only tool)
  • ❌ Original files deleted before merge verification

When NOT to Use

Do NOT use this skill when:

  • The files have completely different content (similarity <30%)
  • The files are binary (images, PDFs, compiled code)
  • The files share a name but serve unrelated purposes
  • A structural merge is needed (code with git conflicts)
  • The task is real-time collaborative editing (use operational transforms)

Use alternatives instead:

  • For unrelated files → Keep separate, do not merge
  • For binary files → Use binary diff tools or manual review
  • For git conflicts → Use git merge with conflict resolution
  • For code merging → Use semantic merge tools (e.g., SemanticMerge)
  • For collaborative editing → Use CRDTs or operational transforms

Anti-Patterns (Avoid)

| Anti-Pattern | Problem | Solution |
|--------------|---------|----------|
| Auto-merging low similarity (<50%) | Loses context, creates confusion | Review merge output or keep separate |
| Using --llm without API key | Fails with error, blocks workflow | Set ANTHROPIC_API_KEY or use "longer" strategy |
| Deleting originals before verification | Can't recover if merge incorrect | Verify merged output first |
| Ignoring merge recommendation | May merge unrelated files | Follow similarity-based recommendations |
| No section heading structure | Smart-merge can't extract sections | Ensure markdown files use proper headings |
| Batch merging without review | Cascading errors across files | Review each merge, especially <80% similarity |

Principles

This skill embodies CODITECT automation principles:

  • #1 Recycle → Extend → Re-Use → Create - Reuses existing document sections instead of duplicating content
  • #4 Keep It Simple - Provides clear merge strategies (smart/prefer_a/prefer_b/longer)
  • #5 Eliminate Ambiguity - Similarity score and recommendation eliminate guesswork
  • #6 Clear, Understandable, Explainable - Merge metadata footer documents exactly what was merged and how
  • #8 No Assumptions - Analyzes similarity first, recommends strategy, requires verification
  • #9 Research When in Doubt - LLM integration for "smart" strategy resolves ambiguous conflicts

Full Standard: CODITECT-STANDARD-AUTOMATION.md


  • Script: scripts/smart-merge.py
  • Agent: agents/document-merger.md
  • Command: commands/smart-merge.md
  • Hook: hooks/smart-merge.md

Author: CODITECT Core Team
Created: 2025-12-11