Agent Skills Framework Extension

Document Merging Skill

When to Use This Skill

Use this skill when implementing document merging patterns in your codebase.

How to Use This Skill

  1. Review the patterns and examples below
  2. Apply the relevant patterns to your implementation
  3. Follow the best practices outlined in this skill

Intelligent document merging with conflict detection, semantic analysis, and automated resolution.

Core Capabilities

  1. Conflict Detection - Identify merge conflicts
  2. Semantic Merging - Content-aware merging
  3. Diff Resolution - Resolve differences intelligently
  4. Content Harmonization - Unify style and structure
  5. Version Reconciliation - Merge multiple versions
  6. Automated Strategies - Smart merge decision making

Document Merger

# scripts/document-merger.py

from dataclasses import dataclass
from enum import Enum
from typing import List, Optional, Tuple
import difflib
import re


class ConflictType(Enum):
    CONTENT = "content"
    STRUCTURE = "structure"
    FORMATTING = "formatting"
    SEMANTIC = "semantic"


@dataclass
class Conflict:
    type: ConflictType
    location: str
    version_a: str
    version_b: str
    suggested_resolution: Optional[str]
    confidence: float


@dataclass
class MergeResult:
    merged_content: str
    conflicts: List[Conflict]
    auto_resolved: int
    manual_review_needed: int
    merge_strategy_used: str


class DocumentMerger:
    """Intelligently merge documents"""

    def merge(
        self,
        version_a: str,
        version_b: str,
        base: Optional[str] = None,
        strategy: str = 'semantic'
    ) -> MergeResult:
        """Merge two document versions"""

        # Detect conflicts
        conflicts = self._detect_conflicts(version_a, version_b, base)

        # Auto-resolve conflicts
        auto_resolved = []
        manual_conflicts = []

        for conflict in conflicts:
            resolution = self._auto_resolve(conflict, strategy)
            # Compare against None: an empty string is a valid resolution
            if resolution is not None:
                auto_resolved.append((conflict, resolution))
            else:
                manual_conflicts.append(conflict)

        # Build merged content
        merged = self._build_merged(
            version_a,
            version_b,
            auto_resolved,
            manual_conflicts
        )

        return MergeResult(
            merged_content=merged,
            conflicts=conflicts,
            auto_resolved=len(auto_resolved),
            manual_review_needed=len(manual_conflicts),
            merge_strategy_used=strategy
        )

    def _detect_conflicts(
        self,
        version_a: str,
        version_b: str,
        base: Optional[str]
    ) -> List[Conflict]:
        """Detect all conflicts between versions"""
        # Note: `base` is accepted but not used yet; this is a two-way comparison
        conflicts: List[Conflict] = []

        # Split into lines for comparison
        lines_a = version_a.split('\n')
        lines_b = version_b.split('\n')

        # Use difflib to find differences
        diff = difflib.unified_diff(lines_a, lines_b, lineterm='')

        location = "unknown"
        in_hunk = False
        version_a_lines: List[str] = []
        version_b_lines: List[str] = []

        def flush() -> None:
            """Record the pending changed lines as a single conflict"""
            if not (version_a_lines or version_b_lines):
                return
            text_a = '\n'.join(version_a_lines)
            text_b = '\n'.join(version_b_lines)
            conflicts.append(Conflict(
                type=self._classify_conflict(text_a, text_b),
                location=location,
                version_a=text_a,
                version_b=text_b,
                suggested_resolution=None,
                confidence=0.0
            ))
            version_a_lines.clear()
            version_b_lines.clear()

        for line in diff:
            if line.startswith('@@'):
                # Location marker: close any pending change, then start a new hunk
                flush()
                location = line
                in_hunk = True
            elif not in_hunk:
                # The '---' / '+++' file headers come before the first hunk
                continue
            elif line.startswith('-'):
                version_a_lines.append(line[1:])
            elif line.startswith('+'):
                version_b_lines.append(line[1:])
            elif line.startswith(' '):
                # Context line - create conflict if we have differences
                flush()

        # A change at the very end of the document has no trailing context line
        flush()

        return conflicts

    def _classify_conflict(self, version_a: str, version_b: str) -> ConflictType:
        """Classify type of conflict"""
        # Formatting conflict if only leading/trailing whitespace differs.
        # Checked first so that short, whitespace-padded text is not
        # misread as a content conflict by the similarity test below.
        if version_a.strip() == version_b.strip():
            return ConflictType.FORMATTING

        # Content conflict if substantially different
        similarity = difflib.SequenceMatcher(None, version_a, version_b).ratio()

        if similarity < 0.3:
            return ConflictType.CONTENT

        # Structure conflict if exactly one side is a heading
        if self._is_heading(version_a) != self._is_heading(version_b):
            return ConflictType.STRUCTURE

        # Otherwise semantic
        return ConflictType.SEMANTIC

    def _is_heading(self, text: str) -> bool:
        """Check if text starts with an ATX (# style) heading"""
        return bool(re.match(r'^#{1,6}\s+', text))

    def _auto_resolve(
        self,
        conflict: Conflict,
        strategy: str
    ) -> Optional[str]:
        """Attempt to auto-resolve conflict"""
        # Note: `strategy` is reported in MergeResult but does not yet change these rules

        # Formatting conflicts: keep version B (newer)
        if conflict.type == ConflictType.FORMATTING:
            conflict.suggested_resolution = conflict.version_b
            conflict.confidence = 0.9
            return conflict.version_b

        # Content conflicts with high similarity: merge.
        # _classify_conflict only assigns CONTENT below 0.3 similarity, so this
        # branch applies to Conflict objects built by hand, not to detected ones.
        if conflict.type == ConflictType.CONTENT:
            similarity = difflib.SequenceMatcher(
                None,
                conflict.version_a,
                conflict.version_b
            ).ratio()

            if similarity > 0.8:
                # Take longer version (more detailed)
                if len(conflict.version_b) > len(conflict.version_a):
                    conflict.suggested_resolution = conflict.version_b
                else:
                    conflict.suggested_resolution = conflict.version_a
                conflict.confidence = 0.7
                return conflict.suggested_resolution

        # Semantic conflicts: try to combine
        if conflict.type == ConflictType.SEMANTIC:
            combined = self._combine_semantic(
                conflict.version_a,
                conflict.version_b
            )
            if combined:
                conflict.suggested_resolution = combined
                conflict.confidence = 0.6
                return combined

        return None  # Cannot auto-resolve (includes all STRUCTURE conflicts)

    def _combine_semantic(self, version_a: str, version_b: str) -> Optional[str]:
        """Combine semantically similar content"""
        # Simple strategy: concatenate unique sentences.
        # A list plus a `seen` set keeps the original sentence order;
        # a plain set union would return sentences in arbitrary order.
        seen = set()
        sentences = []
        for sentence in version_a.split('.') + version_b.split('.'):
            sentence = sentence.strip()
            if sentence and sentence not in seen:
                seen.add(sentence)
                sentences.append(sentence)

        combined = '. '.join(sentences)

        return combined if combined else None

    def _build_merged(
        self,
        version_a: str,
        version_b: str,
        auto_resolved: List[Tuple[Conflict, str]],
        manual_conflicts: List[Conflict]
    ) -> str:
        """Build merged document"""
        # Start with version B as base
        merged = version_b

        # Apply auto-resolutions
        for conflict, resolution in auto_resolved:
            # Replace conflict in merged content (skip empty text: '' matches anywhere)
            if conflict.version_b and conflict.version_b in merged:
                merged = merged.replace(conflict.version_b, resolution, 1)

        # Add markers for manual conflicts
        for conflict in manual_conflicts:
            marker = (
                "\n<<<<<<< Version A\n"
                f"{conflict.version_a}\n"
                "=======\n"
                f"{conflict.version_b}\n"
                ">>>>>>> Version B\n"
            )
            if conflict.version_b and conflict.version_b in merged:
                merged = merged.replace(conflict.version_b, marker, 1)
            else:
                # Nothing to anchor on (for example, lines deleted in B):
                # append the marker so the conflict is not silently dropped
                merged += marker

        return merged

# Usage

merger = DocumentMerger()

doc_a = """# API Documentation

## Authentication

Use JWT tokens for authentication.

## Endpoints

GET /api/users - List users
"""

doc_b = """# API Documentation

## Authentication

Use OAuth2 for authentication.

## Endpoints

GET /api/users - Retrieve all users
POST /api/users - Create new user
"""

result = merger.merge(doc_a, doc_b, strategy='semantic')

print(f"Auto-resolved: {result.auto_resolved}")
print(f"Manual review needed: {result.manual_review_needed}")
print("\nMerged content:")
print(result.merged_content)
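`_detect_conflicts` parses the stream of lines that `difflib.unified_diff` produces, so it helps to see that stream for a tiny input. The sketch below is illustrative only: the sample lines are made up, and `lineterm=''` mirrors the call in the merger. It also shows one edge case worth knowing about when filtering by prefix: a removed line whose own text begins with `--` comes out with the same `---` prefix as the file header.

```python
import difflib

old_lines = ["intro", "Use JWT tokens.", "outro"]
new_lines = ["intro", "Use OAuth2.", "outro"]

# With lineterm='' no newline is appended to the control lines
diff_lines = list(difflib.unified_diff(old_lines, new_lines, lineterm=''))

for line in diff_lines:
    print(repr(line))
# '--- '
# '+++ '
# '@@ -1,3 +1,3 @@'
# ' intro'
# '-Use JWT tokens.'
# '+Use OAuth2.'
# ' outro'

# Edge case: the removed line "-- note" is emitted as "--- note",
# which a startswith('---') test cannot tell apart from the header
tricky = list(difflib.unified_diff(["-- note"], [], lineterm=''))
print(repr(tricky[-1]))  # '--- note'
```

Each line carries a one-character prefix: a space for context, `-` for text only in the first version, and `+` for text only in the second. The `@@` line opens a hunk and is what the merger stores as the conflict location.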

Conflict Resolver

# scripts/conflict-resolver.py

from dataclasses import dataclass
from typing import List, Optional
import re

# Conflict and ConflictType are the classes defined in scripts/document-merger.py


@dataclass
class Resolution:
    strategy: str
    result: str
    confidence: float
    explanation: str


class ConflictResolver:
    """Resolve merge conflicts intelligently"""

    STRATEGIES = [
        'take_both',
        'take_longer',
        'take_newer',  # listed, but _apply_strategy has no branch for it yet
        'combine',
        'prefer_structured'
    ]

    def resolve(self, conflict: Conflict, context: Optional[str] = None) -> Resolution:
        """Resolve conflict using best strategy"""
        # Try each strategy and score
        resolutions = []

        for strategy in self.STRATEGIES:
            result = self._apply_strategy(conflict, strategy, context)
            if result:
                resolutions.append(result)

        # Return best resolution (highest confidence)
        if resolutions:
            return max(resolutions, key=lambda r: r.confidence)
        else:
            return Resolution(
                strategy='manual',
                result="CONFLICT: Choose between A or B",
                confidence=0.0,
                explanation="No automatic resolution possible"
            )

    def _apply_strategy(
        self,
        conflict: Conflict,
        strategy: str,
        context: Optional[str]
    ) -> Optional[Resolution]:
        """Apply specific resolution strategy"""

        if strategy == 'take_both':
            # Combine both versions
            result = f"{conflict.version_a}\n\n{conflict.version_b}"
            return Resolution(
                strategy='take_both',
                result=result,
                confidence=0.6,
                explanation="Combined both versions"
            )

        elif strategy == 'take_longer':
            # Take longer (more detailed) version; ties go to version A
            if len(conflict.version_b) > len(conflict.version_a):
                result = conflict.version_b
            else:
                result = conflict.version_a

            return Resolution(
                strategy='take_longer',
                result=result,
                confidence=0.7,
                explanation="Selected longer version"
            )

        elif strategy == 'combine':
            # Intelligent combination
            combined = self._intelligent_combine(conflict)
            if combined:
                return Resolution(
                    strategy='combine',
                    result=combined,
                    confidence=0.8,
                    explanation="Intelligently combined content"
                )

        elif strategy == 'prefer_structured':
            # Prefer version with better structure; ties go to version A
            score_a = self._structure_score(conflict.version_a)
            score_b = self._structure_score(conflict.version_b)

            if score_b > score_a:
                return Resolution(
                    strategy='prefer_structured',
                    result=conflict.version_b,
                    confidence=0.75,
                    explanation="Preferred better-structured version"
                )
            else:
                return Resolution(
                    strategy='prefer_structured',
                    result=conflict.version_a,
                    confidence=0.75,
                    explanation="Preferred better-structured version"
                )

        return None

    def _intelligent_combine(self, conflict: Conflict) -> Optional[str]:
        """Intelligently combine conflicting content"""
        # Extract unique bullet points
        bullets_a = re.findall(r'^[-*]\s+(.+)$', conflict.version_a, re.MULTILINE)
        bullets_b = re.findall(r'^[-*]\s+(.+)$', conflict.version_b, re.MULTILINE)

        if bullets_a or bullets_b:
            # Combine unique bullets; dict.fromkeys keeps first-seen order,
            # whereas set() would return them in arbitrary order
            all_bullets = list(dict.fromkeys(bullets_a + bullets_b))
            return '\n'.join(f"- {b}" for b in all_bullets)

        # Extract paragraphs
        paras_a = conflict.version_a.split('\n\n')
        paras_b = conflict.version_b.split('\n\n')

        # Take unique paragraphs
        all_paras = []
        seen = set()
        for para in paras_a + paras_b:
            if para.strip() and para.strip() not in seen:
                all_paras.append(para)
                seen.add(para.strip())

        if all_paras:
            return '\n\n'.join(all_paras)

        return None

    def _structure_score(self, text: str) -> float:
        """Score structural quality"""
        score = 0.0

        # Has headings
        if re.search(r'^#{1,6}\s+', text, re.MULTILINE):
            score += 0.3

        # Has lists
        if re.search(r'^[-*]\s+', text, re.MULTILINE):
            score += 0.2

        # Has paragraphs
        if '\n\n' in text:
            score += 0.2

        # Good length
        if 50 < len(text) < 500:
            score += 0.3

        return score

# Usage

resolver = ConflictResolver()

conflict = Conflict(
    type=ConflictType.CONTENT,
    location="line 10",
    version_a="Use JWT authentication",
    version_b="Use OAuth2 authentication with JWT tokens",
    suggested_resolution=None,
    confidence=0.0
)

resolution = resolver.resolve(conflict)
print(f"Strategy: {resolution.strategy}")
print(f"Result: {resolution.result}")
print(f"Confidence: {resolution.confidence:.2f}")

Content Harmonizer

// scripts/content-harmonizer.ts
interface StyleGuide {
  headingStyle: 'atx' | 'setext'; // # vs underline
  listStyle: '-' | '*' | '+';
  codeBlockStyle: 'fenced' | 'indented';
  lineLength: number;
}

class ContentHarmonizer {
  /**
   * Harmonize content style and structure
   */
  harmonize(content: string, styleGuide: StyleGuide): string {
    let harmonized = content;

    // Normalize headings
    harmonized = this.normalizeHeadings(harmonized, styleGuide.headingStyle);

    // Normalize lists
    harmonized = this.normalizeLists(harmonized, styleGuide.listStyle);

    // Normalize code blocks
    harmonized = this.normalizeCodeBlocks(harmonized, styleGuide.codeBlockStyle);

    // Wrap long lines
    harmonized = this.wrapLines(harmonized, styleGuide.lineLength);

    return harmonized;
  }

  private normalizeHeadings(content: string, style: 'atx' | 'setext'): string {
    if (style === 'atx') {
      // Convert setext to atx (# style)
      content = content.replace(/^(.+)\n=+$/gm, '# $1');
      content = content.replace(/^(.+)\n-+$/gm, '## $1');
    }
    // Note: atx-to-setext conversion is not implemented; 'setext' leaves content unchanged

    return content;
  }

  private normalizeLists(content: string, marker: string): string {
    // Convert all list markers to preferred style
    // ([ \t]+ rather than \s+ so the match never runs across a line break)
    return content.replace(/^[*+-][ \t]+/gm, `${marker} `);
  }

  private normalizeCodeBlocks(content: string, style: 'fenced' | 'indented'): string {
    if (style === 'fenced') {
      // Wrap each run of indented lines in one fence.
      // Assumes 4-space indentation marks a code block.
      return content.replace(/(?:^ {4}.+\n?)+/gm, (block) => {
        const body = block.replace(/\n$/, '').replace(/^ {4}/gm, '');
        const trailing = block.endsWith('\n') ? '\n' : '';
        return '```\n' + body + '\n```' + trailing;
      });
    }
    // Note: fenced-to-indented conversion is not implemented

    return content;
  }

  private wrapLines(content: string, maxLength: number): string {
    const lines = content.split('\n');
    const wrapped: string[] = [];

    for (const line of lines) {
      if (line.length <= maxLength || line.startsWith('#') || line.startsWith('```')) {
        wrapped.push(line);
      } else {
        // Wrap at word boundaries
        const words = line.split(' ');
        let currentLine = '';

        for (const word of words) {
          // Only break when there is something to flush, so an over-long
          // first word does not produce an empty line
          if (currentLine && (currentLine + word).length > maxLength) {
            wrapped.push(currentLine.trim());
            currentLine = word + ' ';
          } else {
            currentLine += word + ' ';
          }
        }

        if (currentLine) {
          wrapped.push(currentLine.trim());
        }
      }
    }

    return wrapped.join('\n');
  }
}

// Usage
const harmonizer = new ContentHarmonizer();

const content = `
Some heading
============

* List item 1
+ List item 2
- List item 3

    code example
    more code
`;

const styleGuide: StyleGuide = {
  headingStyle: 'atx',
  listStyle: '-',
  codeBlockStyle: 'fenced',
  lineLength: 100
};

const harmonized = harmonizer.harmonize(content, styleGuide);
console.log(harmonized);

## Usage Examples

### Document Merging

Apply document-merging skill to merge two versions of README with conflict detection


### Conflict Resolution

Apply document-merging skill to auto-resolve formatting and semantic conflicts


### Content Harmonization

Apply document-merging skill to harmonize markdown style across merged documents

Integration Points

  • thoughts-analysis-patterns - Semantic analysis
  • session-analysis-patterns - Version tracking
  • research-patterns - Content validation

Success Output

When successful, this skill MUST output:

✅ SKILL COMPLETE: document-merging

Completed:
- [x] Conflicts detected and classified
- [x] Auto-resolution attempted (X/Y conflicts resolved)
- [x] Manual conflicts marked with clear boundaries
- [x] Content harmonized to style guide
- [x] Merged document validated

Outputs:
- Merged document with conflicts resolved
- Conflict report showing resolution strategy
- Style harmonization applied
- Auto-resolved: X conflicts
- Manual review needed: Y conflicts

Completion Checklist

Before marking this skill as complete, verify:

  • Both document versions loaded successfully
  • Conflict detection completed (all differences found)
  • Auto-resolution attempted with confidence scoring
  • Unresolved conflicts marked with clear delimiters
  • Content harmonization applied (headings, lists, formatting)
  • Merged output validates (no syntax errors)
  • Conflict report generated with resolution strategies
  • Manual review items clearly identified
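
The "confidence scoring" item can be made concrete by gating auto-resolution on a threshold, so that anything below it is routed to manual review. This is a minimal sketch, not part of the scripts above: `ScoredResolution`, `split_by_confidence`, and the 0.85 default are hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ScoredResolution:
    """Hypothetical record: one proposed resolution and its confidence."""
    location: str
    result: str
    confidence: float


def split_by_confidence(
    items: List[ScoredResolution],
    threshold: float = 0.85,  # assumed cut-off; tune per project
) -> Tuple[List[ScoredResolution], List[ScoredResolution]]:
    """Return (accepted, needs_review) partitioned by the threshold."""
    accepted = [item for item in items if item.confidence >= threshold]
    needs_review = [item for item in items if item.confidence < threshold]
    return accepted, needs_review


proposals = [
    ScoredResolution("line 4", "Use OAuth2 for authentication.", 0.9),
    ScoredResolution("line 8", "GET /api/users - List users", 0.6),
]
accepted, needs_review = split_by_confidence(proposals)
print(len(accepted), len(needs_review))  # 1 1
```

With the confidence values used in `_auto_resolve` (0.9 for formatting, 0.7 for content, 0.6 for semantic), a cut-off of 0.85 would accept only formatting fixes automatically.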

Failure Indicators

This skill has FAILED if:

  • ❌ Document versions could not be parsed
  • ❌ Conflict detection produced no results when differences exist
  • ❌ Auto-resolution created invalid merged content
  • ❌ Manual conflict markers are malformed or missing
  • ❌ Content harmonization broke document structure
  • ❌ Merged output has syntax errors or broken links
  • ❌ No conflict report generated
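
The "malformed or missing" marker check can be automated with a small state machine. The sketch below assumes Git-style delimiters (`<<<<<<<`, `=======`, `>>>>>>>`), each on its own line; `check_conflict_markers` is a hypothetical helper, not part of the scripts above. One known limitation: a setext heading underlined with exactly seven `=` characters would be reported as a stray separator.

```python
from typing import List


def check_conflict_markers(text: str) -> List[str]:
    """Return a list of problems found in the conflict markers of `text`."""
    problems: List[str] = []
    state = "outside"  # outside -> in_a -> in_b -> outside

    for number, line in enumerate(text.split("\n"), start=1):
        if line.startswith("<<<<<<<"):
            if state != "outside":
                problems.append(f"line {number}: conflict opened inside another conflict")
            state = "in_a"
        elif line.rstrip() == "=======":
            if state != "in_a":
                problems.append(f"line {number}: separator outside a conflict")
            else:
                state = "in_b"
        elif line.startswith(">>>>>>>"):
            if state != "in_b":
                problems.append(f"line {number}: conflict closed without a separator")
            state = "outside"

    if state != "outside":
        problems.append("end of document: conflict block never closed")
    return problems


good = "intro\n<<<<<<< Version A\nUse JWT\n=======\nUse OAuth2\n>>>>>>> Version B\noutro"
bad = "<<<<<<< Version A\nUse JWT\n>>>>>>> Version B"
print(check_conflict_markers(good))  # []
print(check_conflict_markers(bad))   # ['line 3: conflict closed without a separator']
```

An empty list means every conflict block opens, separates, and closes in order, which covers the marker-related failure indicator before the merged document is handed to a reviewer.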

When NOT to Use

Do NOT use this skill when:

  • Single document with no merge required (use standard editing instead)
  • Documents are in different formats (convert first)
  • Binary file merging (use specialized binary merge tools)
  • Simple text append operations (use concatenation)
  • Version control system conflicts (use git-merge-conflict-resolution skill)
  • Documents have incompatible schemas (reconcile schemas first)
  • Auto-merge without review is acceptable (use simpler merge)

Use alternative skills:

  • git-merge-conflict-resolution - For git conflicts
  • schema-migration-patterns - For schema conflicts
  • content-transformation-patterns - For format conversion

Anti-Patterns (Avoid)

| Anti-Pattern | Problem | Solution |
|---|---|---|
| Auto-resolving all conflicts | High-risk decisions made without review | Only auto-resolve low-risk, high-confidence conflicts (formatting, whitespace) |
| Ignoring semantic meaning | Content-based conflicts treated as text diffs | Use semantic analysis to understand content intent |
| No conflict markers | Manual review impossible | Always mark unresolved conflicts with clear delimiters |
| Applying wrong style guide | Inconsistent merged output | Detect or specify style guide before harmonization |
| Merging without base version | Two-way merge cannot tell which side changed, so it is less accurate than a three-way merge | Use base version when available for better conflict detection |
| No validation after merge | Broken output goes undetected | Always validate merged content (syntax, links, structure) |
| Skipping conflict report | No audit trail of decisions | Generate detailed report showing all resolutions |
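
The base-version row rests on the standard three-way rule: if only one side differs from the base, take that side; if both differ from the base and from each other, report a conflict. The sketch below applies that rule to documents already split into named sections. `three_way_pick`, `merge_sections`, and the section-dictionary representation are assumptions made for illustration, not part of `DocumentMerger`.

```python
from typing import Dict, List, Optional, Tuple


def three_way_pick(
    base: Optional[str], a: Optional[str], b: Optional[str]
) -> Tuple[Optional[str], bool]:
    """Return (text, is_conflict) for one section; None means 'absent'."""
    if a == b:
        return a, False   # both sides agree (or both removed it)
    if a == base:
        return b, False   # only B changed
    if b == base:
        return a, False   # only A changed
    return None, True     # both changed, differently


def merge_sections(
    base: Dict[str, str], a: Dict[str, str], b: Dict[str, str]
) -> Tuple[Dict[str, str], List[str]]:
    """Merge section dictionaries; return (merged, conflicting section names)."""
    merged: Dict[str, str] = {}
    conflicts: List[str] = []
    # dict.fromkeys de-duplicates the names while keeping first-seen order
    for name in dict.fromkeys([*base, *a, *b]):
        text, is_conflict = three_way_pick(base.get(name), a.get(name), b.get(name))
        if is_conflict:
            conflicts.append(name)
        elif text is not None:
            merged[name] = text
    return merged, conflicts


base = {"auth": "Use JWT.", "endpoints": "GET /api/users"}
a = {"auth": "Use JWT.", "endpoints": "GET /api/users - List users"}
b = {"auth": "Use OAuth2.", "endpoints": "GET /api/users"}
merged, conflicts = merge_sections(base, a, b)
print(merged)     # {'auth': 'Use OAuth2.', 'endpoints': 'GET /api/users - List users'}
print(conflicts)  # []
```

A two-way comparison of `a` and `b` alone would flag both sections as conflicts; with the base available, each change is attributed to one side and merges cleanly.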

Principles

This skill embodies:

  • #5 Eliminate Ambiguity - Clear conflict markers and resolution explanations
  • #6 Clear, Understandable, Explainable - Explicit merge strategies and confidence scores
  • #8 No Assumptions - Ask for style guide, don't assume merge strategy
  • #11 Reliability - Validate merged output, detect errors early
  • Trust & Transparency - Show all conflicts, explain auto-resolution decisions

Full Standard: CODITECT-STANDARD-AUTOMATION.md


Version: 1.1.0 | Updated: 2026-01-04 | Author: CODITECT Team