C4 Model - Level 3: Component Level

Component Diagram: Deduplication Engine

Component Diagram: Artifact Generator

Component Descriptions

Deduplication Engine Components

Frame Loader

Responsibility: Load frame files in batches to manage memory
Interface: load_batch(paths: List[Path]) -> List[Image]
Key Feature: Streaming loader for large video files
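
The batching idea above can be sketched with the standard library alone. This is an illustrative sketch, not the project's loader: `iter_batches`, `load_in_batches`, and the `load_one` callback are hypothetical names, and a real loader would pass something like an image-opening function as `load_one`.

```python
from itertools import islice
from pathlib import Path
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def iter_batches(paths: Iterable[Path], batch_size: int) -> Iterator[List[Path]]:
    """Yield successive lists of at most batch_size paths, lazily."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(paths)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def load_in_batches(
    paths: Iterable[Path],
    batch_size: int,
    load_one: Callable[[Path], T],
) -> Iterator[List[T]]:
    """Load one batch at a time so only batch_size items are held at once."""
    for batch in iter_batches(paths, batch_size):
        # load_one is a caller-supplied (hypothetical) function, e.g. an image opener.
        yield [load_one(path) for path in batch]
```

Because both functions are generators, the next batch is read only when the consumer asks for it, which is what keeps memory bounded for long videos.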

Perceptual Hasher

Responsibility: Compute 64-bit perceptual hash for each frame
Algorithm: pHash (discrete cosine transform based)
Interface: compute_hash(image: Image) -> ImageHash
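
To make the DCT-based idea concrete, here is a rough pure-Python sketch of how a pHash-style hash can be derived: take a 2-D DCT of a small grayscale image, keep the low-frequency corner, and set one bit per coefficient depending on whether it exceeds the median. It assumes the caller has already converted the frame to a small square grayscale grid (for example 32x32); `dct_1d` and `phash_bits` are hypothetical names, and production code would normally rely on a library implementation instead.

```python
import math
from statistics import median
from typing import List, Sequence


def dct_1d(values: Sequence[float]) -> List[float]:
    """Unnormalised DCT-II of a 1-D sequence."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * (i + 0.5) * k / n) for i, v in enumerate(values))
        for k in range(n)
    ]


def phash_bits(pixels: Sequence[Sequence[float]], hash_size: int = 8) -> int:
    """Return a hash_size**2-bit integer hash of a grayscale pixel grid.

    Assumes the grid is already downscaled and is at least
    hash_size x hash_size in both dimensions.
    """
    # 2-D DCT: transform every row, then every column of the result.
    rows = [dct_1d(row) for row in pixels]
    cols = [dct_1d(col) for col in zip(*rows)]

    # Keep only the low-frequency corner of the coefficient grid.
    low = [cols[kx][ky] for kx in range(hash_size) for ky in range(hash_size)]
    threshold = median(low)

    # One bit per coefficient: 1 if above the median, else 0.
    bits = 0
    for coefficient in low:
        bits = (bits << 1) | int(coefficient > threshold)
    return bits
```

With `hash_size=8` the result fits in 64 bits, matching the hash width stated above.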

Hash Comparer

Responsibility: Calculate Hamming distance between hashes
Logic: Fast XOR + bit count operation
Interface: distance(hash1: ImageHash, hash2: ImageHash) -> int
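
The XOR-plus-bit-count logic can be illustrated on plain integers. This sketch assumes each hash is held as a Python `int`, which is a simplification of the `ImageHash` type named in the interface; `hamming_distance` is a hypothetical helper name.

```python
def hamming_distance(hash1: int, hash2: int) -> int:
    """Count the bit positions in which two integer hashes differ."""
    # XOR leaves a 1 wherever the inputs disagree; counting the 1s gives the distance.
    return bin(hash1 ^ hash2).count("1")
```

For two 64-bit hashes the result ranges from 0 (identical) to 64 (every bit differs).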

Duplicate Filter

Responsibility: Apply threshold to decide uniqueness
Configuration: threshold: int (default 10)
Logic: A frame is unique when its distance from the last unique frame exceeds the threshold (distance > threshold); otherwise it is a duplicate
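
As a small sketch of the threshold rule, the function below walks a sequence of integer hashes and flags each one as unique or duplicate by comparing it with the most recent unique hash. Representing hashes as integers and the name `filter_duplicates` are assumptions made for illustration only.

```python
from typing import Iterable, Iterator, Optional, Tuple


def filter_duplicates(
    hashes: Iterable[int],
    threshold: int = 10,
) -> Iterator[Tuple[int, bool]]:
    """Yield (hash, is_unique) pairs using a Hamming-distance threshold."""
    last_unique: Optional[int] = None
    for current in hashes:
        if last_unique is None:
            # Nothing to compare against yet, so the first frame is kept.
            is_unique = True
        else:
            distance = bin(current ^ last_unique).count("1")
            is_unique = distance > threshold
        if is_unique:
            # Only unique frames become the new reference point.
            last_unique = current
        yield current, is_unique
```

Note that a distance exactly equal to the threshold counts as a duplicate, since the comparison is strict.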

Unique Frame Writer

Responsibility: Tag and write unique frames
Metadata: Sets is_unique=true, stores similarity_score

Artifact Generator Components

Template Router

Responsibility: Select appropriate template based on artifact type
Logic: Maps artifact_type enum to template file
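
A minimal sketch of the enum-to-template mapping might look like the following. The lowercase enum values and the `route_template` name are assumptions; the `.j2` suffix and the `templates` directory follow the ArtifactGenerator code later in this document.

```python
from enum import Enum
from pathlib import Path


class ArtifactType(Enum):
    # Values are assumed; each one doubles as the template file stem.
    SDD = "sdd"
    TDD = "tdd"
    ADR = "adr"
    C4 = "c4"
    SUMMARY = "summary"
    GLOSSARY = "glossary"


TEMPLATE_DIR = Path("templates")


def route_template(artifact_type: ArtifactType) -> Path:
    """Map an artifact type to the path of its template file."""
    return TEMPLATE_DIR / f"{artifact_type.value}.j2"
```

Deriving the file name from the enum value means adding a new artifact type only requires a new enum member and a matching template file.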

Jinja2 Templates

Responsibility: Define structure and prompts for each artifact
Types:
- SDD: System design sections
- TDD: Technical implementation details
- ADR: Decision context, decision, consequences
- C4: C1-C3 diagram definitions
- Summary: Executive overview
- Glossary: Term definitions

LLM Renderers

Responsibility: Execute template through Claude API
Prompt Engineering: Structured prompts with examples
Output Parsing: Extract markdown from response
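
One plausible shape for the output-parsing step is shown below: if the model wraps its whole answer in a markdown code fence, the wrapper is removed; otherwise the text is returned trimmed. This is only a sketch under that assumption, and `extract_markdown` is a hypothetical helper, not the project's parser.

```python
FENCE = "`" * 3  # three backticks, built this way to keep the example readable


def extract_markdown(response_text: str) -> str:
    """Strip an outer markdown fence from an LLM response, if present."""
    text = response_text.strip()
    lines = text.splitlines()
    wrapped = (
        len(lines) >= 2
        and lines[0].strip() in (FENCE + "markdown", FENCE + "md")
        and lines[-1].strip() == FENCE
    )
    if wrapped:
        # Drop the opening and closing fence lines only.
        return "\n".join(lines[1:-1]).strip()
    # Inner fences (for example Mermaid diagrams) are left untouched.
    return text
```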

Post Processor

Responsibility: Validate and clean generated markdown
Checks:
- Mermaid syntax validation
- Markdown link integrity
- Header hierarchy

Front Matter Injector

Responsibility: Add YAML metadata header
Fields: id, type, title, version, date, parent_id
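
A possible sketch of front matter injection, using only the standard library, is shown here. It writes the six fields listed above in a fixed order and encodes string values as JSON strings, which are also valid YAML double-quoted scalars. The function name, the dict-based signature, and the choice to emit `null` for missing fields are assumptions.

```python
import json
from typing import Dict, Optional

FIELD_ORDER = ("id", "type", "title", "version", "date", "parent_id")


def inject_front_matter(body: str, fields: Dict[str, Optional[str]]) -> str:
    """Prepend a YAML front matter block to a markdown body."""
    lines = ["---"]
    for key in FIELD_ORDER:
        value = fields.get(key)
        # Missing or None values are written as YAML null.
        rendered = "null" if value is None else json.dumps(str(value))
        lines.append(f"{key}: {rendered}")
    lines.append("---")
    return "\n".join(lines) + "\n\n" + body.lstrip("\n")
```

Quoting every value avoids YAML surprises such as a version like `1.0` being parsed as a float.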

Component Interaction Sequences

Deduplication Sequence

Artifact Generation Sequence

Code-Level Design (C4 Level 3.5)

DeduplicationEngine Class

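from dataclasses import dataclass
from pathlib import Path
from typing import Iterator, List, Optional

import imagehash
from imagehash import ImageHash
from PIL import Image


@dataclass
class FrameResult:
    """Per-frame output of the deduplication pass."""
    path: Path
    hash: ImageHash
    is_unique: bool
    similarity_score: float  # 0.0 (no shared bits) to 1.0 (identical hash)


class DeduplicationEngine: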
    """
    Removes perceptually similar video frames.
    """
    
    def __init__(
        self,
        threshold: int = 10,
        hash_size: int = 8,  # 8x8 = 64-bit hash
        batch_size: int = 100
    ):
        self.threshold = threshold
        self.hash_size = hash_size
        self.batch_size = batch_size
        self._last_hash: Optional[ImageHash] = None
    
    def process_frames(
        self,
        frame_paths: List[Path]
    ) -> Iterator[FrameResult]:
        """
        Process frames and yield results with deduplication info.
        
        Yields FrameResult with fields:
        - path: Path to frame
        - hash: Perceptual hash
        - is_unique: Boolean
        - similarity_score: Float 0-1
        """
        for batch in self._batches(frame_paths):
            for path in batch:
                result = self._process_single(path)
                yield result
                
                if result.is_unique:
                    self._last_hash = result.hash
    
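    def _batches(self, frame_paths: List[Path]) -> Iterator[List[Path]]:
        # Slice the path list into chunks of at most batch_size entries.
        for start in range(0, len(frame_paths), self.batch_size):
            yield frame_paths[start:start + self.batch_size]

    def _process_single(self, path: Path) -> FrameResult:
        # Close the file handle as soon as the hash has been computed.
        with Image.open(path) as image:
            current_hash = imagehash.phash(image, hash_size=self.hash_size)

        if self._last_hash is None:
            # First frame: there is no earlier frame to be similar to.
            return FrameResult(
                path=path,
                hash=current_hash,
                is_unique=True,
                similarity_score=0.0
            )

        # Subtracting two ImageHash objects gives their Hamming distance.
        distance = current_hash - self._last_hash
        is_unique = distance > self.threshold
        # The hash has hash_size ** 2 bits, so that is the maximum distance.
        similarity = 1.0 - (distance / (self.hash_size ** 2))

        return FrameResult(
            path=path,
            hash=current_hash,
            is_unique=is_unique,
            similarity_score=max(0.0, similarity)
        )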

ArtifactGenerator Class

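from pathlib import Path

import jinja2
from anthropic import AsyncAnthropic

# ArtifactType, SynthesizedContent, GenerationContext, and Artifact are
# project types assumed to be defined elsewhere.


class ArtifactGenerator:
    """
    Generates structured documentation artifacts from content.
    """
    
    TEMPLATE_DIR = Path("templates/")
    
    def __init__(self, llm_client: AsyncAnthropic):
        # generate() awaits the client call, so the async SDK client is assumed here.
        self.llm = llm_client
        self.env = jinja2.Environment(
            loader=jinja2.FileSystemLoader(self.TEMPLATE_DIR)
        )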
    
    async def generate(
        self,
        artifact_type: ArtifactType,
        content: SynthesizedContent,
        context: GenerationContext
    ) -> Artifact:
        # Load and render template
        template = self.env.get_template(f"{artifact_type.value}.j2")
        prompt = template.render(
            content=content,
            context=context
        )
        
        # Generate via LLM
        response = await self.llm.messages.create(
            model="claude-3-5-sonnet-20241022",
            max_tokens=8192,
            messages=[{"role": "user", "content": prompt}]
        )
        
        raw_content = response.content[0].text
        
        # Post-process
        cleaned = self._validate_markdown(raw_content)
        
        # Add front matter
        final = self._inject_front_matter(
            cleaned,
            artifact_type=artifact_type,
            content_id=content.id
        )
        
        return Artifact(
            type=artifact_type,
            content=final,
            metadata=self._extract_metadata(final)
        )
