Sidecar Metadata Files vs Database
Both patterns are used; regulators care that metadata is preserved immutably and consistently, not where it lives.
Comparison Table
| Aspect | Sidecar/YAML-only | Central DB (Postgres) |
|---|---|---|
| Source of truth | Metadata travels with content in Git/object store | Clear, queryable system of record |
| Human review | Easy to view/edit in editors; good for dev workflows | Requires UI/tooling; less diff-friendly |
| Query & analytics | Needs indexing step; no ad-hoc SQL | Natural for dashboards, joins, reports |
| Immutability (WORM) | Git history can act as a log, but Git alone may not satisfy WORM | DB must be backed by WORM/append-only logs |
| Complexity | Simpler infra; more complex analytics | More infra; simpler analytics |
| Regulatory | Must prove metadata wasn't altered independently | Stronger central controls, validations |
Sidecar Pattern (Frontmatter/YAML)
Structure
documents/
├── policy-001.md
├── policy-001.yaml # Optional sidecar
└── policy-002.md
Or embedded:
# policy-001.md
---
doc_id: "uuid"
title: "HIPAA Privacy Policy"
retention_category: "HIPAA-6Y"
# ... all metadata in frontmatter
---
# Policy Content
...
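To make the embedded layout concrete, here is a minimal sketch of splitting frontmatter from the Markdown body. The function name `split_frontmatter` is hypothetical, and the hand-rolled parser only handles flat `key: value` pairs like those in the example above; a real pipeline would more likely use a full YAML parser.

```python
def split_frontmatter(text: str) -> tuple[dict[str, str], str]:
    """Split a Markdown document into (metadata, body).

    Handles only flat `key: value` pairs; this is a stand-in for a
    full YAML parser, not a replacement for one.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no frontmatter block present
    try:
        # Index of the closing '---' delimiter.
        end = next(
            i for i, line in enumerate(lines[1:], start=1) if line.strip() == "---"
        )
    except StopIteration:
        return {}, text  # unterminated block: treat everything as content
    metadata: dict[str, str] = {}
    for line in lines[1:end]:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and YAML comments
        key, sep, value = stripped.partition(":")
        if sep:
            metadata[key.strip()] = value.strip().strip('"')
    body = "\n".join(lines[end + 1:])
    return metadata, body


doc = '''---
doc_id: "uuid"
title: "HIPAA Privacy Policy"
retention_category: "HIPAA-6Y"
---
# Policy Content
'''
metadata, body = split_frontmatter(doc)
```

Documents without a frontmatter block come back with empty metadata, which lets a caller decide whether to look for a separate sidecar file instead.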
Pros
- Metadata travels with content
- Version controlled in Git
- Easy for developers to edit
- No additional infrastructure
Cons
- No ad-hoc SQL queries
- Requires parsing for analytics
- Must build indexing pipeline
- WORM compliance harder to prove
Database Pattern (PostgreSQL)
Structure
CREATE TABLE documents (
    doc_id UUID PRIMARY KEY,
    path TEXT,
    content_hash TEXT
    -- ...
);
CREATE TABLE document_metadata (
    doc_id UUID PRIMARY KEY,
    title TEXT,
    retention_category TEXT
    -- ...
);
Pros
- Rich querying with SQL
- Easy compliance dashboards
- Central validation rules
- Natural for RBAC/ABAC
- Audit logging available through extensions or triggers (for example, pgAudit)
Cons
- Additional infrastructure
- Sync issues with content
- Developers need tooling
- Must archive DB for WORM
Hybrid Approach (Recommended)
For HIPAA/FINRA-grade systems, combine both:
Authoring Layer
- Metadata in frontmatter/sidecar
- Tightly coupled with Markdown
- Developer-friendly workflow
Operational Layer
- Parsed metadata normalized into PostgreSQL
- Rich querying and analytics
- RBAC/ABAC enforcement
Archival Layer
- Both content and metadata archived to WORM
- Preserved together for compliance
- Immutable snapshots
Implementation Flow
┌─────────────────────────────────┐
│ Markdown + Frontmatter │
│ (Git Repository) │
└───────────────┬─────────────────┘
                │ Parse & Validate
                ▼
┌─────────────────────────────────┐
│ PostgreSQL │
│ - Normalized metadata │
│ - RBAC/ABAC attributes │
│ - Retention tracking │
└───────────────┬─────────────────┘
                │ Archive
                ▼
┌─────────────────────────────────┐
│ WORM Storage │
│ - Content + metadata blob │
│ - Object lock retention │
│ - Tamper-evident │
└─────────────────────────────────┘
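The final stage of the flow stores content and metadata as one blob. A minimal sketch of what such a record could look like follows, assuming a JSON payload with a SHA-256 digest; the function names and the record layout are illustrative assumptions, not a prescribed format.

```python
import hashlib
import json


def build_archive_record(content: str, metadata: dict[str, str]) -> dict[str, str]:
    """Bundle content and metadata into one payload plus its SHA-256 digest."""
    # Canonical serialization (sorted keys, fixed separators) so the same
    # input always produces the same bytes and therefore the same digest.
    payload = json.dumps(
        {"content": content, "metadata": metadata},
        sort_keys=True,
        ensure_ascii=False,
        separators=(",", ":"),
    )
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return {"payload": payload, "sha256": digest}


def verify_archive_record(record: dict[str, str]) -> bool:
    """Recompute the digest and compare it with the recorded one."""
    actual = hashlib.sha256(record["payload"].encode("utf-8")).hexdigest()
    return actual == record["sha256"]


record = build_archive_record(
    "# Policy Content\n",
    {"doc_id": "uuid", "retention_category": "HIPAA-6Y"},
)
# Simulate someone changing the metadata without touching the digest.
tampered = dict(record, payload=record["payload"].replace("HIPAA-6Y", "HIPAA-1Y"))
```

Because content and metadata share one digest, changing either one alone is detectable. The digest only provides tamper evidence if it is itself stored somewhere it cannot be rewritten, which is the role object lock retention plays in the WORM layer.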
Sync Strategy
On Markdown Change
- Parse frontmatter → validate
- Upsert metadata row(s) in PostgreSQL
- Trigger reindex in search engine
- Archive to WORM if approved
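The validate and upsert steps above can be sketched as follows. This is an illustration under stated assumptions: SQLite (from the Python standard library) stands in for PostgreSQL so the example is self-contained, the required-field list is assumed, and the reindex and archive steps are left out. The `ON CONFLICT ... DO UPDATE` form shown is accepted by both SQLite and PostgreSQL.

```python
import sqlite3

# Assumed set of required fields; a real schema would define its own.
REQUIRED_FIELDS = ("doc_id", "title", "retention_category")


def validate(metadata: dict[str, str]) -> None:
    """Raise ValueError if any required field is missing or empty."""
    missing = [field for field in REQUIRED_FIELDS if not metadata.get(field)]
    if missing:
        raise ValueError(f"missing required metadata: {', '.join(missing)}")


def upsert_metadata(conn: sqlite3.Connection, metadata: dict[str, str]) -> None:
    """Insert the metadata row, or update it if doc_id already exists."""
    validate(metadata)
    conn.execute(
        """
        INSERT INTO document_metadata (doc_id, title, retention_category)
        VALUES (:doc_id, :title, :retention_category)
        ON CONFLICT (doc_id) DO UPDATE SET
            title = excluded.title,
            retention_category = excluded.retention_category
        """,
        metadata,
    )
    conn.commit()


# In-memory database standing in for the operational PostgreSQL instance.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE document_metadata ("
    "doc_id TEXT PRIMARY KEY, title TEXT, retention_category TEXT)"
)
upsert_metadata(
    conn, {"doc_id": "uuid", "title": "Draft", "retention_category": "HIPAA-6Y"}
)
upsert_metadata(
    conn,
    {"doc_id": "uuid", "title": "HIPAA Privacy Policy", "retention_category": "HIPAA-6Y"},
)
rows = conn.execute(
    "SELECT doc_id, title, retention_category FROM document_metadata"
).fetchall()
```

Running the upsert twice with the same `doc_id` leaves a single row with the latest values, so the sync step can be re-run safely after every commit.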
Integrity Verification
- Verify that the content hash stored in the DB matches the hash of the file
- Run periodic reconciliation jobs
- Alert on mismatches
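A reconciliation job of this kind could look like the sketch below. It assumes SHA-256 content hashes and represents the hashes recorded in the DB as a plain dictionary keyed by relative path; `find_mismatches` is a hypothetical helper, and the alerting step is left to the caller.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def find_mismatches(recorded: dict[str, str], root: Path) -> list[str]:
    """Return paths that are missing or whose hash differs from the recorded one."""
    mismatches = []
    for rel_path, expected in sorted(recorded.items()):
        file_path = root / rel_path
        if not file_path.is_file() or sha256_of(file_path) != expected:
            mismatches.append(rel_path)
    return mismatches


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "policy-001.md").write_text("# Policy Content\n", encoding="utf-8")
    (root / "policy-002.md").write_text("# Other Policy\n", encoding="utf-8")
    # Stand-in for the content_hash values stored in the database.
    recorded = {
        name: sha256_of(root / name) for name in ("policy-001.md", "policy-002.md")
    }
    # Simulate an edit that bypassed the sync pipeline.
    (root / "policy-002.md").write_text("# Other Policy (edited)\n", encoding="utf-8")
    mismatches = find_mismatches(recorded, root)
```

Treating a missing file as a mismatch means deletions that bypass the pipeline are reported alongside silent edits.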
Regulatory Compliance
For Sidecar-Only
- Archive entire repo to WORM snapshots
- Prove metadata wasn't altered independently
- Cryptographic signing of commits
For Database
- DB backed by WORM/append-only logs
- Audit trail in separate WORM store
- Periodic DB snapshots to WORM
For Hybrid (Best)
- Frontmatter for authoring
- DB for operations
- Both archived to WORM together
- Clear chain of custody