/transcript-normalize - Transcript Normalization
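Normalize transcript .txt files into structured Markdown, applying the optimal paragraphing rule (4-sentence target, 6-sentence cap, breaks at topic cues) and, when --rename is set, lowercase kebab-case file names.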
Usage
# Normalize a folder of .txt files
/transcript-normalize <input-dir> <output-dir>
# With renaming enabled
/transcript-normalize <input-dir> <output-dir> --rename
# Dry run (no writes)
/transcript-normalize <input-dir> <output-dir> --dry-run
# Preserve timestamps and de-hyphenate
/transcript-normalize <input-dir> <output-dir> --keep-timestamps --dehyphenate
# Remove fillers and emit report
/transcript-normalize <input-dir> <output-dir> --remove-fillers --report report.json
Options
| Option | Description |
|---|---|
| <input-dir> | Directory containing source .txt files |
| <output-dir> | Directory for generated .md files |
| --rename | Rename input files to lowercase kebab-case |
| --dry-run | Show planned actions without writing |
| (auto) | Speaker detection from leading timestamp line (default on) |
| --keep-timestamps | Preserve leading timestamps |
| --dehyphenate | Merge hyphenated line breaks |
| --remove-fillers | Remove filler words (um, uh, you know) |
| --speaker-labels | Detect inline speaker lines (Name: ...) |
| --report | Write JSON summary report to the specified path |
| --paragraphing optimal | Apply optimal paragraphing rule |
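
The text-cleanup options above can be sketched in Python as follows. This is a minimal illustration, not the actual code in scripts/transcript-normalize.py: the function names and regular expressions are assumptions, and the filler list contains only the three examples named in the table.

```python
import re
from pathlib import Path

# Assumed filler list: only the three examples from the options table.
FILLERS = ("you know", "um", "uh")


def kebab_case_name(filename: str) -> str:
    """Lowercase the stem and collapse each non-alphanumeric run into one hyphen."""
    path = Path(filename)
    stem = re.sub(r"[^a-z0-9]+", "-", path.stem.lower()).strip("-")
    return stem + path.suffix.lower()


def dehyphenate(text: str) -> str:
    """Join a word that was split by a hyphen at a line break."""
    # Caveat: this also merges genuine compounds that happen to wrap
    # at the hyphen, e.g. "well-" + newline + "known".
    return re.sub(r"(\w)-\n(\w)", r"\1\2", text)


def remove_fillers(text: str) -> str:
    """Drop filler words, a comma directly after them, and leftover double spaces."""
    alternatives = "|".join(re.escape(filler) for filler in FILLERS)
    pattern = r"\b(?:" + alternatives + r")\b,?\s*"
    cleaned = re.sub(pattern, "", text, flags=re.IGNORECASE)
    return re.sub(r" {2,}", " ", cleaned).strip()


if __name__ == "__main__":
    print(kebab_case_name("My Interview (Part 2).TXT"))
    print(dehyphenate("transcrip-\ntion is hard"))
    print(remove_fillers("So, um, we started uh early."))
```

A real implementation would likely need a longer filler list and a dictionary check before merging hyphenated words; both are left out here to keep the sketch short.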
System Prompt
EXECUTION DIRECTIVE: When /transcript-normalize is invoked, you MUST:
- Validate input/output paths
- Optionally rename files (lowercase kebab-case, no spaces)
- Read source TXT files
- Insert missing spaces after punctuation before sentence splitting
- Split sentences with abbreviation handling
- Apply paragraphing rules (4-sentence target, 6-sentence cap, breaks at topic cues)
- Emit Markdown, with a speaker line when one is detected
- Report a summary of files processed and files renamed
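
The spacing, sentence-splitting, and paragraphing steps above can be sketched as follows. This is one plausible reading of the rule, not the script's actual logic: it assumes a paragraph closes at a topic cue once the 4-sentence target is reached, and always at the 6-sentence cap. The abbreviation list, the topic cues, and all function names are hypothetical, since the source enumerates none of them.

```python
import re

# Hypothetical lists: the source does not enumerate abbreviations or topic cues.
ABBREVIATIONS = {"mr.", "mrs.", "ms.", "dr.", "prof.", "vs.", "etc.", "e.g.", "i.e."}
TOPIC_CUES = ("so,", "now,", "next,", "moving on", "anyway,")
TARGET_SENTENCES = 4  # preferred paragraph length
MAX_SENTENCES = 6     # hard cap


def fix_spacing(text: str) -> str:
    """Insert a missing space after . ! ? when lowercase precedes and a capital follows."""
    # Requiring a lowercase letter before the mark leaves "U.S.A" untouched.
    return re.sub(r"(?<=[a-z])([.!?])(?=[A-Z])", r"\1 ", text)


def split_sentences(text: str) -> list[str]:
    """Split after . ! ? and rejoin pieces that end in a known abbreviation."""
    sentences: list[str] = []
    carry = ""
    for piece in re.split(r"(?<=[.!?])\s+", text.strip()):
        if not piece:
            continue
        candidate = f"{carry} {piece}".strip()
        if candidate.split()[-1].lower() in ABBREVIATIONS:
            carry = candidate  # not a real sentence end; keep accumulating
        else:
            sentences.append(candidate)
            carry = ""
    if carry:
        sentences.append(carry)
    return sentences


def group_paragraphs(sentences: list[str]) -> list[str]:
    """Close a paragraph at a topic cue once the target is met, and always at the cap."""
    result: list[str] = []
    current: list[str] = []
    for sentence in sentences:
        starts_topic = sentence.lower().startswith(TOPIC_CUES)
        at_cap = len(current) >= MAX_SENTENCES
        at_target = len(current) >= TARGET_SENTENCES
        if current and (at_cap or (at_target and starts_topic)):
            result.append(" ".join(current))
            current = []
        current.append(sentence)
    if current:
        result.append(" ".join(current))
    return result


if __name__ == "__main__":
    raw = "Dr. Smith arrived.He spoke first. Then we ate. It was late. Now, the budget. It is tight."
    for paragraph in group_paragraphs(split_sentences(fix_spacing(raw))):
        print(paragraph, end="\n\n")
```

Spacing is fixed before splitting because the splitter relies on whitespace after the terminal punctuation; without that pass, "arrived.He" would stay inside one sentence.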
Execution Flow
python3 scripts/transcript-normalize.py \
--input <input-dir> \
--output <output-dir> \
--rename \
--paragraphing optimal
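
For orientation, the script's flags might be parsed along these lines. This is a sketch under stated assumptions: the use of argparse, the build_parser name, and the default of optimal for --paragraphing are guesses. Only the flag names themselves come from this document.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the flags documented for /transcript-normalize."""
    parser = argparse.ArgumentParser(
        prog="transcript-normalize.py",
        description="Normalize transcript .txt files into structured Markdown.",
    )
    parser.add_argument("--input", required=True,
                        help="Directory containing source .txt files")
    parser.add_argument("--output", required=True,
                        help="Directory for generated .md files")
    parser.add_argument("--rename", action="store_true",
                        help="Rename input files to lowercase kebab-case")
    parser.add_argument("--dry-run", action="store_true",
                        help="Show planned actions without writing")
    parser.add_argument("--keep-timestamps", action="store_true",
                        help="Preserve leading timestamps")
    parser.add_argument("--dehyphenate", action="store_true",
                        help="Merge hyphenated line breaks")
    parser.add_argument("--remove-fillers", action="store_true",
                        help="Remove filler words")
    parser.add_argument("--speaker-labels", action="store_true",
                        help="Detect inline speaker lines")
    parser.add_argument("--report", metavar="PATH",
                        help="Write a JSON summary report to PATH")
    # Assumption: "optimal" is both the only mode and the default.
    parser.add_argument("--paragraphing", choices=["optimal"], default="optimal",
                        help="Paragraphing rule to apply")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["--input", "in", "--output", "out", "--rename"])
    print(args)
```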
Success Output
DONE: /transcript-normalize
Input: <input-dir>
Output: <output-dir>
Files processed: N
Files renamed: M
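
The success output, and the JSON report written by --report, could be produced from a single summary record, as in this sketch. The RunSummary class and the JSON field names are hypothetical, because the source does not specify the report schema. Only the five printed lines mirror the output shown above.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class RunSummary:
    """Hypothetical record of one normalization run."""
    input_dir: str
    output_dir: str
    files_processed: int
    files_renamed: int


def format_summary(summary: RunSummary) -> str:
    """Render the plain-text success block."""
    return "\n".join([
        "DONE: /transcript-normalize",
        f"Input: {summary.input_dir}",
        f"Output: {summary.output_dir}",
        f"Files processed: {summary.files_processed}",
        f"Files renamed: {summary.files_renamed}",
    ])


def report_json(summary: RunSummary) -> str:
    """Serialize the same record as JSON (field names are assumed)."""
    return json.dumps(asdict(summary), indent=2)


if __name__ == "__main__":
    run = RunSummary("transcripts/raw", "transcripts/md", 3, 1)
    print(format_summary(run))
    print(report_json(run))
```

Deriving both outputs from one record keeps the console summary and the report file from drifting apart.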
Principles
This command embodies:
- #3 Keep It Simple - Single-purpose normalization flow
- #5 Eliminate Ambiguity - Explicit rules for sentence/paragraph splitting
- #6 Clear, Understandable, Explainable - Deterministic output structure
Full Standard: CODITECT-STANDARD-AUTOMATION.md
Related Commands
| Command | Purpose |
|---|---|
| /lowercase-migration | Enforce lowercase-kebab-case naming |
| /document | General document processing |
Version: 1.0.0 | Created: 2026-01-27 | Updated: 2026-01-27 | Author: CODITECT Team