Deduplicate all tables in context.db - ensure unique entries.
Removes duplicate entries across all major tables:
- decisions (by decision text)
- code_patterns (by code content)
- doc_index (by file_path) - already has UNIQUE constraint
Usage:
    python3 scripts/deduplicate-all-tables.py           # Dry run
    python3 scripts/deduplicate-all-tables.py --apply   # Apply changes
    python3 scripts/deduplicate-all-tables.py --stats   # Show statistics only
File: deduplicate-all-tables.py
Functions
get_all_stats(conn)
Get statistics for all major tables.
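As a rough illustration of what per-table statistics might look like, the sketch below counts rows in each table named above. The helper name `table_stats` and the `TABLES` tuple are hypothetical, and the real `get_all_stats` may report more (for example, duplicate counts).

```python
import sqlite3

# Table names are taken from the list above; everything else is an assumption.
TABLES = ("decisions", "code_patterns", "doc_index")


def table_stats(conn: sqlite3.Connection) -> dict:
    """Return {table_name: row_count}, skipping tables that do not exist."""
    stats = {}
    for table in TABLES:
        exists = conn.execute(
            "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
            (table,),
        ).fetchone()
        if exists:
            # Table names cannot be bound as parameters; TABLES is a fixed,
            # trusted tuple, so interpolating it here is safe.
            stats[table] = conn.execute(
                f"SELECT COUNT(*) FROM {table}"
            ).fetchone()[0]
    return stats
```

Checking `sqlite_master` first lets the sketch run against a partially populated database instead of raising on a missing table.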
deduplicate_decisions(conn, dry_run)
Deduplicate decisions table by decision text.
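One common way to deduplicate a SQLite table by a text column is to keep the row with the lowest `rowid` in each group and delete the rest. The sketch below assumes that strategy and a column named `decision`; the helper name `dedupe_by_column` is hypothetical, and the actual script may choose which row to keep differently.

```python
import sqlite3


def dedupe_by_column(conn, table, column, dry_run=True):
    """Count (and optionally delete) rows that duplicate an earlier row.

    For each distinct value of `column`, the row with the smallest rowid is
    kept. Returns the number of duplicate rows found. `table` and `column`
    must be trusted identifiers because they are interpolated into the SQL.
    Note that GROUP BY treats NULLs as one group, so only one NULL row is kept.
    """
    keep = f"SELECT MIN(rowid) FROM {table} GROUP BY {column}"
    duplicates = conn.execute(
        f"SELECT COUNT(*) FROM {table} WHERE rowid NOT IN ({keep})"
    ).fetchone()[0]
    if not dry_run and duplicates:
        conn.execute(f"DELETE FROM {table} WHERE rowid NOT IN ({keep})")
        conn.commit()
    return duplicates
```

With `dry_run=True` the function only reports how many rows would be removed, which mirrors the script's default dry-run behavior.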
deduplicate_code_patterns(conn, dry_run)
Deduplicate code_patterns table by code content.
add_unique_constraints(conn, dry_run)
Add UNIQUE constraints to prevent future duplicates.
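SQLite cannot add a UNIQUE constraint to an existing table through ALTER TABLE, so a unique index is the usual substitute. The sketch below assumes that approach; the helper name `add_unique_index` and the index naming scheme are hypothetical, and the script itself may do this differently. The index can only be created after duplicates have been removed.

```python
import sqlite3


def add_unique_index(conn, table, column, dry_run=True):
    """Create a unique index on table(column); return the SQL used.

    In dry-run mode the statement is only returned, not executed.
    `table` and `column` must be trusted identifiers.
    """
    sql = (
        f"CREATE UNIQUE INDEX IF NOT EXISTS idx_{table}_{column}_unique "
        f"ON {table} ({column})"
    )
    if not dry_run:
        # Raises sqlite3.IntegrityError if duplicate values still exist.
        conn.execute(sql)
        conn.commit()
    return sql
```

Once the index exists, inserting a second row with the same value raises `sqlite3.IntegrityError`, which prevents future duplicates at the database level.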
vacuum_database(conn)
Vacuum the database to reclaim space.
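Deleting rows does not shrink a SQLite file by default; VACUUM rebuilds the file to reclaim the freed pages. A minimal sketch, assuming a standard `sqlite3` connection with default transaction handling (the helper name `vacuum` is hypothetical):

```python
import sqlite3


def vacuum(conn: sqlite3.Connection) -> None:
    """Commit pending work, then rebuild the database file."""
    # VACUUM fails inside an open transaction, so commit first.
    conn.commit()
    conn.execute("VACUUM")
```

VACUUM temporarily needs free disk space roughly equal to the size of the database, so it is best run after all deletions have been applied.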
main()
Command-line entry point for the script.
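The `--apply` and `--stats` options from the usage text could be parsed along these lines. This is only a sketch: the helper name `parse_args` and the help strings are assumptions, and the original script may handle its arguments differently.

```python
import argparse


def parse_args(argv=None):
    """Parse the two documented flags; with no flags the run is a dry run."""
    parser = argparse.ArgumentParser(
        description="Deduplicate all tables in context.db"
    )
    parser.add_argument(
        "--apply", action="store_true",
        help="apply changes (default is a dry run)",
    )
    parser.add_argument(
        "--stats", action="store_true",
        help="show statistics only",
    )
    return parser.parse_args(argv)
```

A caller would then derive the `dry_run` argument passed to the deduplication functions as `not args.apply`.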
Usage
python3 scripts/deduplicate-all-tables.py [--apply] [--stats]