Batch Classification Runner

High-performance batch processing for classifying large document sets. Features:

  • Resumable processing with checkpoint files
  • Parallel document processing
  • Detailed progress reporting
  • Multiple output formats
  • Statistics and analytics

Usage:

# Classify all docs in repository
python batch_classify.py

# Resume from checkpoint
python batch_classify.py --resume

# Process specific directory with workers
python batch_classify.py --path docs/ --workers 4

# Generate detailed report
python batch_classify.py --report

File: batch_classify.py

Classes

BatchState

State for resumable batch processing.

BatchStats

Statistics from batch classification.
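
The page does not list the fields that BatchStats tracks. As a rough illustration of the kind of bookkeeping such a class might do, the sketch below counts documents per label and derives a throughput figure. The class name BatchStatsSketch and every field and method on it are assumptions, not the real BatchStats API.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class BatchStatsSketch:
    """Hypothetical stand-in for BatchStats; all field names are assumptions."""

    total: int = 0
    failed: int = 0
    by_label: Counter = field(default_factory=Counter)
    elapsed_seconds: float = 0.0

    def record(self, label=None):
        """Count one document; label=None marks a failed classification."""
        self.total += 1
        if label is None:
            self.failed += 1
        else:
            self.by_label[label] += 1

    @property
    def docs_per_second(self):
        # Guard against division by zero for empty or instantaneous runs.
        if self.elapsed_seconds <= 0:
            return 0.0
        return self.total / self.elapsed_seconds


stats = BatchStatsSketch()
for label in ["guide", "reference", "guide", None]:
    stats.record(label)
stats.elapsed_seconds = 2.0
print(stats.total, stats.failed, dict(stats.by_label), stats.docs_per_second)
```

Keeping per-label counts in a Counter makes it straightforward to merge statistics from several workers or from a resumed run by adding the counters together.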

BatchRunner

High-performance batch classification runner.
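
How BatchRunner parallelizes work is not documented here. One plausible arrangement, shown below as a sketch only, is a thread pool that skips documents already recorded as done and isolates per-document failures. The functions classify and run_batch and the already_done parameter are hypothetical names introduced for illustration.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def classify(doc_path):
    """Placeholder classifier (an assumption): labels by file extension."""
    return "markdown" if doc_path.endswith(".md") else "other"


def run_batch(doc_paths, workers=4, already_done=()):
    """Classify documents in parallel, skipping ones finished in an earlier run."""
    done = set(already_done)
    pending = [p for p in doc_paths if p not in done]
    results = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Map each future back to its path so results can be keyed by document.
        futures = {pool.submit(classify, p): p for p in pending}
        for future in as_completed(futures):
            path = futures[future]
            try:
                results[path] = future.result()
            except Exception as exc:
                # Record the failure and continue so one bad file
                # does not abort the whole batch.
                results[path] = f"error: {exc}"
    return results


results = run_batch(["a.md", "b.txt", "c.md"], workers=2, already_done=["b.txt"])
print(results)
```

Threads suit I/O-bound classification such as calls to a remote model. For CPU-bound work, ProcessPoolExecutor from the same module offers the same submit interface.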

Functions

create_parser()

Create argument parser.
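
The four flags shown in the usage examples (--resume, --path, --workers, --report) could be declared with argparse roughly as follows. This is a sketch, not the actual create_parser(): the function name create_parser_sketch, the defaults, and the help strings are assumptions.

```python
import argparse


def create_parser_sketch():
    """Build a parser covering the flags shown in the usage examples."""
    parser = argparse.ArgumentParser(
        prog="batch_classify.py",
        description="Batch-classify documents.",
    )
    # Defaults below are assumptions; the source does not state them.
    parser.add_argument("--path", default=".", help="directory to classify")
    parser.add_argument("--workers", type=int, default=1,
                        help="number of parallel workers")
    parser.add_argument("--resume", action="store_true",
                        help="resume from the last checkpoint")
    parser.add_argument("--report", action="store_true",
                        help="generate a detailed report")
    return parser


args = create_parser_sketch().parse_args(["--path", "docs/", "--workers", "4"])
print(args.path, args.workers, args.resume, args.report)
```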

main()

Main entry point.

to_dict()

No description

from_dict(cls, data)

No description
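
The page does not say which class to_dict() and from_dict(cls, data) belong to. The cls parameter suggests from_dict is a classmethod, and a serialization pair like this is a common way to write and reload a checkpoint. The sketch below assumes they sit on a state object and round-trips it through JSON. BatchStateSketch and its fields are hypothetical.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class BatchStateSketch:
    """Hypothetical checkpoint state; field names are assumptions."""

    processed: list = field(default_factory=list)
    failed: list = field(default_factory=list)

    def to_dict(self):
        """Return a JSON-serializable dictionary of the state."""
        return asdict(self)

    @classmethod
    def from_dict(cls, data):
        """Rebuild the state, tolerating missing keys in older checkpoints."""
        return cls(
            processed=list(data.get("processed", [])),
            failed=list(data.get("failed", [])),
        )


state = BatchStateSketch(processed=["docs/a.md"], failed=["docs/b.md"])
payload = json.dumps(state.to_dict())          # what a checkpoint file might hold
restored = BatchStateSketch.from_dict(json.loads(payload))
print(restored == state)
```

Reading keys with data.get() rather than indexing lets a checkpoint written by an older version load even if fields were added later.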

run()

Execute batch classification.
