Batch Classification Runner
High-performance batch processing for classifying large document sets.
Features:
- Resumable processing with checkpoint files
- Parallel document processing
- Detailed progress reporting
- Multiple output formats
- Statistics and analytics
Usage:
# Classify all docs in repository
python batch_classify.py
# Resume from checkpoint
python batch_classify.py --resume
# Process specific directory with workers
python batch_classify.py --path docs/ --workers 4
# Generate detailed report
python batch_classify.py --report
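The flags in the usage examples above could be wired up with argparse roughly as follows. This is a minimal sketch, not the actual parser in batch_classify.py: the function name sketch_parser, the defaults, and the help strings are assumptions; only the flag names --path, --workers, --resume, and --report come from the examples.

```python
import argparse


def sketch_parser() -> argparse.ArgumentParser:
    """Hypothetical parser accepting the flags shown in the usage examples."""
    parser = argparse.ArgumentParser(
        prog="batch_classify.py",
        description="Batch classification runner (illustrative sketch).",
    )
    # Flag names are taken from the usage examples; defaults are assumed.
    parser.add_argument("--path", default=".", help="directory to classify")
    parser.add_argument("--workers", type=int, default=1, help="number of parallel workers")
    parser.add_argument("--resume", action="store_true", help="resume from checkpoint")
    parser.add_argument("--report", action="store_true", help="generate detailed report")
    return parser


# Parse the same arguments as the "specific directory with workers" example.
args = sketch_parser().parse_args(["--path", "docs/", "--workers", "4"])
print(args.path, args.workers, args.resume, args.report)
```

Boolean flags use action="store_true" so that omitting them leaves the behaviour off, which matches how --resume and --report appear without values in the examples.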
File: batch_classify.py
Classes
BatchState
State for resumable batch processing.
BatchStats
Statistics from batch classification.
BatchRunner
High-performance batch classification runner.
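To illustrate how a state object can make a batch resumable, here is a minimal sketch of a checkpoint round trip. It is not the real BatchState: the class name CheckpointState, its fields, the JSON format, and the helper functions save_checkpoint and load_checkpoint are all assumptions made for the example.

```python
import json
import os
import tempfile
from dataclasses import asdict, dataclass, field
from pathlib import Path


@dataclass
class CheckpointState:
    """Hypothetical stand-in for BatchState; the field names are assumed."""
    processed: list[str] = field(default_factory=list)
    failed: list[str] = field(default_factory=list)

    def to_dict(self) -> dict:
        return asdict(self)

    @classmethod
    def from_dict(cls, data: dict) -> "CheckpointState":
        # Missing keys fall back to empty lists so older checkpoints still load.
        return cls(
            processed=list(data.get("processed", [])),
            failed=list(data.get("failed", [])),
        )


def save_checkpoint(state: CheckpointState, path: Path) -> None:
    """Write to a temporary file, then rename it over the checkpoint."""
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(state.to_dict(), handle)
    os.replace(tmp_name, path)  # replaces the old checkpoint in one step


def load_checkpoint(path: Path) -> CheckpointState:
    """Return the saved state, or an empty state if no checkpoint exists."""
    if not path.exists():
        return CheckpointState()
    with path.open(encoding="utf-8") as handle:
        return CheckpointState.from_dict(json.load(handle))
```

Writing to a temporary file first means an interrupted run leaves either the previous checkpoint or the new one on disk, never a half-written file.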
Functions
create_parser()
Create argument parser.
main()
Main entry point.
to_dict()
No description
from_dict(cls, data)
No description
run()
Execute batch classification.
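As a rough illustration of what a parallel, resumable run can look like, the sketch below skips documents that were already processed and classifies the rest on a thread pool. It does not reproduce BatchRunner.run(): the function run_batch, its parameters, the progress format, and the toy classifier are hypothetical, and a thread pool is only one possible choice of worker model.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Iterable


def run_batch(
    doc_ids: Iterable[str],
    classify: Callable[[str], str],
    already_done: set[str],
    workers: int = 4,
) -> tuple[dict[str, str], dict[str, str]]:
    """Classify pending documents in parallel; return (results, errors)."""
    # Resuming: anything recorded in a previous run is skipped.
    pending = [d for d in doc_ids if d not in already_done]
    results: dict[str, str] = {}
    errors: dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(classify, d): d for d in pending}
        for count, future in enumerate(as_completed(futures), start=1):
            doc_id = futures[future]
            try:
                results[doc_id] = future.result()
            except Exception as exc:  # one bad document should not stop the batch
                errors[doc_id] = str(exc)
            print(f"[{count}/{len(pending)}] {doc_id}")
    return results, errors


def toy_classify(doc_id: str) -> str:
    """Toy classifier used only to exercise the sketch."""
    if doc_id.endswith(".bin"):
        raise ValueError("unsupported")
    return "guide" if "guide" in doc_id else "other"


results, errors = run_batch(
    ["a_guide.md", "b.md", "c.bin", "d.md"],
    toy_classify,
    already_done={"d.md"},
    workers=2,
)
```

Collecting failures in a separate mapping keeps a single unreadable document from aborting the batch and gives a final report something to count.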
Usage
python batch_classify.py