CODITECT Audio2Text - Project Structure
Overview
This document describes the complete structure of the CODITECT Audio2Text project.
Directory Tree
coditect-audio2text/
│
├── backend/  # Backend API Service (FastAPI/Python)
│   ├── src/
│   │   ├── __init__.py
│   │   ├── main.py  # Application entry point
│   │   ├── api/  # API layer
│   │   │   ├── __init__.py
│   │   │   └── routes/  # API route definitions
│   │   │       ├── __init__.py
│   │   │       ├── transcription.py  # Transcription endpoints
│   │   │       ├── batch.py  # Batch processing endpoints
│   │   │       └── health.py  # Health check endpoints
│   │   ├── config/  # Configuration management
│   │   │   ├── __init__.py
│   │   │   ├── settings.py  # Application settings
│   │   │   └── logging_config.py  # Logging configuration
│   │   ├── models/  # Data models and schemas
│   │   │   ├── __init__.py
│   │   │   └── schemas.py  # Pydantic models
│   │   ├── services/  # Business logic services
│   │   │   ├── __init__.py
│   │   │   ├── download_service.py  # YouTube download logic
│   │   │   ├── transcription_service.py  # Whisper transcription
│   │   │   ├── processing_service.py  # Audio processing
│   │   │   └── job_manager.py  # Job queue management
│   │   └── utils/  # Utility functions
│   │       ├── __init__.py
│   │       ├── validators.py  # Input validation
│   │       ├── file_utils.py  # File operations
│   │       └── exceptions.py  # Custom exceptions
│   ├── tests/  # Backend tests
│   │   ├── unit/  # Unit tests
│   │   └── integration/  # Integration tests
│   └── requirements.txt  # Python dependencies
│
├── frontend/  # Frontend Web Application (React)
│   ├── src/
│   │   ├── main.tsx  # Application entry
│   │   ├── app.tsx  # Main App component
│   │   ├── index.css  # Global styles
│   │   ├── components/  # Reusable UI components
│   │   │   └── navigation.tsx  # Navigation component
│   │   ├── pages/  # Page components
│   │   │   ├── home.tsx  # Home page
│   │   │   ├── jobs.tsx  # Jobs listing page
│   │   │   └── about.tsx  # About page
│   │   ├── services/  # API client services
│   │   │   ├── api.ts  # API client
│   │   │   └── websocket.ts  # WebSocket client
│   │   ├── styles/  # Styling and themes
│   │   │   └── theme.ts  # MUI theme configuration
│   │   └── utils/  # Frontend utilities
│   │       └── helpers.ts  # Helper functions
│   ├── public/  # Static assets
│   ├── tests/  # Frontend tests
│   ├── package.json  # Node dependencies
│   ├── vite.config.ts  # Vite configuration
│   ├── tsconfig.json  # TypeScript configuration
│   └── index.html  # HTML template
│
├── core/  # Core Processing Library
│   ├── src/
│   │   ├── __init__.py
│   │   ├── download/  # Download functionality
│   │   │   ├── __init__.py
│   │   │   └── ytdlp_wrapper.py  # yt-dlp wrapper
│   │   ├── transcription/  # Transcription functionality
│   │   │   ├── __init__.py
│   │   │   ├── whisper_wrapper.py  # Whisper wrapper
│   │   │   └── model_manager.py  # Model management
│   │   ├── processing/  # Audio processing
│   │   │   ├── __init__.py
│   │   │   ├── audio_converter.py  # Audio format conversion
│   │   │   └── format_handlers.py  # Output format handlers
│   │   └── shared/  # Shared utilities
│   │       ├── __init__.py
│   │       ├── types.py  # Type definitions
│   │       └── utils.py  # Common utilities
│   ├── tests/  # Core library tests
│   ├── setup.py  # Package setup
│   └── README.md  # Core library documentation
│
├── docs/  # Documentation
│   ├── specs/  # Technical specifications
│   │   ├── sdd.md  # System Design Document
│   │   └── tdd.md  # Technical Design Document
│   ├── architecture/  # Architecture documentation
│   │   └── adr-001-model-selection.md  # Architecture Decision Records
│   ├── api/  # API documentation
│   │   └── README.md  # API guide (to be created)
│   └── guides/  # User guides
│       ├── user-guide.md  # End-user guide
│       └── developer-guide.md  # Developer guide
│
├── config/  # Shared configuration files
│   └── (configuration files)
│
├── data/  # Data directories (git-ignored)
│   ├── input/  # Input files
│   │   └── .gitkeep
│   ├── output/  # Transcription outputs
│   │   └── .gitkeep
│   ├── cache/  # Temporary cache
│   │   └── .gitkeep
│   └── models/  # Whisper model storage
│       └── .gitkeep
│
├── scripts/  # Utility scripts
│   └── (CLI scripts, automation)
│
├── .github/  # GitHub configuration
│   └── workflows/  # GitHub Actions workflows
│
├── .env.example  # Environment variables template
├── .gitignore  # Git ignore rules
├── docker-compose.yml  # Docker Compose configuration
├── Dockerfile  # Docker image definition
├── LICENSE  # Proprietary license
├── Makefile  # Build automation
├── project-structure.md  # This file
└── README.md  # Project overview
Key Files Description
Root Level
- README.md: Project overview, features, and quick start guide
- LICENSE: Proprietary software license (AZ1.AI INC.)
- Makefile: Build automation and development commands
- Dockerfile: Container image for backend service
- docker-compose.yml: Multi-container orchestration
- .env.example: Environment variables template
- .gitignore: Files and directories to exclude from git
Backend (/backend)
Main Application:
src/main.py: FastAPI application initialization and routing
API Layer:
api/routes/transcription.py: Transcription job endpoints
api/routes/batch.py: Batch processing endpoints
api/routes/health.py: Health check and monitoring
Configuration:
config/settings.py: Application settings using Pydantic
config/logging_config.py: Logging setup and configuration
Data Models:
models/schemas.py: Request/response models and validation
Services:
services/download_service.py: YouTube audio download
services/transcription_service.py: Whisper integration
services/processing_service.py: Audio processing pipeline
services/job_manager.py: Job queue and status tracking
Dependencies:
requirements.txt: Python package dependencies
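The job queue and status tracking attributed to services/job_manager.py could be sketched roughly as follows. This is a minimal in-memory illustration only: the names `JobStatus`, `Job`, and `JobManager`, the status values, and the method signatures are assumptions made for this example and are not taken from the project's code.

```python
import uuid
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class JobStatus(str, Enum):
    # Hypothetical status values; the real service may define others.
    QUEUED = "queued"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"


@dataclass
class Job:
    job_id: str
    url: str
    status: JobStatus = JobStatus.QUEUED
    error: Optional[str] = None


class JobManager:
    """In-memory job registry (assumes a single process, no persistence)."""

    def __init__(self) -> None:
        self._jobs: Dict[str, Job] = {}

    def create(self, url: str) -> Job:
        # Every new job starts in the QUEUED state with a random ID.
        job = Job(job_id=uuid.uuid4().hex, url=url)
        self._jobs[job.job_id] = job
        return job

    def update(self, job_id: str, status: JobStatus,
               error: Optional[str] = None) -> Job:
        job = self._jobs[job_id]  # raises KeyError for unknown IDs
        job.status = status
        job.error = error
        return job

    def get(self, job_id: str) -> Optional[Job]:
        return self._jobs.get(job_id)


manager = JobManager()
job = manager.create("https://example.com/video")
manager.update(job.job_id, JobStatus.RUNNING)
print(manager.get(job.job_id).status.value)
```

A production job manager would likely need persistence and locking; the sketch only shows the create, update, and lookup cycle.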
Frontend (/frontend)
Application:
src/main.tsx: React application entry point
src/app.tsx: Main application component with routing
Pages:
pages/home.tsx: Transcription submission interface
pages/jobs.tsx: Job monitoring and management
pages/about.tsx: Application information
Components:
components/navigation.tsx: Main navigation bar
Services:
services/api.ts: REST API client
services/websocket.ts: WebSocket for real-time updates
Configuration:
vite.config.ts: Vite build configuration
tsconfig.json: TypeScript compiler options
package.json: npm dependencies and scripts
Core Library (/core)
Download Module:
download/ytdlp_wrapper.py: yt-dlp integration
Transcription Module:
transcription/whisper_wrapper.py: OpenAI Whisper wrapper
transcription/model_manager.py: Model caching and loading
Processing Module:
processing/audio_converter.py: FFmpeg integration
processing/format_handlers.py: Output format generation
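As a rough illustration of what the two processing modules might do, the sketch below builds an FFmpeg command line for audio conversion and renders transcript segments as SRT. The function names, the 16 kHz mono target, and the segment shape (dicts with `start`, `end`, and `text` keys) are assumptions chosen for this example, not the project's actual interfaces.

```python
from typing import Dict, List


def build_ffmpeg_command(src: str, dst: str, sample_rate: int = 16000) -> List[str]:
    """Build an FFmpeg argument list converting `src` to mono audio at `sample_rate` Hz."""
    return [
        "ffmpeg",
        "-y",                     # overwrite the output file if it exists
        "-i", src,                # input file
        "-ac", "1",               # downmix to a single channel
        "-ar", str(sample_rate),  # resample to the target rate
        dst,
    ]


def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    millis = int(round(seconds * 1000))
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments: List[Dict]) -> str:
    """Render segments (dicts with 'start', 'end', 'text') as SRT cues."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{format_timestamp(seg['start'])} --> {format_timestamp(seg['end'])}\n"
            f"{seg['text'].strip()}\n"
        )
    # A blank line separates consecutive cues.
    return "\n".join(blocks)


print(build_ffmpeg_command("data/input/talk.m4a", "data/cache/talk.wav"))
print(to_srt([{"start": 0.0, "end": 1.5, "text": " Hello "}]))
```

The argument list can be passed to `subprocess.run(cmd, check=True)` on a machine where FFmpeg is installed; building the list separately keeps the conversion logic testable without invoking FFmpeg.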
Shared:
shared/types.py: Common type definitions
shared/utils.py: Utility functions
Documentation (/docs)
Specifications:
specs/sdd.md: System architecture and design
specs/tdd.md: Technical implementation details
Architecture:
architecture/adr-001-model-selection.md: Model selection rationale
Guides:
guides/user-guide.md: End-user documentation
guides/developer-guide.md: Development setup and practices
Data Flow
- User Input → Frontend (React)
- API Request → Backend (FastAPI)
- Job Creation → Job Manager
- Download → Download Service → yt-dlp → Audio File
- Process → Processing Service → FFmpeg → Normalized Audio
- Transcribe → Transcription Service → Whisper → Text Output
- Format → Format Handlers → Multiple Output Files
- Response → Frontend → User Download
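The stages listed above can be pictured as a chain of functions that pass a shared context along. The sketch below uses stand-in stages that only record placeholder values; it does not call yt-dlp, FFmpeg, or Whisper, and every name in it (`PipelineContext`, `run_pipeline`, the stage functions, the file paths) is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class PipelineContext:
    """State handed from stage to stage (all field names are hypothetical)."""
    url: str
    audio_path: str = ""
    normalized_path: str = ""
    text: str = ""
    outputs: Dict[str, str] = field(default_factory=dict)
    history: List[str] = field(default_factory=list)


Stage = Callable[[PipelineContext], PipelineContext]


def download(ctx: PipelineContext) -> PipelineContext:
    # Stand-in for Download Service -> yt-dlp -> Audio File.
    ctx.audio_path = "data/cache/audio.m4a"
    ctx.history.append("download")
    return ctx


def process(ctx: PipelineContext) -> PipelineContext:
    # Stand-in for Processing Service -> FFmpeg -> Normalized Audio.
    ctx.normalized_path = "data/cache/audio.wav"
    ctx.history.append("process")
    return ctx


def transcribe(ctx: PipelineContext) -> PipelineContext:
    # Stand-in for Transcription Service -> Whisper -> Text Output.
    ctx.text = "placeholder transcript"
    ctx.history.append("transcribe")
    return ctx


def format_outputs(ctx: PipelineContext) -> PipelineContext:
    # Stand-in for Format Handlers -> Multiple Output Files.
    ctx.outputs = {"txt": "data/output/job.txt", "srt": "data/output/job.srt"}
    ctx.history.append("format")
    return ctx


def run_pipeline(url: str, stages: List[Stage]) -> PipelineContext:
    ctx = PipelineContext(url=url)
    for stage in stages:
        ctx = stage(ctx)  # each stage receives the previous stage's result
    return ctx


result = run_pipeline(
    "https://example.com/video",
    [download, process, transcribe, format_outputs],
)
print(result.history)
```

Keeping each stage as a plain function makes it straightforward to test stages in isolation or to swap one out.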
Development Workflow
Backend Development
cd backend
pip install -r requirements.txt
uvicorn src.main:app --reload
Frontend Development
cd frontend
npm install
npm run dev
Core Library Development
cd core
pip install -e .
pytest tests/
Configuration
Environment Variables
Configure these in a .env file (copied from .env.example):
- Application settings (DEBUG, PORT, etc.)
- Whisper model configuration
- Processing limits
- Cache settings
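The settings groups above might be read from the environment along these lines. `DEBUG` and `PORT` come from the list above; the variable names `WHISPER_MODEL`, `MAX_DURATION_SECONDS`, and `CACHE_DIR`, and all default values, are assumptions for illustration. The project's config/settings.py is described as using Pydantic, whereas this sketch uses only the standard library and assumes the variables have already been exported into the process environment.

```python
import os
from dataclasses import dataclass
from typing import Mapping


def _as_bool(value: str) -> bool:
    """Interpret common truthy strings ('1', 'true', 'yes', 'on') as True."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    debug: bool
    port: int
    whisper_model: str          # from WHISPER_MODEL (hypothetical name)
    max_duration_seconds: int   # from MAX_DURATION_SECONDS (hypothetical name)
    cache_dir: str              # from CACHE_DIR (hypothetical name)


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    # Defaults here are placeholders, not the project's real defaults.
    return Settings(
        debug=_as_bool(env.get("DEBUG", "false")),
        port=int(env.get("PORT", "8000")),
        whisper_model=env.get("WHISPER_MODEL", "base"),
        max_duration_seconds=int(env.get("MAX_DURATION_SECONDS", "3600")),
        cache_dir=env.get("CACHE_DIR", "data/cache"),
    )


settings = load_settings({"DEBUG": "true", "PORT": "9000"})
print(settings)
```

Passing the environment as a mapping, rather than reading `os.environ` directly inside the function, makes the loader easy to exercise in tests.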
Model Storage
Whisper models are downloaded to data/models/ on first use.
Output Files
Transcription results are saved to data/output/.
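Because the data/ subdirectories are git-ignored apart from their .gitkeep files, a service may want to make sure they exist before writing models or results. The helper names `ensure_data_dirs` and `output_path`, and the `<job_id>.<ext>` naming scheme, are hypothetical; only the four directory names come from the tree above.

```python
from pathlib import Path
from typing import Dict, Union

# Subdirectories of data/ as listed in the directory tree.
DATA_SUBDIRS = ("input", "output", "cache", "models")


def ensure_data_dirs(root: Union[str, Path]) -> Dict[str, Path]:
    """Create data/<name> under `root` for each known subdirectory."""
    paths: Dict[str, Path] = {}
    for name in DATA_SUBDIRS:
        path = Path(root) / "data" / name
        path.mkdir(parents=True, exist_ok=True)  # no error if it already exists
        paths[name] = path
    return paths


def output_path(root: Union[str, Path], job_id: str, ext: str) -> Path:
    """Return the path for a job's result file (naming scheme is an assumption)."""
    return Path(root) / "data" / "output" / f"{job_id}.{ext}"


if __name__ == "__main__":
    dirs = ensure_data_dirs(".")
    print(dirs["models"])
    print(output_path(".", "job123", "srt"))
```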
Testing
Backend Tests
cd backend
pytest tests/unit/
pytest tests/integration/
Frontend Tests
cd frontend
npm test
Deployment
Docker
docker-compose build
docker-compose up -d
Production
- See docs/guides/developer-guide.md for deployment instructions.
Copyright
Copyright (c) 2025 AZ1.AI INC. All Rights Reserved.
Developer: Hal Casteel, CEO/CTO
Email: 1@az1.ai
Last Updated: 2025-11-07