CODITECT Audio2Text - Project Structure

Overview

This document describes the directory layout of the CODITECT Audio2Text project, the role of each key file, and the commands used for development, testing, and deployment.


Directory Tree

coditect-audio2text/

├── backend/ # Backend API Service (FastAPI/Python)
│   ├── src/
│   │   ├── __init__.py
│   │   ├── main.py # Application entry point
│   │   ├── api/ # API layer
│   │   │   ├── __init__.py
│   │   │   └── routes/ # API route definitions
│   │   │       ├── __init__.py
│   │   │       ├── transcription.py # Transcription endpoints
│   │   │       ├── batch.py # Batch processing endpoints
│   │   │       └── health.py # Health check endpoints
│   │   ├── config/ # Configuration management
│   │   │   ├── __init__.py
│   │   │   ├── settings.py # Application settings
│   │   │   └── logging_config.py # Logging configuration
│   │   ├── models/ # Data models and schemas
│   │   │   ├── __init__.py
│   │   │   └── schemas.py # Pydantic models
│   │   ├── services/ # Business logic services
│   │   │   ├── __init__.py
│   │   │   ├── download_service.py # YouTube download logic
│   │   │   ├── transcription_service.py # Whisper transcription
│   │   │   ├── processing_service.py # Audio processing
│   │   │   └── job_manager.py # Job queue management
│   │   └── utils/ # Utility functions
│   │       ├── __init__.py
│   │       ├── validators.py # Input validation
│   │       ├── file_utils.py # File operations
│   │       └── exceptions.py # Custom exceptions
│   ├── tests/ # Backend tests
│   │   ├── unit/ # Unit tests
│   │   └── integration/ # Integration tests
│   └── requirements.txt # Python dependencies

├── frontend/ # Frontend Web Application (React)
│   ├── src/
│   │   ├── main.tsx # Application entry
│   │   ├── app.tsx # Main App component
│   │   ├── index.css # Global styles
│   │   ├── components/ # Reusable UI components
│   │   │   └── navigation.tsx # Navigation component
│   │   ├── pages/ # Page components
│   │   │   ├── home.tsx # Home page
│   │   │   ├── jobs.tsx # Jobs listing page
│   │   │   └── about.tsx # About page
│   │   ├── services/ # API client services
│   │   │   ├── api.ts # API client
│   │   │   └── websocket.ts # WebSocket client
│   │   ├── styles/ # Styling and themes
│   │   │   └── theme.ts # MUI theme configuration
│   │   └── utils/ # Frontend utilities
│   │       └── helpers.ts # Helper functions
│   ├── public/ # Static assets
│   ├── tests/ # Frontend tests
│   ├── package.json # Node dependencies
│   ├── vite.config.ts # Vite configuration
│   ├── tsconfig.json # TypeScript configuration
│   └── index.html # HTML template

├── core/ # Core Processing Library
│   ├── src/
│   │   ├── __init__.py
│   │   ├── download/ # Download functionality
│   │   │   ├── __init__.py
│   │   │   └── ytdlp_wrapper.py # yt-dlp wrapper
│   │   ├── transcription/ # Transcription functionality
│   │   │   ├── __init__.py
│   │   │   ├── whisper_wrapper.py # Whisper wrapper
│   │   │   └── model_manager.py # Model management
│   │   ├── processing/ # Audio processing
│   │   │   ├── __init__.py
│   │   │   ├── audio_converter.py # Audio format conversion
│   │   │   └── format_handlers.py # Output format handlers
│   │   └── shared/ # Shared utilities
│   │       ├── __init__.py
│   │       ├── types.py # Type definitions
│   │       └── utils.py # Common utilities
│   ├── tests/ # Core library tests
│   ├── setup.py # Package setup
│   └── README.md # Core library documentation

├── docs/ # Documentation
│   ├── specs/ # Technical specifications
│   │   ├── sdd.md # System Design Document
│   │   └── tdd.md # Technical Design Document
│   ├── architecture/ # Architecture documentation
│   │   └── adr-001-model-selection.md # Architecture Decision Records
│   ├── api/ # API documentation
│   │   └── README.md # API guide (to be created)
│   └── guides/ # User guides
│       ├── user-guide.md # End-user guide
│       └── developer-guide.md # Developer guide

├── config/ # Shared configuration files
│   └── (configuration files)

├── data/ # Data directories (git-ignored)
│   ├── input/ # Input files
│   │   └── .gitkeep
│   ├── output/ # Transcription outputs
│   │   └── .gitkeep
│   ├── cache/ # Temporary cache
│   │   └── .gitkeep
│   └── models/ # Whisper model storage
│       └── .gitkeep

├── scripts/ # Utility scripts
│   └── (CLI scripts, automation)

├── .github/ # GitHub configuration
│   └── workflows/ # GitHub Actions workflows

├── .env.example # Environment variables template
├── .gitignore # Git ignore rules
├── docker-compose.yml # Docker Compose configuration
├── Dockerfile # Docker image definition
├── LICENSE # Proprietary license
├── Makefile # Build automation
├── project-structure.md # This file
└── README.md # Project overview


Key Files Description

Root Level

  • README.md: Project overview, features, and quick start guide
  • LICENSE: Proprietary software license (AZ1.AI INC.)
  • Makefile: Build automation and development commands
  • Dockerfile: Container image for backend service
  • docker-compose.yml: Multi-container orchestration
  • .env.example: Environment variables template
  • .gitignore: Files and directories to exclude from git

Backend (/backend)

Main Application:

  • src/main.py: FastAPI application initialization and routing

API Layer:

  • api/routes/transcription.py: Transcription job endpoints
  • api/routes/batch.py: Batch processing endpoints
  • api/routes/health.py: Health check and monitoring

Configuration:

  • config/settings.py: Application settings using Pydantic
  • config/logging_config.py: Logging setup and configuration

Data Models:

  • models/schemas.py: Request/response models and validation

Services:

  • services/download_service.py: YouTube audio download
  • services/transcription_service.py: Whisper integration
  • services/processing_service.py: Audio processing pipeline
  • services/job_manager.py: Job queue and status tracking
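
To make "job queue and status tracking" more concrete, here is a minimal in-memory sketch using only the standard library. It is not the project's actual job_manager.py: the names InMemoryJobManager, Job, and JobStatus, as well as the set of status values, are assumptions made for illustration.

```python
import uuid
from dataclasses import dataclass, field
from enum import Enum
from threading import Lock
from typing import Dict, Optional


class JobStatus(str, Enum):
    # Hypothetical status values; the real service may use different ones.
    PENDING = "pending"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"


@dataclass
class Job:
    source_url: str
    job_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    status: JobStatus = JobStatus.PENDING
    error: Optional[str] = None


class InMemoryJobManager:
    """Hypothetical stand-in for services/job_manager.py."""

    def __init__(self) -> None:
        self._jobs: Dict[str, Job] = {}
        self._lock = Lock()  # guards the dict when requests run in threads

    def create(self, source_url: str) -> Job:
        job = Job(source_url=source_url)
        with self._lock:
            self._jobs[job.job_id] = job
        return job

    def update(self, job_id: str, status: JobStatus,
               error: Optional[str] = None) -> Job:
        with self._lock:
            job = self._jobs[job_id]  # raises KeyError for unknown IDs
            job.status = status
            job.error = error
            return job

    def get(self, job_id: str) -> Optional[Job]:
        with self._lock:
            return self._jobs.get(job_id)
```

A route handler could call create() when a transcription request arrives and return the job_id, while a background worker calls update() as the job moves through its stages. State kept this way is lost on restart, so a persistent store would be needed if jobs must survive a redeploy.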

Dependencies:

  • requirements.txt: Python package dependencies

Frontend (/frontend)

Application:

  • src/main.tsx: React application entry point
  • src/app.tsx: Main application component with routing

Pages:

  • pages/home.tsx: Transcription submission interface
  • pages/jobs.tsx: Job monitoring and management
  • pages/about.tsx: Application information

Components:

  • components/navigation.tsx: Main navigation bar

Services:

  • services/api.ts: REST API client
  • services/websocket.ts: WebSocket for real-time updates

Configuration:

  • vite.config.ts: Vite build configuration
  • tsconfig.json: TypeScript compiler options
  • package.json: npm dependencies and scripts

Core Library (/core)

Download Module:

  • download/ytdlp_wrapper.py: yt-dlp integration
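
As a rough illustration of what a yt-dlp wrapper might configure, the sketch below builds an options dictionary for audio-only downloads. The function name build_ytdlp_options and the choice of WAV as the target codec are assumptions; the dictionary keys are standard yt-dlp options.

```python
from pathlib import Path
from typing import Any, Dict


def build_ytdlp_options(output_dir: str, codec: str = "wav") -> Dict[str, Any]:
    """Return yt-dlp options for an audio-only download (hypothetical helper)."""
    return {
        # Prefer the best audio-only stream, fall back to the best combined one.
        "format": "bestaudio/best",
        # Name files after the video ID inside output_dir.
        "outtmpl": str(Path(output_dir) / "%(id)s.%(ext)s"),
        # Download a single video even if the URL points into a playlist.
        "noplaylist": True,
        "quiet": True,
        # Extract the audio track with FFmpeg after the download finishes.
        "postprocessors": [
            {"key": "FFmpegExtractAudio", "preferredcodec": codec},
        ],
    }
```

With yt-dlp installed, the result can be passed to `yt_dlp.YoutubeDL(options)` and used as a context manager whose `download([url])` method performs the download. Keeping the options in a plain function makes them easy to unit-test without network access.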

Transcription Module:

  • transcription/whisper_wrapper.py: OpenAI Whisper wrapper
  • transcription/model_manager.py: Model caching and loading
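
Model caching can be sketched as a small class that loads each model once and reuses it afterwards. This is an illustration under assumptions, not the contents of model_manager.py: the class shape and the injected loader parameter are hypothetical, chosen so the cache can be exercised without loading a real model.

```python
from typing import Any, Callable, Dict


class ModelManager:
    """Load each named model once and reuse it (hypothetical sketch)."""

    def __init__(self, loader: Callable[[str], Any]) -> None:
        self._loader = loader             # function that actually loads a model
        self._cache: Dict[str, Any] = {}  # model name -> loaded model

    def get(self, name: str) -> Any:
        # Only call the (slow) loader on the first request for a given name.
        if name not in self._cache:
            self._cache[name] = self._loader(name)
        return self._cache[name]

    def unload(self, name: str) -> None:
        # Drop a model from the cache so its memory can be reclaimed.
        self._cache.pop(name, None)
```

With the openai-whisper package installed, the loader could be `lambda name: whisper.load_model(name, download_root="data/models")`, which matches the model storage location described in this document.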

Processing Module:

  • processing/audio_converter.py: FFmpeg integration
  • processing/format_handlers.py: Output format generation
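
The FFmpeg integration can be pictured as building a command line and handing it to a subprocess. The sketch below only builds the argument list. The function name and the target format (16 kHz mono 16-bit PCM WAV, the input format Whisper resamples to internally) are assumptions about what "normalized audio" means here.

```python
from typing import List


def build_ffmpeg_command(src: str, dst: str, sample_rate: int = 16000) -> List[str]:
    """Build an FFmpeg command that converts src to mono PCM WAV (hypothetical helper)."""
    return [
        "ffmpeg",
        "-y",                     # overwrite dst if it already exists
        "-i", src,                # input file
        "-vn",                    # ignore any video stream
        "-ac", "1",               # downmix to a single channel
        "-ar", str(sample_rate),  # resample to the target rate
        "-c:a", "pcm_s16le",      # 16-bit little-endian PCM
        dst,
    ]
```

Running the command is then a matter of `subprocess.run(cmd, check=True)`, which raises CalledProcessError if FFmpeg exits with a non-zero status. Passing a list rather than a shell string avoids quoting problems with unusual file names.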

Shared:

  • shared/types.py: Common type definitions
  • shared/utils.py: Utility functions

Documentation (/docs)

Specifications:

  • specs/sdd.md: System architecture and design
  • specs/tdd.md: Technical implementation details

Architecture:

  • architecture/adr-001-model-selection.md: Model selection rationale

Guides:

  • guides/user-guide.md: End-user documentation
  • guides/developer-guide.md: Development setup and practices

Data Flow

  1. User Input → Frontend (React)
  2. API Request → Backend (FastAPI)
  3. Job Creation → Job Manager
  4. Download → Download Service → yt-dlp → Audio File
  5. Process → Processing Service → FFmpeg → Normalized Audio
  6. Transcribe → Transcription Service → Whisper → Text Output
  7. Format → Format Handlers → Multiple Output Files
  8. Response → Frontend → User Download
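
Steps 4 through 7 above can be sketched as a chain of function calls. The stage signatures below are assumptions made to keep the example self-contained; the real services are likely asynchronous and report progress through the job manager.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class PipelineStages:
    """Pluggable stages; every signature here is a hypothetical simplification."""
    download: Callable[[str], str]                   # URL -> downloaded audio path
    normalize: Callable[[str], str]                  # audio path -> normalized audio path
    transcribe: Callable[[str], str]                 # normalized audio path -> transcript text
    format_outputs: Callable[[str], Dict[str, str]]  # text -> {format name: content}


def run_pipeline(url: str, stages: PipelineStages) -> Dict[str, str]:
    audio_path = stages.download(url)               # step 4: Download
    normalized_path = stages.normalize(audio_path)  # step 5: Process
    text = stages.transcribe(normalized_path)       # step 6: Transcribe
    return stages.format_outputs(text)              # step 7: Format
```

Passing the stages in as callables keeps the orchestration independent of yt-dlp, FFmpeg, and Whisper, so the flow can be tested with stubs before the heavier dependencies are installed.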

Development Workflow

Backend Development

cd backend
pip install -r requirements.txt
uvicorn src.main:app --reload

Frontend Development

cd frontend
npm install
npm run dev

Core Library Development

cd core
pip install -e .
pytest tests/

Configuration

Environment Variables

Configure these in a .env file (copy it from .env.example):

  • Application settings (DEBUG, PORT, etc.)
  • Whisper model configuration
  • Processing limits
  • Cache settings
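
The settings groups above could be read from the environment roughly as follows. The project itself uses Pydantic for this (see config/settings.py); this sketch sticks to the standard library so it runs without extra dependencies. Only DEBUG and PORT are named in this document. WHISPER_MODEL, MAX_DURATION_SECONDS, CACHE_DIR, and all default values are hypothetical.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


def _as_bool(value: str) -> bool:
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    debug: bool = False
    port: int = 8000
    whisper_model: str = "base"       # hypothetical variable and default
    max_duration_seconds: int = 3600  # hypothetical processing limit
    cache_dir: str = "data/cache"     # hypothetical cache setting

    @classmethod
    def from_env(cls, env: Optional[Mapping[str, str]] = None) -> "Settings":
        # Accept an explicit mapping so tests do not depend on os.environ.
        env = os.environ if env is None else env
        return cls(
            debug=_as_bool(env.get("DEBUG", "false")),
            port=int(env.get("PORT", "8000")),
            whisper_model=env.get("WHISPER_MODEL", "base"),
            max_duration_seconds=int(env.get("MAX_DURATION_SECONDS", "3600")),
            cache_dir=env.get("CACHE_DIR", "data/cache"),
        )
```

Calling `Settings.from_env()` with no argument reads the process environment. Note that this sketch does not parse the .env file itself; the file has to be loaded into the environment first, for example by Docker Compose or a dotenv loader.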

Model Storage

Whisper models are downloaded to data/models/ on first use.

Output Files

Transcription results are saved to data/output/.
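
As an illustration of how results might land in data/output/, the sketch below writes a plain-text transcript and an SRT subtitle file for one job. The choice of these two formats, the file naming by job ID, and the (start, end, text) segment shape are assumptions; the actual format handlers may differ.

```python
from pathlib import Path
from typing import Dict, List, Tuple

Segment = Tuple[float, float, str]  # (start seconds, end seconds, text)


def srt_timestamp(seconds: float) -> str:
    """Format seconds as the HH:MM:SS,mmm timestamp used by SRT."""
    millis = int(round(seconds * 1000))
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments: List[Segment]) -> str:
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text.strip()}\n"
        )
    return "\n".join(blocks)  # blank line between consecutive cues


def write_outputs(job_id: str, segments: List[Segment],
                  output_dir: str = "data/output") -> Dict[str, Path]:
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = {"txt": out / f"{job_id}.txt", "srt": out / f"{job_id}.srt"}
    plain_text = " ".join(text.strip() for _, _, text in segments)
    paths["txt"].write_text(plain_text + "\n", encoding="utf-8")
    paths["srt"].write_text(to_srt(segments), encoding="utf-8")
    return paths
```

For example, `write_outputs("job123", [(0.0, 1.5, "Hello"), (1.5, 3.0, "world")])` would create data/output/job123.txt and data/output/job123.srt.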


Testing

Backend Tests

cd backend
pytest tests/unit/
pytest tests/integration/

Frontend Tests

cd frontend
npm test

Deployment

Docker

docker-compose build
docker-compose up -d

Production

  • See docs/guides/developer-guide.md for deployment instructions

Copyright (c) 2025 AZ1.AI INC. All Rights Reserved.

Developer: Hal Casteel, CEO/CTO
Email: 1@az1.ai


Last Updated: 2025-11-07