
CODITECT Audio2Text - Core Library

Core library for CODITECT Audio2Text. It downloads audio, preprocesses it, and transcribes it to text with Whisper.

Modules

download

Handles downloading audio from YouTube and other sources using yt-dlp.
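
As a rough illustration of the kind of call this module wraps, the sketch below builds a yt-dlp command line that keeps only the audio track. The helper name `build_ytdlp_command`, the WAV default, and the output template are assumptions made for this example; the module itself may be implemented differently, for instance through yt-dlp's Python API.

```python
from pathlib import Path
from typing import List


def build_ytdlp_command(url: str, out_dir: str, audio_format: str = "wav") -> List[str]:
    """Return a yt-dlp argument list that downloads a URL as audio only.

    Hypothetical helper: the name and defaults are illustrative and are
    not taken from the coditect package.
    """
    # %(id)s and %(ext)s are yt-dlp output-template fields.
    output_template = str(Path(out_dir) / "%(id)s.%(ext)s")
    return [
        "yt-dlp",
        "--extract-audio",               # keep only the audio track (needs ffmpeg)
        "--audio-format", audio_format,  # convert the extracted audio
        "--output", output_template,
        url,
    ]
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`. Building the arguments as a list rather than a shell string avoids quoting problems with URLs.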

transcription

Manages Whisper model loading and audio-to-text transcription.
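
Loading a Whisper model is slow compared with running it on a short clip, so a model manager typically loads each model once and reuses it. The sketch below shows one way to do that. `ModelCache` and `load_whisper` are hypothetical names for this example, and `load_whisper` assumes the `openai-whisper` package is installed; the module's real internals are not shown in this README.

```python
from typing import Any, Callable, Dict


class ModelCache:
    """Load each model at most once per name (hypothetical helper)."""

    def __init__(self, loader: Callable[[str], Any]) -> None:
        self._loader = loader
        self._models: Dict[str, Any] = {}

    def get(self, name: str) -> Any:
        # Call the loader only on the first request for a given name.
        if name not in self._models:
            self._models[name] = self._loader(name)
        return self._models[name]


def load_whisper(name: str) -> Any:
    """Loader backed by openai-whisper (assumed to be installed)."""
    import whisper  # imported lazily so the cache itself has no hard dependency

    return whisper.load_model(name)
```

With this in place, `ModelCache(load_whisper).get("base")` would load the `base` model on the first call and return the same object afterwards.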

processing

Audio processing utilities including format conversion and preprocessing.
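
Whisper models operate on 16 kHz mono audio, so a common preprocessing step is to resample and downmix the input. The sketch below builds an ffmpeg command for that conversion. The helper name `build_ffmpeg_command` is an assumption for illustration, and this README does not say whether the module actually uses ffmpeg.

```python
from typing import List


def build_ffmpeg_command(src: str, dst: str, sample_rate: int = 16000) -> List[str]:
    """Return an ffmpeg argument list that converts src to mono audio.

    Hypothetical helper: illustrative only, not part of the coditect package.
    """
    return [
        "ffmpeg",
        "-y",                     # overwrite dst if it already exists
        "-i", src,                # input file
        "-ac", "1",               # downmix to one channel (mono)
        "-ar", str(sample_rate),  # resample to the target rate in Hz
        dst,                      # output format is inferred from the extension
    ]
```

For example, `build_ffmpeg_command("talk.m4a", "talk.wav")` yields arguments that can be run with `subprocess.run(cmd, check=True)`.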

shared

Shared utilities, types, and helpers used across modules.

Installation

Install in editable (development) mode from the library's root directory:

pip install -e .

Usage

from coditect.download import YouTubeDownloader
from coditect.transcription import WhisperTranscriber

# Download audio
downloader = YouTubeDownloader()
audio_file = downloader.download("https://youtube.com/watch?v=...")

# Transcribe
transcriber = WhisperTranscriber(model="base")
result = transcriber.transcribe(audio_file)
print(result['text'])
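
To keep the transcript rather than only printing it, the text can be written next to the audio file. This is a minimal sketch: `save_transcript` is a hypothetical helper, and it assumes that `audio_file` is a filesystem path and that `result` is a mapping with a `'text'` key, as in the example above.

```python
from pathlib import Path
from typing import Mapping


def save_transcript(audio_file: str, result: Mapping[str, str]) -> Path:
    """Write result['text'] to a .txt file beside the audio file.

    Hypothetical helper: not part of the coditect package.
    """
    # clip.wav -> clip.txt in the same directory
    out_path = Path(audio_file).with_suffix(".txt")
    out_path.write_text(result["text"].strip() + "\n", encoding="utf-8")
    return out_path
```

Calling `save_transcript(audio_file, result)` after the transcription step returns the path of the file it wrote.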

Development

Run tests:

pytest tests/