CODITECT Custom/Local Model Executor
Multi-model AI execution framework - Custom/Local provider implementation. Provides a standardized interface for executing prompts against local or custom-hosted models (Ollama, LM Studio, vLLM, text-generation-webui, etc.).
Features:
- OpenAI-compatible API support (most local servers)
- Ollama native API support
- Configurable endpoint URLs
- Streaming and batch response modes
- Token estimation (local models may not provide counts)
- Automatic retry with exponential backoff
- Model capability detection
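The automatic retry behavior listed above can be sketched as a small wrapper. This is a hypothetical helper for illustration (the module's actual retry logic may differ); it assumes exponential delays with a little jitter:

```python
import random
import time

def with_retries(fn, max_attempts=4, base_delay=0.5):
    """Call fn(), retrying on any exception with exponential backoff.

    Delays grow as base_delay * 2**attempt (0.5s, 1s, 2s, ...) plus up to
    100 ms of random jitter to avoid synchronized retry storms.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the last error
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))
```

A caller would wrap the HTTP request itself, e.g. `with_retries(lambda: session.post(url, json=body))`.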
Supported Backends:
- Ollama (http://localhost:11434)
- LM Studio (http://localhost:1234)
- vLLM (OpenAI-compatible)
- text-generation-webui (OpenAI-compatible)
- LocalAI (OpenAI-compatible)
- Any OpenAI-compatible endpoint
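Because most of the backends above expose the same OpenAI-compatible `/chat/completions` route, a request can be assembled generically regardless of which server is running. The `build_chat_request` helper below is a sketch for illustration, not part of the module:

```python
def build_chat_request(base_url, model, prompt, system=None,
                       max_tokens=1024, temperature=0.7):
    """Build the URL and JSON body for an OpenAI-compatible
    /chat/completions call (LM Studio, vLLM, LocalAI, ...)."""
    messages = []
    if system:
        messages.append({"role": "system", "content": system})
    messages.append({"role": "user", "content": prompt})
    url = base_url.rstrip("/") + "/chat/completions"
    body = {
        "model": model,
        "messages": messages,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return url, body
```

The same body could then be POSTed with any HTTP client; only the `base_url` changes between backends.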
Usage:
    from core.execute_custom import CustomExecutor

    # Ollama
    executor = CustomExecutor(model="llama3.2", backend="ollama")
    response = executor.execute("Explain quantum computing")

    # OpenAI-compatible endpoint
    executor = CustomExecutor(
        model="mistral-7b",
        backend="openai",
        base_url="http://localhost:1234/v1",
    )
Environment Variables:
    OLLAMA_HOST        - Ollama server URL (default: http://localhost:11434)
    LOCAL_LLM_BASE_URL - Default base URL for OpenAI-compatible servers
    LOCAL_LLM_API_KEY  - API key if required (often "not-needed" for local)
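Resolving these environment variables with sensible fallbacks might look like the following sketch. The helper names (`resolve_endpoint`, `resolve_api_key`) and the OpenAI-compatible fallback URL are illustrative assumptions, not the module's actual internals:

```python
import os

def resolve_endpoint(backend):
    """Pick the server URL from the environment, with documented defaults."""
    if backend == "ollama":
        return os.environ.get("OLLAMA_HOST", "http://localhost:11434")
    # Fallback URL here is an assumption; adjust for your server.
    return os.environ.get("LOCAL_LLM_BASE_URL", "http://localhost:1234/v1")

def resolve_api_key():
    # Many local servers accept any key; "not-needed" is a common placeholder.
    return os.environ.get("LOCAL_LLM_API_KEY", "not-needed")
```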
Author: AZ1.AI INC.
Framework: CODITECT
Copyright: © 2025 AZ1.AI INC. All rights reserved.
File: execute_custom.py
Classes
ExecutionResult
Result from custom model execution.
CustomExecutor
Custom/Local model executor with standardized interface.
Functions
main()
CLI interface for custom executor.
to_dict()
Serialize the execution result to a dictionary.
execute(prompt, system, messages, max_tokens, temperature)
Execute prompt against custom/local model.
stream(prompt, system, max_tokens)
Stream response from custom/local model.
list_models()
List available models on the server.
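As noted under Features, local servers may not report token usage, so counts sometimes have to be estimated. A minimal sketch, assuming the common ~4-characters-per-token heuristic for English text (the module's actual estimator may differ):

```python
def estimate_tokens(text):
    """Rough token estimate for responses that omit usage counts.

    Uses the ~4 characters per token rule of thumb for English text;
    actual tokenizer counts will vary by model and language.
    """
    return max(1, len(text) // 4)
```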
Usage
python execute_custom.py