CODITECT Custom/Local Model Executor
Multi-model AI execution framework - Custom/Local provider implementation. Provides a standardized interface for executing prompts against local or custom-hosted models (Ollama, LM Studio, vLLM, text-generation-webui, etc.).
Features:
- OpenAI-compatible API support (most local servers)
- Ollama native API support
- Configurable endpoint URLs
- Streaming and batch response modes
- Token estimation (local models may not provide counts)
- Automatic retry with exponential backoff
- Model capability detection
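The automatic retry behavior listed above can be sketched as a small wrapper. This is a hypothetical helper for illustration (the module's actual retry logic may differ); it assumes exponential delays with a little jitter:

```python
import random
import time

def with_retries(fn, max_attempts=4, base_delay=0.5):
    """Call fn(), retrying on any exception with exponential backoff.

    Delays grow as base_delay * 2**attempt (0.5s, 1s, 2s, ...) plus up to
    100 ms of random jitter to avoid synchronized retry storms.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the last error
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))
```

A caller would wrap the HTTP request itself, e.g. `with_retries(lambda: session.post(url, json=body))`.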
Supported Backends:
- Ollama (http://localhost:11434)
- LM Studio (http://localhost:1234)
- vLLM (OpenAI-compatible)
- text-generation-webui (OpenAI-compatible)
- LocalAI (OpenAI-compatible)
- Any OpenAI-compatible endpoint
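Because most of the backends above expose the same OpenAI-compatible `/chat/completions` route, a request can be assembled generically regardless of which server is running. The `build_chat_request` helper below is a sketch for illustration, not part of the module:

```python
def build_chat_request(base_url, model, prompt, system=None,
                       max_tokens=1024, temperature=0.7):
    """Build the URL and JSON body for an OpenAI-compatible
    /chat/completions call (LM Studio, vLLM, LocalAI, ...)."""
    messages = []
    if system:
        messages.append({"role": "system", "content": system})
    messages.append({"role": "user", "content": prompt})
    url = base_url.rstrip("/") + "/chat/completions"
    body = {
        "model": model,
        "messages": messages,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return url, body
```

The same body could then be POSTed with any HTTP client; only the `base_url` changes between backends.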
Usage:
    from core.execute_custom import CustomExecutor

    # Ollama
    executor = CustomExecutor(model="llama3.2", backend="ollama")
    response = executor.execute("Explain quantum computing")

    # OpenAI-compatible endpoint
    executor = CustomExecutor(
        model="mistral-7b",
        backend="openai",
        base_url="http://localhost:1234/v1",
    )
Environment Variables:
    OLLAMA_HOST        - Ollama server URL (default: http://localhost:11434)
    LOCAL_LLM_BASE_URL - Default base URL for OpenAI-compatible servers
    LOCAL_LLM_API_KEY  - API key if required (often "not-needed" for local)
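Resolving these environment variables with sensible fallbacks might look like the following sketch. The helper names (`resolve_endpoint`, `resolve_api_key`) and the OpenAI-compatible fallback URL are illustrative assumptions, not the module's actual internals:

```python
import os

def resolve_endpoint(backend):
    """Pick the server URL from the environment, with documented defaults."""
    if backend == "ollama":
        return os.environ.get("OLLAMA_HOST", "http://localhost:11434")
    # Fallback URL here is an assumption; adjust for your server.
    return os.environ.get("LOCAL_LLM_BASE_URL", "http://localhost:1234/v1")

def resolve_api_key():
    # Many local servers accept any key; "not-needed" is a common placeholder.
    return os.environ.get("LOCAL_LLM_API_KEY", "not-needed")
```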
Author: AZ1.AI INC.
Framework: CODITECT
Copyright: © 2025 AZ1.AI INC. All rights reserved.
File: execute_custom.py
Classes
ExecutionResult
Result from custom model execution.
CustomExecutor
Custom/Local model executor with standardized interface.
Functions
main()
CLI interface for custom executor.
to_dict()
Serialize the execution result to a dictionary.
execute(prompt, system, messages, max_tokens, temperature)
Execute prompt against custom/local model.
stream(prompt, system, max_tokens)
Stream response from custom/local model.
list_models()
List available models on the server.
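As noted under Features, local servers may not report token usage, so counts sometimes have to be estimated. A minimal sketch, assuming the common ~4-characters-per-token heuristic for English text (the module's actual estimator may differ):

```python
def estimate_tokens(text):
    """Rough token estimate for responses that omit usage counts.

    Uses the ~4 characters per token rule of thumb for English text;
    actual tokenizer counts will vary by model and language.
    """
    return max(1, len(text) // 4)
```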
Usage
python execute_custom.py