# Configuration
This guide covers how to configure OneLLM for different providers and use cases.
## Environment Variables
OneLLM uses environment variables for API keys and configuration.
### Provider API Keys
Set API keys for the providers you want to use:
```bash
# OpenAI
export OPENAI_API_KEY="sk-..."

# Anthropic
export ANTHROPIC_API_KEY="sk-ant-..."

# Google AI Studio
export GOOGLE_API_KEY="..."

# Mistral
export MISTRAL_API_KEY="..."

# Groq
export GROQ_API_KEY="..."

# X.AI
export XAI_API_KEY="..."

# And more...
```
### OneLLM-Specific Variables
Configure OneLLM behavior:
```bash
# Set default timeout (seconds)
export ONELLM_TIMEOUT=60

# Set default max retries
export ONELLM_MAX_RETRIES=3

# Set logging level
export ONELLM_LOG_LEVEL=INFO
```
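To confirm what the library will see at startup, you can read the same variables yourself. A minimal sketch using only the standard library; the defaults mirror the values shown above:

```python
import os

# Read OneLLM settings from the environment, falling back to the documented defaults
timeout = float(os.environ.get("ONELLM_TIMEOUT", "60"))
max_retries = int(os.environ.get("ONELLM_MAX_RETRIES", "3"))
log_level = os.environ.get("ONELLM_LOG_LEVEL", "INFO")

print(f"timeout={timeout}s retries={max_retries} log_level={log_level}")
```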
### Using .env Files
Create a `.env` file in your project:
```bash
# Provider API Keys
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GOOGLE_API_KEY=...
MISTRAL_API_KEY=...

# OneLLM Configuration
ONELLM_TIMEOUT=60
ONELLM_MAX_RETRIES=3
```
Load in Python:
```python
from dotenv import load_dotenv

load_dotenv()

from onellm import OpenAI

client = OpenAI()
```
## Programmatic Configuration
### Client Configuration
Configure the client at initialization:
```python
from onellm import OpenAI

client = OpenAI(
    api_key="sk-...",  # Override environment variable
    timeout=120,       # Custom timeout
    max_retries=5      # Custom retry count
)
```
### Provider-Specific Configuration
Some providers need special configuration:
#### Azure OpenAI
Create `azure.json`:
```json
{
  "endpoint": "https://your-name.openai.azure.com",
  "api_key": "your-azure-key",
  "api_version": "2024-02-01",
  "deployments": {
    "gpt-4": "your-gpt4-deployment",
    "gpt-35-turbo": "your-gpt35-deployment"
  }
}
```
Use in code:
```python
client = OpenAI(azure_config_path="azure.json")
```
#### AWS Bedrock
Create `bedrock.json`:
```json
{
  "region": "us-east-1",
  "aws_access_key_id": "AKIA...",
  "aws_secret_access_key": "..."
}
```
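Unlike the Azure example, this section does not show how the file is wired in; by analogy, it is presumably passed at client initialization. A sketch assuming a hypothetical `bedrock_config_path` parameter mirroring `azure_config_path` above:

```python
from onellm import OpenAI

# Hypothetical parameter, assumed by analogy with azure_config_path above
client = OpenAI(bedrock_config_path="bedrock.json")
```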
#### Vertex AI
Set the service account credentials:

```bash
export GOOGLE_APPLICATION_CREDENTIALS="path/to/service-account.json"
```
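The same variable can also be set from Python before the client is created. A minimal sketch using the standard library's `os.environ`:

```python
import os

# Must run before any client reads GOOGLE_APPLICATION_CREDENTIALS
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "path/to/service-account.json"
```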
## Model Configuration
### Model Aliases
Models are addressed with the `provider/model` format (a simple alias pattern is sketched below):

```python
from onellm import OpenAI

client = OpenAI()

# Use the provider/model format
response = client.chat.completions.create(
    model="anthropic/claude-3-5-sonnet-20241022",
    messages=[{"role": "user", "content": "Hello"}]
)
```
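OneLLM is not documented here as providing an alias registry, so a plain dictionary works as a lightweight alias layer. A minimal sketch; the short names are illustrative, not a OneLLM API:

```python
from onellm import OpenAI

# Plain-Python alias table; the keys are illustrative, not a OneLLM feature
MODEL_ALIASES = {
    "sonnet": "anthropic/claude-3-5-sonnet-20241022",
    "mini": "openai/gpt-4o-mini",
}

client = OpenAI()
response = client.chat.completions.create(
    model=MODEL_ALIASES["mini"],
    messages=[{"role": "user", "content": "Hello"}],
)
```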
### Default Parameters
Set default parameters for all requests:
```python
from functools import partial

from onellm import OpenAI

client = OpenAI()

# Create a create() method pre-loaded with defaults
create_chat = partial(
    client.chat.completions.create,
    temperature=0.7,
    max_tokens=500,
    top_p=0.9
)

# Use with defaults
response = create_chat(
    model="openai/gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello"}]
)
```
## Runtime Configuration
Configure OneLLM behavior at runtime:
```python
import onellm

# Set API keys programmatically
onellm.openai_api_key = "sk-..."     # OpenAI API key
onellm.anthropic_api_key = "sk-..."  # Anthropic API key

# Configure fallback behavior
onellm.config.fallback = {
    "enabled": True,
    "default_chains": {
        "chat": ["openai/gpt-4", "anthropic/claude-3-opus", "groq/llama3-70b"],
        "embedding": ["openai/text-embedding-3-small", "cohere/embed-english"]
    },
    "retry_delay": 1.0,
    "max_retries": 3
}
```
## Advanced Configuration
### Retry Configuration
Configure retry behavior:
```python
from onellm import OpenAI
from onellm.utils.retry import RetryConfig

client = OpenAI(
    retry_config=RetryConfig(
        max_retries=5,
        initial_backoff=1.0,
        max_backoff=60.0,
        exponential_base=2.0
    )
)
```
### Timeout Configuration
Set timeouts globally or per request:

```python
from onellm import OpenAI

# Global timeout
client = OpenAI(timeout=120)

# Per-request timeout
response = client.chat.completions.create(
    model="openai/gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello"}],
    timeout=30  # Override for this request
)
```
### Logging Configuration
Configure logging:
```python
import logging

# Set logging level globally
logging.basicConfig(level=logging.DEBUG)

# Or configure the OneLLM logger specifically
logger = logging.getLogger("onellm")
logger.setLevel(logging.INFO)
```
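To also persist logs, attach a file handler to the same logger. A minimal sketch using the standard-library `logging` module; the file name is arbitrary:

```python
import logging

logger = logging.getLogger("onellm")

# Write OneLLM logs to a file in addition to any console output
handler = logging.FileHandler("onellm.log")
handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s"))
logger.addHandler(handler)
```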
## Local Model Configuration
### Ollama Configuration
```python
from onellm import OpenAI

# Default Ollama endpoint
client = OpenAI()  # Uses http://localhost:11434

# Custom Ollama endpoint
response = client.chat.completions.create(
    model="ollama/llama3:8b@192.168.1.100:11434",
    messages=[{"role": "user", "content": "Hello"}]
)
```
### llama.cpp Configuration
```python
import os

from onellm import OpenAI

# Set model directory
os.environ["LLAMA_CPP_MODEL_DIR"] = "/path/to/models"

# Configure GPU layers
client = OpenAI(
    llama_cpp_config={
        "n_gpu_layers": 35,  # GPU acceleration
        "n_ctx": 4096,       # Context window
        "n_threads": 8       # CPU threads
    }
)
```
## Security Best Practices
### 1. Never Hardcode API Keys

❌ Bad:

```python
client = OpenAI(api_key="sk-1234567890")
```

✅ Good:

```python
client = OpenAI()  # Uses environment variable
```
### 2. Use Separate Keys for Different Environments

```bash
# Development
export OPENAI_API_KEY="sk-dev-..."

# Production
export OPENAI_API_KEY="sk-prod-..."
```
### 3. Rotate Keys Regularly
Keep track of key usage and rotate periodically.
### 4. Use Key Restrictions
Many providers allow restricting keys by:
- IP address
- Usage limits
- Specific models
## Configuration Files
### CLAUDE.md
For the Claude Code assistant, create `CLAUDE.md`:
```markdown
# CLAUDE.md

This project uses OneLLM for LLM interactions.

## Configuration

- API keys are in the .env file
- Default model: openai/gpt-4o-mini
- Timeout: 60 seconds

## Common Commands

- Run tests: pytest
- Format: black .
- Lint: ruff check .
```
### pyproject.toml
Configure OneLLM in `pyproject.toml`:
```toml
[tool.onellm]
default_provider = "openai"
default_model = "gpt-4o-mini"
timeout = 60
max_retries = 3
```
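Whether OneLLM picks this table up automatically is not covered here; if you need to read the values yourself, the standard library's `tomllib` (Python 3.11+) can load them. A minimal sketch:

```python
import tomllib

# Read the [tool.onellm] table from pyproject.toml (Python 3.11+)
with open("pyproject.toml", "rb") as f:
    settings = tomllib.load(f)["tool"]["onellm"]

print(settings["default_model"], settings["timeout"])
```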
## Troubleshooting Configuration
### Check Current Configuration
```python
from onellm.config import config

# Print current configuration
print(config)

# Check a specific provider
print(config["providers"]["openai"])
```
### Validate API Keys
```python
import os

providers = ["OPENAI", "ANTHROPIC", "GOOGLE", "MISTRAL"]

for provider in providers:
    key_name = f"{provider}_API_KEY"
    if os.environ.get(key_name):
        print(f"✅ {key_name} is set")
    else:
        print(f"❌ {key_name} is not set")
```
## Next Steps
- Provider Setup - Detailed provider configuration
- Best Practices - Configuration best practices
- Troubleshooting - Common configuration issues