A powerful, production-ready gateway for managing multiple LLM providers with built-in failover, guardrails, caching, and monitoring
Seamlessly integrate with OpenAI, Anthropic, Gemini, Ollama, Mistral, and Cohere
OpenAI API-compatible HTTP interface and a high-performance gRPC interface (see the client sketch after this list)
Automatic failover between providers ensures high availability
Built-in caching system with configurable TTL for cost savings
Fully configurable rate-limiting policies
Monitor usage, token counts, and errors, and manage gateway configuration
Configurable content filtering and safety measures
Integration with Splunk, Datadog, and Elasticsearch
Intercept and inject system prompts for all outgoing requests
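Because the HTTP interface is OpenAI API compatible, existing OpenAI client libraries can simply be pointed at the gateway. The Python sketch below is illustrative only and not part of the gateway: it assumes the gateway is running on localhost:8080 as in the quick start that follows, and reuses the x-llm-provider header shown in the curl example.

# Illustrative sketch: call the gateway through the official openai Python SDK.
# Assumes the gateway is running on localhost:8080 (see the docker command below)
# and that an OpenAI provider block exists in Config.toml.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",           # route all calls through the gateway
    api_key="placeholder",                         # the SDK requires a value; provider keys live in Config.toml
    default_headers={"x-llm-provider": "openai"},  # pick the backing provider per request
)

response = client.chat.completions.create(
    model="gpt-4",  # matches the model configured in Config.toml
    messages=[{"role": "user", "content": "When will we have AGI? In 10 words"}],
)
print(response.choices[0].message.content)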
Create a Config.toml file with your API configuration:

[openAIConfig]
apiKey = "Your_API_Key"
model = "gpt-4"
endpoint = "https://api.openai.com"
Start the container with your configuration mounted:

docker run -p 8080:8080 -p 8081:8081 -p 8082:8082 \
  -v $(pwd)/Config.toml:/home/ballerina/Config.toml \
  chintana/ai-gateway:v1.1.0
Start making API requests to your gateway:

curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "x-llm-provider: openai" \
  -d '{"messages": [{"role": "user","content": "When will we have AGI? In 10 words"}]}'
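A request can target any of the configured providers by changing the x-llm-provider header. The following Python sketch is illustrative and assumes the matching provider block (for example, one for Anthropic) has been added to Config.toml alongside openAIConfig:

# Illustrative sketch: select a different backing provider per request.
# Assumes the corresponding provider is configured in Config.toml.
import requests

resp = requests.post(
    "http://localhost:8080/v1/chat/completions",
    headers={"x-llm-provider": "anthropic"},  # or openai, gemini, ollama, mistral, cohere
    json={"messages": [{"role": "user", "content": "When will we have AGI? In 10 words"}]},
    timeout=60,
)
resp.raise_for_status()
print(resp.json())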