Documentation
Getting Started
The AI Gateway provides a unified interface for multiple LLM providers with built-in features for production deployments.
First, create a file named Config.toml with the following contents:
# Config.toml - minimal example with just one provider
[openAIConfig]
apiKey = "sk-..."
model = "gpt-4o"
endpoint = "https://api.openai.com"
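The chat examples below select the ollama provider via the x-llm-provider header, so Config.toml needs a matching provider block. A hypothetical sketch modeled on the openAIConfig block above; the table name and key values are assumptions, so check the repo's README for the exact names:

# Hypothetical provider block; table and key names are assumptions
# modeled on openAIConfig above. The model name is illustrative;
# http://localhost:11434 is Ollama's default local endpoint.
[ollamaConfig]
apiKey = ""
model = "llama3"
endpoint = "http://localhost:11434"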
Quick Start with Docker
docker run \
  -p 8080:8080 -p 8081:8081 -p 8082:8082 \
  -v $(pwd)/Config.toml:/home/ballerina/Config.toml \
  chintana/ai-gateway:v1.1.0
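Note the port mapping: the HTTP example below talks to port 8080 and the gRPC example to port 8082; port 8081 is also exposed by the image (see the repo's README for its role).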
Usage
Using the OpenAI API-compatible HTTP interface
Send chat completion requests to the gateway using the /v1/chat/completions endpoint. Select the provider using the x-llm-provider header.
Chat Request Example
curl --location 'http://localhost:8080/v1/chat/completions' \
  --header 'x-llm-provider: ollama' \
  --header 'Content-Type: application/json' \
  --data '{
    "messages": [{
      "role": "user",
      "content": "When will we have AGI? In 10 words"
    }]
  }'
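The same request can be sent from any HTTP client. A minimal Python sketch using the requests library, assuming the gateway is running locally as started above:

import requests

# Chat completion request via the gateway's OpenAI-compatible HTTP interface
response = requests.post(
    "http://localhost:8080/v1/chat/completions",
    headers={"x-llm-provider": "ollama"},  # provider selection header
    json={
        "messages": [
            {"role": "user", "content": "When will we have AGI? In 10 words"}
        ]
    },
)
response.raise_for_status()

# Since the interface is OpenAI API compatible, the response is assumed to
# follow the standard chat completion shape
print(response.json()["choices"][0]["message"]["content"])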
Using the gRPC interface
See the GitHub repo for the full example under the grpc-client folder.
gRPC Request Example
import grpc

# Stubs generated from the gateway's proto definition (see the grpc-client folder)
import ai_gateway_pb2
import ai_gateway_pb2_grpc

def run():
    # Create a gRPC channel to the gateway's gRPC port
    channel = grpc.insecure_channel('localhost:8082')
    # Create a stub (client)
    stub = ai_gateway_pb2_grpc.AIGatewayStub(channel)
    # Create a request, selecting the provider via llm_provider
    request = ai_gateway_pb2.ChatCompletionRequest(
        llm_provider="ollama",
        messages=[
            ai_gateway_pb2.Message(
                role="system",
                content="You are a helpful assistant."
            ),
            ai_gateway_pb2.Message(
                role="user",
                content="What is the capital of France?"
            )
        ]
    )
    try:
        # Make the call
        response = stub.ChatCompletion(request)
        # ...
    except grpc.RpcError as e:
        print(f"RPC failed: {e.code()}: {e.details()}")

if __name__ == '__main__':
    run()
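The ai_gateway_pb2 and ai_gateway_pb2_grpc modules are generated from the gateway's proto definition with grpcio-tools. A sketch of the generation step, assuming the proto file in the grpc-client folder is named ai_gateway.proto (the file name is an assumption):

# Install the gRPC runtime and code generator
python -m pip install grpcio grpcio-tools

# Generate ai_gateway_pb2.py and ai_gateway_pb2_grpc.py
# (proto file name assumed; use the actual file from the grpc-client folder)
python -m grpc_tools.protoc -I. --python_out=. --grpc_python_out=. ai_gateway.proto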
More info on GitHub
Please refer to the README file in the GitHub repo for more information on how to use the AI Gateway.