Documentation

Getting Started

The AI Gateway provides a unified interface to multiple LLM providers, with built-in features for production deployments. To get started, create a file named Config.toml with the following contents:

Config.toml - minimal example with just one provider
[openAIConfig]
apiKey = "sk-..."
model = "gpt-4o"
endpoint = "https://api.openai.com" 
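
To route to more than one provider, add a configuration table for each. The table and key names below for Ollama are assumptions modeled on the [openAIConfig] table above; verify the exact names against the README.

Config.toml - adding a second provider (hypothetical key names)
# Assumed shape; a local Ollama instance typically needs no API key
[ollamaConfig]
endpoint = "http://localhost:11434"
model = "llama3"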
                    
Quick Start with Docker
docker run \
  -p 8080:8080 -p 8081:8081 -p 8082:8082 \
  -v $(pwd)/Config.toml:/home/ballerina/Config.toml \
  chintana/ai-gateway:v1.1.0
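
If you prefer Docker Compose, the same container can be declared in a compose file. This is a direct translation of the docker run command above; the image tag, ports, and volume mount are taken from it, and nothing else is assumed.

docker-compose.yml
services:
  ai-gateway:
    image: chintana/ai-gateway:v1.1.0
    ports:
      - "8080:8080"
      - "8081:8081"
      - "8082:8082"
    volumes:
      - ./Config.toml:/home/ballerina/Config.toml

Then start it with: docker compose up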

Usage

Using the OpenAI API-compatible HTTP interface

Send chat completion requests to the gateway using the /v1/chat/completions endpoint. Select the provider using the x-llm-provider header.

Chat Request Example
curl --location 'http://localhost:8080/v1/chat/completions' \
--header 'x-llm-provider: ollama' \
--header 'Content-Type: application/json' \
--data '{
    "messages": [{
        "role": "user",
        "content": "When will we have AGI? In 10 words"
    }]
}'
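
The same request can be sent from Python with the requests library. This is a minimal sketch using only the endpoint, header, and payload shown in the curl example above.

Chat Request Example (Python)
import requests

# Same endpoint, provider header, and payload as the curl example above;
# requests sets Content-Type: application/json automatically when json= is used
response = requests.post(
    "http://localhost:8080/v1/chat/completions",
    headers={"x-llm-provider": "ollama"},
    json={
        "messages": [
            {"role": "user", "content": "When will we have AGI? In 10 words"}
        ]
    },
)
print(response.json())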

Using the gRPC interface

See the GitHub repo for the full example under the grpc-client folder.

gRPC Request Example
import grpc

# Generated from the gateway's protobuf definition (see the grpc-client folder)
import ai_gateway_pb2
import ai_gateway_pb2_grpc

def run():
    # Create a gRPC channel to the gateway's gRPC port (8082)
    channel = grpc.insecure_channel('localhost:8082')

    # Create a stub (client)
    stub = ai_gateway_pb2_grpc.AIGatewayStub(channel)

    # Create a request
    request = ai_gateway_pb2.ChatCompletionRequest(
        llm_provider="ollama",
        messages=[
            ai_gateway_pb2.Message(
                role="system",
                content="You are a helpful assistant."
            ),
            ai_gateway_pb2.Message(
                role="user",
                content="What is the capital of France?"
            )
        ]
    )

    try:
        # Make the call
        response = stub.ChatCompletion(request)
        print(response)
    except grpc.RpcError as e:
        print(f"RPC failed: {e.code()}: {e.details()}")

# ...
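
The ai_gateway_pb2 and ai_gateway_pb2_grpc modules used above are generated from the gateway's protobuf definition. Assuming the service definition file is named ai_gateway.proto (inferred from the module names; the actual file is in the grpc-client folder), the stubs can be generated with grpcio-tools:

pip install grpcio grpcio-tools
python -m grpc_tools.protoc -I. --python_out=. --grpc_python_out=. ai_gateway.proto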

More info on GitHub

Please refer to the README file in the GitHub repo for more information on how to use the AI Gateway.