Getting Started

Get up and running with Universal AI API in minutes. One API key, 1,500+ models, every modality.

What is Universal AI API?

Universal AI API is an OpenAI-compatible gateway that gives you unified access to over 1,500 AI models across every modality — text, image, audio, video, embeddings, translation, and more — through a single API endpoint.

Instead of managing separate API keys, SDKs, and request formats for OpenAI, Anthropic, Google, Mistral, and dozens of other providers, you use one API key and one consistent interface. Universal AI API handles provider routing, caching, and failover automatically.

Key benefits:

  • OpenAI-compatible — use any existing OpenAI SDK by changing the base URL
  • 1,500+ models — access models from OpenAI, Anthropic, Google, Mistral, Meta, xAI, Groq, and more
  • Every modality — text generation, image generation, audio transcription, text-to-speech, embeddings, and beyond
  • Edge-native — runs on Cloudflare's global network for low-latency responses worldwide
  • Smart routing — automatic model selection based on cost, speed, and quality when you don't specify a model
  • Semantic caching — reduce costs and latency with three-tier caching (exact match, semantic similarity, prefix)
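The three cache tiers can be pictured as a lookup cascade: try an exact match first, then a semantic match, then a prefix match. The sketch below is purely illustrative of that ordering, not the gateway's implementation — in particular, real semantic caching compares embedding vectors, which this stub approximates with token overlap.

```python
import re

class ThreeTierCache:
    """Illustrative sketch of a three-tier cache lookup. Not the gateway's
    actual implementation; the similarity function is a stand-in for a real
    embedding comparison."""

    def __init__(self, similarity_threshold=0.8):
        self.store = {}  # prompt -> cached response
        self.threshold = similarity_threshold

    def _tokens(self, text):
        return set(re.findall(r"\w+", text.lower()))

    def _similarity(self, a, b):
        # Stand-in for cosine similarity over embeddings (Jaccard overlap).
        ta, tb = self._tokens(a), self._tokens(b)
        return len(ta & tb) / max(len(ta | tb), 1)

    def get(self, prompt):
        # Tier 1: exact match.
        if prompt in self.store:
            return self.store[prompt]
        # Tier 2: semantic similarity.
        for cached, response in self.store.items():
            if self._similarity(prompt, cached) >= self.threshold:
                return response
        # Tier 3: prefix match.
        for cached, response in self.store.items():
            if prompt.startswith(cached) or cached.startswith(prompt):
                return response
        return None  # cache miss: the request goes to a provider

    def put(self, prompt, response):
        self.store[prompt] = response


cache = ThreeTierCache()
cache.put("What is the capital of France?", "Paris")
print(cache.get("What is the capital of France?"))  # -> Paris (exact)
print(cache.get("what is the capital of france"))   # -> Paris (semantic)
```

A miss at all three tiers is what triggers a normal provider call; hits at any tier skip the provider entirely, which is where the cost and latency savings come from.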

Quick Start

1. Get an API key

Create an API key by sending a POST request to the admin endpoint:

curl -X POST https://api.universal-ai.dev/v1/admin/keys \
  -H "Content-Type: application/json" \
  -d '{"name": "my-first-key"}'

The response includes your API key. Save it somewhere safe — it won't be shown again.
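The same call can be made from Python with only the standard library. This is a sketch mirroring the curl command above; the exact shape of the response body (and the field that carries the new key) isn't shown here, so inspect what comes back before relying on a field name.

```python
import json
import urllib.request

# Build the same key-creation request as the curl command above.
req = urllib.request.Request(
    "https://api.universal-ai.dev/v1/admin/keys",
    data=json.dumps({"name": "my-first-key"}).encode(),
    headers={"Content-Type": "application/json"},
    method="POST",
)

# Uncomment to actually send it:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))  # save the returned key; it won't be shown again
```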

2. Make your first request

Send a chat completion request using your API key:

curl https://api.universal-ai.dev/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'
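Because the gateway is OpenAI-compatible, the response follows the standard OpenAI chat-completion shape, with the reply text at `choices[0].message.content`. A quick sketch of pulling it out of the JSON (the payload below is an abbreviated sample, not a real response):

```python
import json

# Abbreviated sample of an OpenAI-style chat completion response body.
raw = """
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "model": "gpt-4o",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "The capital of France is Paris."},
      "finish_reason": "stop"
    }
  ]
}
"""

body = json.loads(raw)
reply = body["choices"][0]["message"]["content"]
print(reply)  # -> The capital of France is Paris.
```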

3. Use with existing SDKs

Universal AI API is a drop-in replacement for the OpenAI API, so you can keep using the official OpenAI SDKs. Just change the base URL.

Python:

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.universal-ai.dev/v1"
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "user", "content": "What is the capital of France?"}
    ]
)

print(response.choices[0].message.content)

JavaScript / TypeScript:

import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "YOUR_API_KEY",
  baseURL: "https://api.universal-ai.dev/v1",
});

const response = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [
    { role: "user", content: "What is the capital of France?" },
  ],
});

console.log(response.choices[0].message.content);

4. Try different models

Access any supported model by changing the model parameter. Model IDs follow the format provider/model-name (plain OpenAI model names such as gpt-4o, as in the examples above, also work):

# Use Anthropic Claude
curl https://api.universal-ai.dev/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-sonnet-4-20250514",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

# Use Cloudflare Workers AI (Llama)
curl https://api.universal-ai.dev/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "cf/llama-3.3-70b",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

You can also omit the model parameter entirely, and the smart routing engine will select the best model for your request based on complexity, cost, and availability.
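For example, a request with no model field at all, sketched here with the standard library (the routing decision itself happens server-side):

```python
import json
import urllib.request

# No "model" key: the gateway's smart routing engine picks one.
payload = {
    "messages": [{"role": "user", "content": "Summarize the plot of Hamlet."}]
}

req = urllib.request.Request(
    "https://api.universal-ai.dev/v1/chat/completions",
    data=json.dumps(payload).encode(),
    headers={
        "Authorization": "Bearer YOUR_API_KEY",
        "Content-Type": "application/json",
    },
    method="POST",
)

# Uncomment to send; the response's "model" field reports the router's choice:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["model"])
```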

Next Steps

  • Authentication — learn about API key management and rate limits
  • Chat Completions — full reference for the text generation endpoint
  • Models — browse available models and providers
  • Caching — understand the caching system and control cache behavior