Embeddings

Generate vector embeddings for text using the embeddings endpoint.

Overview

The embeddings endpoint converts text into numerical vector representations. These vectors capture the semantic meaning of the input and are useful for search, clustering, recommendations, and classification tasks.

POST https://api.universal-ai.dev/v1/embeddings

Request

Headers

| Header | Required | Description |
| --- | --- | --- |
| Authorization | Yes | Bearer YOUR_API_KEY |
| Content-Type | Yes | application/json |

Body Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | Yes | The embedding model to use (e.g., text-embedding-3-small). |
| input | string or array | Yes | The text to embed. Can be a single string or an array of strings. |
| encoding_format | string | No | Format of the returned embeddings: float (default) or base64. |
| dimensions | integer | No | Number of output dimensions (supported by some models). Reduces vector size for storage efficiency. |
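
When encoding_format is base64, each embedding arrives as a string rather than an array of floats. The sketch below decodes such a string. It assumes the payload is a packed array of little-endian 32-bit floats, which is the usual convention for OpenAI-compatible APIs but is not stated on this page, so check it against a real response. The helper name decode_embedding is hypothetical.

```python
import base64
import struct


def decode_embedding(b64_string: str) -> list[float]:
    """Decode a base64 string into a list of floats.

    Assumption: the bytes are packed little-endian float32 values.
    """
    raw = base64.b64decode(b64_string)
    if len(raw) % 4 != 0:
        raise ValueError("payload length is not a multiple of 4 bytes")
    count = len(raw) // 4
    return list(struct.unpack(f"<{count}f", raw))


# Round-trip check with values that are exactly representable in float32.
sample = base64.b64encode(struct.pack("<3f", 0.5, -0.25, 1.0)).decode("ascii")
print(decode_embedding(sample))  # prints [0.5, -0.25, 1.0]
```

Base64 output is typically smaller on the wire than a JSON array of decimal numbers, which is the main reason to request it.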

Example Request

curl https://api.universal-ai.dev/v1/embeddings \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "text-embedding-3-small",
    "input": "Universal AI API provides unified access to AI models."
  }'

Batch Embedding

Embed multiple texts in a single request by passing an array:

curl https://api.universal-ai.dev/v1/embeddings \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "text-embedding-3-small",
    "input": [
      "First document to embed.",
      "Second document to embed.",
      "Third document to embed."
    ]
  }'
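
For a corpus too large to send at once, the input array can be split into chunks and each chunk sent as its own batch request. The following is a minimal sketch that only builds the request bodies and sends nothing. The helper name build_batches is hypothetical, and the batch size of 2 is chosen purely for illustration, since this page does not document a per-request limit.

```python
import json
from typing import Iterator


def build_batches(texts: list[str], model: str, batch_size: int) -> Iterator[dict]:
    """Yield one request body per chunk of at most batch_size texts."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(texts), batch_size):
        yield {"model": model, "input": texts[start:start + batch_size]}


docs = [
    "First document to embed.",
    "Second document to embed.",
    "Third document to embed.",
]

# Three documents with a batch size of 2 produce two request bodies.
bodies = list(build_batches(docs, "text-embedding-3-small", batch_size=2))
for body in bodies:
    print(json.dumps(body))
```

Each body can then be posted to the endpoint in the same way as the curl example above.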

Response

{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [0.0023, -0.0091, 0.0152, ...]
    }
  ],
  "model": "text-embedding-3-small",
  "usage": {
    "prompt_tokens": 9,
    "total_tokens": 9
  }
}

Response Fields

| Field | Type | Description |
| --- | --- | --- |
| object | string | Always list. |
| data | array | Array of embedding objects. |
| data[].object | string | Always embedding. |
| data[].index | integer | Position of the corresponding text in the input array. |
| data[].embedding | array | The embedding vector (array of floats). |
| model | string | The model used. |
| usage | object | Token usage for the request (prompt_tokens and total_tokens). |
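
Because every embedding object carries an index, vectors can be matched back to their inputs without relying on the order of the data array. The sketch below works on an already parsed response shaped like the example response above. The helper name pair_inputs_with_vectors is hypothetical, and the sample vectors are made-up two-element placeholders, not real model output.

```python
def pair_inputs_with_vectors(
    inputs: list[str], response: dict
) -> list[tuple[str, list[float]]]:
    """Return (text, vector) pairs, aligned by each item's index field."""
    data = response["data"]
    if len(data) != len(inputs):
        raise ValueError("response does not contain one embedding per input")
    # Sort by index so the result follows the order of the inputs.
    ordered = sorted(data, key=lambda item: item["index"])
    return [(text, item["embedding"]) for text, item in zip(inputs, ordered)]


inputs = ["First document to embed.", "Second document to embed."]

# Placeholder response with the data items deliberately out of order.
response = {
    "object": "list",
    "data": [
        {"object": "embedding", "index": 1, "embedding": [0.3, 0.4]},
        {"object": "embedding", "index": 0, "embedding": [0.1, 0.2]},
    ],
    "model": "text-embedding-3-small",
    "usage": {"prompt_tokens": 10, "total_tokens": 10},
}

for text, vector in pair_inputs_with_vectors(inputs, response):
    print(text, vector)
```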

SDK Examples

Python:

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.universal-ai.dev/v1"
)

response = client.embeddings.create(
    model="text-embedding-3-small",
    input="Universal AI API provides unified access to AI models."
)

vector = response.data[0].embedding
print(f"Vector dimensions: {len(vector)}")

JavaScript / TypeScript:

import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "YOUR_API_KEY",
  baseURL: "https://api.universal-ai.dev/v1",
});

const response = await client.embeddings.create({
  model: "text-embedding-3-small",
  input: "Universal AI API provides unified access to AI models.",
});

const vector = response.data[0].embedding;
console.log(`Vector dimensions: ${vector.length}`);

Supported Models

| Model ID | Provider | Dimensions | Description |
| --- | --- | --- | --- |
| text-embedding-3-small | OpenAI | 1,536 | Fast, cost-effective embedding model |
| text-embedding-3-large | OpenAI | 3,072 | Higher quality, larger vectors |
| text-embedding-ada-002 | OpenAI | 1,536 | Legacy model, widely compatible |
| cf/bge-base-en-v1.5 | Cloudflare | 768 | BGE base model on Workers AI |
| cf/bge-large-en-v1.5 | Cloudflare | 1,024 | BGE large model on Workers AI |
| mistral/mistral-embed | Mistral | 1,024 | Mistral embedding model |

Reducing Dimensions

Some models accept the dimensions parameter and return shorter vectors. Shorter vectors use less storage and are faster to compare, at some cost in accuracy:

curl https://api.universal-ai.dev/v1/embeddings \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "text-embedding-3-large",
    "input": "Reduce my vector size.",
    "dimensions": 512
  }'
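
To get a feel for the storage saving, the sketch below estimates raw vector storage for one million embeddings at 3,072 dimensions (the size listed for text-embedding-3-large under Supported Models) and at the 512 dimensions requested above. It assumes each value is stored as a 4-byte float32, which is a property of your storage layer and not something the API dictates, and it ignores index and metadata overhead. The function name storage_bytes is hypothetical.

```python
def storage_bytes(num_vectors: int, dimensions: int, bytes_per_value: int = 4) -> int:
    """Raw bytes needed to store num_vectors vectors of the given size."""
    return num_vectors * dimensions * bytes_per_value


full = storage_bytes(1_000_000, 3072)     # 12_288_000_000 bytes
reduced = storage_bytes(1_000_000, 512)   # 2_048_000_000 bytes

print(f"full size:    {full:,} bytes")
print(f"reduced size: {reduced:,} bytes")
print(f"reduction:    {full // reduced}x")
```

Under these assumptions the reduced vectors take one sixth of the space. The effect on retrieval quality depends on the model and the data, so it is worth measuring on your own workload.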

Use Cases

  • Semantic search — embed documents and queries, then find the nearest vectors
  • Clustering — group similar texts by comparing their embeddings
  • Classification — use embeddings as features for ML classifiers
  • Recommendations — find items similar to a user's preferences
  • Deduplication — identify near-duplicate content by vector similarity
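
Most of these use cases come down to comparing vectors, and cosine similarity is a common choice for that comparison. The following is a minimal sketch of a nearest-vector lookup for semantic search. The three-dimensional vectors are toy values standing in for real embeddings, and the helper names cosine_similarity and nearest are hypothetical.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def nearest(
    query: list[float], corpus: dict[str, list[float]], top_k: int = 1
) -> list[tuple[str, float]]:
    """Return the top_k (document id, score) pairs, most similar first."""
    scored = [(cosine_similarity(query, vec), doc_id) for doc_id, vec in corpus.items()]
    scored.sort(reverse=True)
    return [(doc_id, score) for score, doc_id in scored[:top_k]]


# Toy vectors; real ones would come from the embeddings endpoint.
corpus = {
    "cats": [1.0, 0.0, 0.0],
    "dogs": [0.8, 0.6, 0.0],
    "cars": [0.0, 0.0, 1.0],
}
query = [0.9, 0.1, 0.0]

for doc_id, score in nearest(query, corpus, top_k=2):
    print(f"{doc_id}: {score:.3f}")
```

The same similarity score can drive deduplication by flagging pairs above a chosen threshold. A linear scan like this is fine for small collections, while larger ones usually call for a vector index.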