Embeddings

Overview

The embeddings endpoint converts text into numerical vector representations. These vectors capture the semantic meaning of the input and are useful for search, clustering, recommendations, and classification tasks.

POST https://api.universal-ai.dev/v1/embeddings

Request

Headers

Header	Required	Description
`Authorization`	Yes	`Bearer YOUR_API_KEY`
`Content-Type`	Yes	`application/json`

Body Parameters

Parameter	Type	Required	Description
`model`	string	Yes	The embedding model to use (e.g., `text-embedding-3-small`).
`input`	string or array	Yes	The text to embed. Can be a single string or an array of strings.
`encoding_format`	string	No	Format of the returned embeddings: `float` (default) or `base64`.
`dimensions`	integer	No	Number of output dimensions (supported by some models). Reduces vector size for storage efficiency.
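When `encoding_format` is set to `base64`, the client has to turn the returned string back into numbers. The sketch below shows one way to do that. It assumes the payload is a packed array of little-endian 32-bit floats, which this page does not state, so confirm the byte layout before relying on it. The helper name `decode_base64_embedding` is hypothetical.

```python
import base64
import struct


def decode_base64_embedding(encoded: str) -> list[float]:
    """Decode a base64 string into floats.

    Assumption: the bytes are packed little-endian float32 values.
    """
    raw = base64.b64decode(encoded)
    if len(raw) % 4 != 0:
        raise ValueError("byte length is not a multiple of 4")
    count = len(raw) // 4
    return list(struct.unpack(f"<{count}f", raw))


# Round-trip check with made-up values that float32 represents exactly.
sample = base64.b64encode(struct.pack("<3f", 0.5, -1.25, 2.0)).decode("ascii")
decoded = decode_base64_embedding(sample)
```

If the assumption holds, base64 gives a smaller response body than a JSON array of decimal floats, at the cost of this extra decoding step.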

Example Request

curl https://api.universal-ai.dev/v1/embeddings \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "text-embedding-3-small",
    "input": "Universal AI API provides unified access to AI models."
  }'

Batch Embedding

Embed multiple texts in a single request by passing an array:

curl https://api.universal-ai.dev/v1/embeddings \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "text-embedding-3-small",
    "input": [
      "First document to embed.",
      "Second document to embed.",
      "Third document to embed."
    ]
  }'
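For larger collections, the same array form can be used once per chunk of documents. The sketch below builds the requests with the Python standard library without sending them. The chunk size of 2 is only for illustration (this page does not state a maximum batch size), and `chunk` and `build_request` are hypothetical helper names.

```python
import json
import urllib.request

API_URL = "https://api.universal-ai.dev/v1/embeddings"


def chunk(texts, size):
    """Yield consecutive slices of `texts` with at most `size` items each."""
    for start in range(0, len(texts), size):
        yield texts[start:start + size]


def build_request(texts, api_key, model="text-embedding-3-small"):
    """Build (but do not send) a POST request whose `input` is an array."""
    body = json.dumps({"model": model, "input": texts}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


documents = [f"Document {i}" for i in range(5)]
requests_to_send = [
    build_request(batch, "YOUR_API_KEY") for batch in chunk(documents, 2)
]
# Passing one of these to urllib.request.urlopen() would perform the call.
```

Five documents with a chunk size of 2 produce three requests, the last one carrying a single input.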

Response

{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [0.0023, -0.0091, 0.0152, ...]
    }
  ],
  "model": "text-embedding-3-small",
  "usage": {
    "prompt_tokens": 9,
    "total_tokens": 9
  }
}

Response Fields

Field	Type	Description
`object`	string	Always `list`.
`data`	array	Array of embedding objects.
`data[].object`	string	Always `embedding`.
`data[].index`	integer	Zero-based position of the corresponding text in the `input` array.
`data[].embedding`	array	The embedding vector (array of floats when `encoding_format` is `float`).
`model`	string	The model used.
`usage`	object	Token usage for the request (`prompt_tokens` and `total_tokens`).
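Because each embedding object carries an `index`, a client can line the vectors up with the original inputs. A minimal sketch, assuming the response has already been parsed into a dictionary with the fields listed above; `vectors_in_input_order` is a hypothetical helper and the sample values are made up.

```python
def vectors_in_input_order(response: dict) -> list[list[float]]:
    """Return the embedding vectors sorted by their `index` field."""
    items = sorted(response["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


# Made-up response with two short vectors, deliberately out of order.
sample_response = {
    "object": "list",
    "data": [
        {"object": "embedding", "index": 1, "embedding": [0.3, 0.4]},
        {"object": "embedding", "index": 0, "embedding": [0.1, 0.2]},
    ],
    "model": "text-embedding-3-small",
    "usage": {"prompt_tokens": 6, "total_tokens": 6},
}

vectors = vectors_in_input_order(sample_response)
tokens_used = sample_response["usage"]["total_tokens"]
```

Sorting by `index` keeps the pairing correct without depending on the order of items in `data`.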

SDK Examples

Python (using the OpenAI SDK pointed at the Universal AI base URL):

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.universal-ai.dev/v1"
)

response = client.embeddings.create(
    model="text-embedding-3-small",
    input="Universal AI API provides unified access to AI models."
)

vector = response.data[0].embedding
print(f"Vector dimensions: {len(vector)}")

JavaScript / TypeScript (using the OpenAI SDK pointed at the Universal AI base URL):

import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "YOUR_API_KEY",
  baseURL: "https://api.universal-ai.dev/v1",
});

const response = await client.embeddings.create({
  model: "text-embedding-3-small",
  input: "Universal AI API provides unified access to AI models.",
});

const vector = response.data[0].embedding;
console.log(`Vector dimensions: ${vector.length}`);

Supported Models

Model ID	Provider	Dimensions	Description
`text-embedding-3-small`	OpenAI	1,536	Fast, cost-effective embedding model
`text-embedding-3-large`	OpenAI	3,072	Higher quality, larger vectors
`text-embedding-ada-002`	OpenAI	1,536	Legacy model, widely compatible
`cf/bge-base-en-v1.5`	Cloudflare	768	BGE base model on Workers AI
`cf/bge-large-en-v1.5`	Cloudflare	1,024	BGE large model on Workers AI
`mistral/mistral-embed`	Mistral	1,024	Mistral embedding model

Reducing Dimensions

Some models support the `dimensions` parameter, which produces shorter vectors. Shorter vectors use less storage and are faster to compare, at the cost of some accuracy:

curl https://api.universal-ai.dev/v1/embeddings \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "text-embedding-3-large",
    "input": "Reduce my vector size.",
    "dimensions": 512
  }'
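To put the storage saving in numbers, the sketch below compares the full 3,072-dimension output of `text-embedding-3-large` with the 512-dimension request above. It assumes each value is stored as a 4-byte float32 and ignores index or metadata overhead. The corpus size of one million vectors is an arbitrary illustration, and `storage_bytes` is a hypothetical helper.

```python
BYTES_PER_FLOAT32 = 4  # assumption: vectors are stored as float32


def storage_bytes(num_vectors, dimensions, bytes_per_value=BYTES_PER_FLOAT32):
    """Raw size of the vector data alone, without index or metadata overhead."""
    return num_vectors * dimensions * bytes_per_value


full = storage_bytes(1_000_000, 3072)    # text-embedding-3-large default
reduced = storage_bytes(1_000_000, 512)  # with "dimensions": 512
ratio = full / reduced

print(f"full: {full / 1e9:.2f} GB, reduced: {reduced / 1e9:.2f} GB, ratio: {ratio:.0f}x")
```

Under these assumptions the raw vector data shrinks from about 12.29 GB to about 2.05 GB, a sixfold reduction.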

Use Cases

Semantic search — embed documents and queries, then find the nearest vectors
Clustering — group similar texts by comparing their embeddings
Classification — use embeddings as features for ML classifiers
Recommendations — find items similar to a user's preferences
Deduplication — identify near-duplicate content by vector similarity
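Most of these use cases come down to comparing vectors. The sketch below ranks stored vectors against a query vector, as in semantic search. It assumes cosine similarity as the metric (this page does not prescribe one) and a small in-memory dictionary. `cosine_similarity`, `top_k`, and the three-dimensional toy vectors are illustrative only; real embeddings have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def top_k(query, corpus, k=2):
    """Return the k (doc_id, score) pairs most similar to `query`."""
    scored = [(cosine_similarity(query, vec), doc_id) for doc_id, vec in corpus.items()]
    scored.sort(reverse=True)
    return [(doc_id, score) for score, doc_id in scored[:k]]


# Toy stand-ins for vectors returned by the embeddings endpoint.
corpus = {
    "a": [1.0, 0.0, 0.0],
    "b": [0.0, 1.0, 0.0],
    "c": [0.9, 0.1, 0.0],
}
results = top_k([1.0, 0.0, 0.0], corpus, k=2)
ranked_ids = [doc_id for doc_id, _ in results]
```

A linear scan like this is fine for small collections. The same scoring function can also flag near-duplicates by applying a similarity threshold instead of taking the top results.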
