# Embeddings API
The Embeddings API generates vector representations of text, which can be used for search, clustering, recommendations, and other natural language processing tasks.
## Endpoint

```http
POST /v1/embeddings
```
## Request Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| `model` | string | Yes | The ID of the embedding model to use |
| `input` | string or array | Yes | Text to embed (a single string or an array of strings) |
| `user` | string | No | Unique identifier for the end user |
| `encoding_format` | string | No | Output encoding format (default: `float`) |
| `dimensions` | integer | No | Number of embedding dimensions, for models that support it |
| `cache_control` | object | No | Controls caching behavior (Conduit extension) |
| `routing` | object | No | Custom routing options for this request (Conduit extension) |
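For reference, a request body that combines the optional parameters might look like the following sketch. The field names come from the table above; the values (including the `user-1234` identifier) are purely illustrative:

```json
{
  "model": "my-embedding-model",
  "input": "The quick brown fox jumps over the lazy dog",
  "user": "user-1234",
  "encoding_format": "float",
  "dimensions": 256,
  "cache_control": { "ttl": 7200 },
  "routing": { "strategy": "least_cost" }
}
```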
## Basic Example

**cURL**

```bash
curl http://localhost:5000/v1/embeddings \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer condt_your_virtual_key" \
  -d '{
    "model": "my-embedding-model",
    "input": "The quick brown fox jumps over the lazy dog"
  }'
```
**Python**

```python
from openai import OpenAI

client = OpenAI(
    api_key="condt_your_virtual_key",
    base_url="http://localhost:5000/v1"
)

response = client.embeddings.create(
    model="my-embedding-model",
    input="The quick brown fox jumps over the lazy dog"
)

print(f"Embedding dimensions: {len(response.data[0].embedding)}")
print(f"First few values: {response.data[0].embedding[:5]}")
```
**JavaScript**

```javascript
import OpenAI from 'openai';

const openai = new OpenAI({
  apiKey: 'condt_your_virtual_key',
  baseURL: 'http://localhost:5000/v1'
});

async function main() {
  const response = await openai.embeddings.create({
    model: 'my-embedding-model',
    input: 'The quick brown fox jumps over the lazy dog'
  });

  console.log(`Embedding dimensions: ${response.data[0].embedding.length}`);
  console.log(`First few values: ${response.data[0].embedding.slice(0, 5)}`);
}

main();
```
**C#**

```csharp
using System;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

class Program
{
    static async Task Main()
    {
        var client = new HttpClient();
        client.BaseAddress = new Uri("http://localhost:5000/");
        client.DefaultRequestHeaders.Add("Authorization", "Bearer condt_your_virtual_key");

        var request = new
        {
            model = "my-embedding-model",
            input = "The quick brown fox jumps over the lazy dog"
        };

        var content = new StringContent(
            JsonSerializer.Serialize(request),
            Encoding.UTF8,
            "application/json");

        var response = await client.PostAsync("v1/embeddings", content);
        var responseBody = await response.Content.ReadAsStringAsync();
        Console.WriteLine(responseBody);
    }
}
```
## Multiple Inputs Example

**cURL**

```bash
curl http://localhost:5000/v1/embeddings \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer condt_your_virtual_key" \
  -d '{
    "model": "my-embedding-model",
    "input": ["The quick brown fox", "jumps over the lazy dog"]
  }'
```
**Python**

```python
from openai import OpenAI

client = OpenAI(
    api_key="condt_your_virtual_key",
    base_url="http://localhost:5000/v1"
)

response = client.embeddings.create(
    model="my-embedding-model",
    input=["The quick brown fox", "jumps over the lazy dog"]
)

# Access each embedding
for i, data in enumerate(response.data):
    print(f"Embedding {i+1} dimensions: {len(data.embedding)}")
```
## Response Format

```json
{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "embedding": [0.0023064255, -0.009327292, ...],
      "index": 0
    }
  ],
  "model": "my-embedding-model",
  "usage": {
    "prompt_tokens": 8,
    "total_tokens": 8
  }
}
```
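If you call the endpoint without an SDK, the same fields are available on the raw JSON. A sketch using Python's `requests` library against the server from the examples above:

```python
import requests

resp = requests.post(
    "http://localhost:5000/v1/embeddings",
    headers={"Authorization": "Bearer condt_your_virtual_key"},
    json={
        "model": "my-embedding-model",
        "input": "The quick brown fox jumps over the lazy dog",
    },
)
resp.raise_for_status()
body = resp.json()

vector = body["data"][0]["embedding"]  # the embedding vector itself
print(body["model"], body["usage"]["total_tokens"], len(vector))
```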
## Specifying Dimensions

Some embedding models support configurable output dimensions:

**cURL**

```bash
curl http://localhost:5000/v1/embeddings \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer condt_your_virtual_key" \
  -d '{
    "model": "my-embedding-model",
    "input": "The quick brown fox jumps over the lazy dog",
    "dimensions": 256
  }'
```
**Python**

```python
from openai import OpenAI

client = OpenAI(
    api_key="condt_your_virtual_key",
    base_url="http://localhost:5000/v1"
)

response = client.embeddings.create(
    model="my-embedding-model",
    input="The quick brown fox jumps over the lazy dog",
    dimensions=256
)

print(f"Embedding dimensions: {len(response.data[0].embedding)}")
```
## Conduit-Specific Extensions

### Cache Control

Control the caching behavior for this specific request:

```json
{
  "model": "my-embedding-model",
  "input": "The quick brown fox jumps over the lazy dog",
  "cache_control": {
    "no_cache": false,
    "ttl": 7200
  }
}
```
### Custom Routing

Override the default routing strategy for this specific request:

```json
{
  "model": "my-embedding-model",
  "input": "The quick brown fox jumps over the lazy dog",
  "routing": {
    "strategy": "least_cost",
    "fallback_enabled": true,
    "provider_override": "cohere"
  }
}
```
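These fields are not typed parameters in the OpenAI SDKs, but the Python SDK forwards arbitrary extra JSON fields through `extra_body`. A sketch, assuming Conduit accepts the fields exactly as shown in the JSON examples above:

```python
from openai import OpenAI

client = OpenAI(
    api_key="condt_your_virtual_key",
    base_url="http://localhost:5000/v1"
)

response = client.embeddings.create(
    model="my-embedding-model",
    input="The quick brown fox jumps over the lazy dog",
    # extra_body fields are merged into the request JSON as-is
    extra_body={
        "cache_control": {"no_cache": False, "ttl": 7200},
        "routing": {"strategy": "least_cost", "fallback_enabled": True},
    },
)
```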
## Best Practices

### Model Selection

Different embedding models have different characteristics:

- Dimensions: higher-dimensional vectors can capture more information but use more storage
- Semantic richness: some models are better at capturing meaning
- Multilingual support: some models handle multiple languages better
- Speed and cost: smaller models are faster and cheaper
### Input Truncation

Most embedding models have a token limit. When an input exceeds it:

- The input is automatically truncated
- A warning is included in the response
- Consider splitting long texts into smaller chunks (see the sketch below)
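A minimal chunking sketch, character-based for simplicity. It assumes the `client` from the earlier examples and a `long_document` string; real limits are measured in tokens and vary by model, so treat `chunk_size` as a value to tune:

```python
def chunk_text(text, chunk_size=1000, overlap=100):
    """Split text into overlapping character-based chunks."""
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap  # overlap preserves context at boundaries
    return chunks

# Embed each chunk separately instead of sending one oversized input
chunks = chunk_text(long_document)
response = client.embeddings.create(model="my-embedding-model", input=chunks)
```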
### Embeddings Storage

When storing embeddings:

- Use a vector database such as Pinecone, Weaviate, or Milvus
- Or use a specialized library such as FAISS or hnswlib
- Store the model ID alongside each embedding, since vectors from different models (or different dimension settings) are not comparable (see the sketch below)
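A minimal FAISS sketch, assuming `pip install faiss-cpu numpy` and the `response` object from the multiple-inputs example. Vectors are L2-normalized so that inner-product search is equivalent to cosine similarity:

```python
import faiss
import numpy as np

vectors = np.array([d.embedding for d in response.data], dtype="float32")
faiss.normalize_L2(vectors)               # normalize so inner product == cosine
index = faiss.IndexFlatIP(vectors.shape[1])
index.add(vectors)

# Keep the model ID with the index: embeddings produced by different models
# or dimension settings must never be searched against each other.
metadata = {"model": response.model, "dimensions": vectors.shape[1]}

query = vectors[:1]                       # demo: search with the first vector
scores, ids = index.search(query, k=2)
print(scores, ids, metadata)
```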
## Error Codes

| HTTP Code | Error Type | Description |
|---|---|---|
| 400 | `invalid_request_error` | The request was malformed |
| 401 | `authentication_error` | Invalid or missing API key |
| 403 | `permission_error` | The API key doesn't have permission |
| 404 | `not_found_error` | The requested model was not found |
| 429 | `rate_limit_error` | Rate limit exceeded |
| 500 | `server_error` | Internal server error |
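When using the OpenAI Python SDK, these status codes surface as typed exceptions. The exception classes below are from the SDK itself; that Conduit returns each status in every listed case is an assumption based on the table above:

```python
import openai
from openai import OpenAI

client = OpenAI(api_key="condt_your_virtual_key", base_url="http://localhost:5000/v1")

try:
    response = client.embeddings.create(
        model="my-embedding-model",
        input="The quick brown fox jumps over the lazy dog",
    )
except openai.AuthenticationError:    # 401: invalid or missing key
    print("Check your virtual key")
except openai.PermissionDeniedError:  # 403: key lacks permission
    print("Key not permitted for this model")
except openai.NotFoundError:          # 404: unknown model
    print("Model not found")
except openai.RateLimitError:         # 429: rate limit exceeded
    print("Back off and retry")
except openai.APIStatusError as e:    # other non-2xx responses (400, 500, ...)
    print(e.status_code, e.message)
```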
## Next Steps

- Chat Completions API: generate conversational responses
- Models API: list and filter available models
- Virtual Keys: learn about API key management