
LLM API

Welcome to the LLM API documentation. This section provides detailed information about the LLM API, including endpoints, parameters, and response formats.

Authentication

All API requests require authentication. You can authenticate using API keys or OAuth tokens. For more information, see the Authentication documentation.
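As a minimal sketch, authentication with an API key typically means attaching it to every request's headers. The `Bearer` scheme shown here is an assumption (it is the common convention); confirm the exact scheme in the Authentication documentation.

```python
def auth_headers(api_key: str) -> dict:
    """Build request headers for API-key authentication.

    Assumes the Bearer token scheme; verify against the
    Authentication documentation before relying on this.
    """
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```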

Base URL

https://llm.do/api

Endpoints

Generate Text

POST /generate

Generates text based on the provided prompt.

Request Body

{
  "prompt": "Write a short story about a robot learning to paint.",
  "model": "openai/gpt-4",
  "maxTokens": 500,
  "temperature": 0.7,
  "topP": 0.9,
  "frequencyPenalty": 0,
  "presencePenalty": 0
}

Response

{
  "id": "generation_123",
  "text": "In a small studio apartment overlooking the city, Robot Unit 7 stood motionless before a blank canvas...",
  "model": "openai/gpt-4",
  "promptTokens": 10,
  "completionTokens": 487,
  "totalTokens": 497,
  "createdAt": "2023-01-01T00:00:00Z"
}
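A sketch of calling this endpoint from Python using only the standard library. The payload fields mirror the request body above; the Bearer header scheme is an assumption to confirm against the Authentication documentation.

```python
import json
import urllib.request

BASE_URL = "https://llm.do/api"

def build_generate_payload(prompt, model="openai/gpt-4", max_tokens=500,
                           temperature=0.7, top_p=0.9):
    """Assemble a /generate request body with the field names from the docs."""
    return {
        "prompt": prompt,
        "model": model,
        "maxTokens": max_tokens,
        "temperature": temperature,
        "topP": top_p,
    }

def generate_text(api_key, prompt, **options):
    """POST /generate and return the parsed JSON response."""
    payload = build_generate_payload(prompt, **options)
    req = urllib.request.Request(
        f"{BASE_URL}/generate",
        data=json.dumps(payload).encode(),
        headers={"Authorization": f"Bearer {api_key}",  # assumed scheme
                 "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```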

Chat Completion

POST /chat

Generates a response to a conversation.

Request Body

{
  "messages": [
    { "role": "system", "content": "You are a helpful assistant." },
    { "role": "user", "content": "What is the capital of France?" }
  ],
  "model": "anthropic/claude-3-opus",
  "maxTokens": 100,
  "temperature": 0.5,
  "topP": 0.9,
  "frequencyPenalty": 0,
  "presencePenalty": 0
}

Response

{
  "id": "chat_123",
  "message": {
    "role": "assistant",
    "content": "The capital of France is Paris. It's known as the 'City of Light' and is famous for landmarks such as the Eiffel Tower, the Louvre Museum, and Notre-Dame Cathedral."
  },
  "model": "anthropic/claude-3-opus",
  "promptTokens": 25,
  "completionTokens": 35,
  "totalTokens": 60,
  "createdAt": "2023-01-01T00:00:00Z"
}
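A stdlib-only sketch of calling /chat and pulling the assistant's text out of the response shape shown above. The Bearer header is an assumption; see the Authentication documentation for the actual scheme.

```python
import json
import urllib.request

BASE_URL = "https://llm.do/api"

def chat(api_key, messages, model="anthropic/claude-3-opus",
         max_tokens=100, temperature=0.5):
    """POST /chat and return the parsed JSON response."""
    payload = {
        "messages": messages,
        "model": model,
        "maxTokens": max_tokens,
        "temperature": temperature,
    }
    req = urllib.request.Request(
        f"{BASE_URL}/chat",
        data=json.dumps(payload).encode(),
        headers={"Authorization": f"Bearer {api_key}",  # assumed scheme
                 "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

def reply_text(response):
    """Extract the assistant's text from a /chat response body."""
    return response["message"]["content"]
```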

Embeddings

POST /embeddings

Generates embeddings for the provided text.

Request Body

{
  "text": "The quick brown fox jumps over the lazy dog.",
  "model": "openai/text-embedding-3-large"
}

Response

{
  "id": "embedding_123",
  "embedding": [0.1, 0.2, 0.3, ...],
  "model": "openai/text-embedding-3-large",
  "tokens": 10,
  "createdAt": "2023-01-01T00:00:00Z"
}
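Embeddings are typically compared with cosine similarity, for example to rank documents against a query. A minimal stdlib implementation:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```

Two texts with similar meanings should produce vectors whose similarity is close to 1; unrelated texts score near 0.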

List Models

GET /models

Returns a list of available language models.

Response

{
  "data": [
    {
      "id": "openai/gpt-4",
      "provider": "openai",
      "name": "gpt-4",
      "capabilities": ["chat", "code", "reasoning"],
      "maxTokens": 8192
    },
    {
      "id": "anthropic/claude-3-opus",
      "provider": "anthropic",
      "name": "claude-3-opus",
      "capabilities": ["chat", "reasoning", "vision"],
      "maxTokens": 100000
    },
    {
      "id": "openai/text-embedding-3-large",
      "provider": "openai",
      "name": "text-embedding-3-large",
      "capabilities": ["embeddings"],
      "dimensions": 3072
    }
  ]
}
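The `capabilities` array makes it straightforward to select a model for a given task at runtime. A small helper, assuming the response shape shown above:

```python
def models_with_capability(models_response, capability):
    """Return the IDs of models advertising the given capability."""
    return [
        model["id"]
        for model in models_response["data"]
        if capability in model["capabilities"]
    ]
```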

Get Model

GET /models/:id

Returns details for a specific model.

Response

{
  "id": "openai/gpt-4",
  "provider": "openai",
  "name": "gpt-4",
  "capabilities": ["chat", "code", "reasoning"],
  "maxTokens": 8192,
  "pricing": {
    "input": 0.00003,
    "output": 0.00006
  },
  "createdAt": "2023-01-01T00:00:00Z"
}
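The `pricing` object can be combined with the token counts that generation responses return to estimate the cost of a request. This assumes the rates are per token (the docs do not state the unit, so verify before using this for billing):

```python
def estimate_cost(model_info, prompt_tokens, completion_tokens):
    """Estimate request cost, assuming pricing rates are per token."""
    pricing = model_info["pricing"]
    return (prompt_tokens * pricing["input"]
            + completion_tokens * pricing["output"])
```

For example, the /generate response above (10 prompt tokens, 487 completion tokens) against the gpt-4 pricing shown here would come to roughly $0.0295.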

Function Calling

POST /function-calling

Asks a language model to select one of the provided functions and generate arguments for it; the caller is responsible for actually executing the function.

Request Body

{
  "messages": [
    { "role": "system", "content": "You are a helpful assistant." },
    { "role": "user", "content": "What's the weather like in New York?" }
  ],
  "functions": [
    {
      "name": "getWeather",
      "description": "Get the current weather in a given location",
      "parameters": {
        "type": "object",
        "properties": {
          "location": {
            "type": "string",
            "description": "The city and state, e.g. San Francisco, CA"
          },
          "unit": {
            "type": "string",
            "enum": ["celsius", "fahrenheit"],
            "description": "The temperature unit to use"
          }
        },
        "required": ["location"]
      }
    }
  ],
  "model": "openai/gpt-4",
  "temperature": 0.5
}

Response

{
  "id": "function_call_123",
  "functionCall": {
    "name": "getWeather",
    "arguments": {
      "location": "New York, NY",
      "unit": "fahrenheit"
    }
  },
  "model": "openai/gpt-4",
  "promptTokens": 50,
  "completionTokens": 20,
  "totalTokens": 70,
  "createdAt": "2023-01-01T00:00:00Z"
}
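On the client side, the returned `functionCall` is usually dispatched to a local implementation keyed by name. A sketch with a stand-in `get_weather` (the real implementation and its return shape are up to your application):

```python
def get_weather(location, unit="fahrenheit"):
    """Stand-in implementation; replace with a real weather lookup."""
    return {"location": location, "temperature": 72, "unit": unit}

# Map the function names declared in the request to local callables.
FUNCTION_REGISTRY = {"getWeather": get_weather}

def dispatch(function_call):
    """Execute the function the model selected, with its generated arguments."""
    fn = FUNCTION_REGISTRY[function_call["name"]]
    return fn(**function_call["arguments"])
```

The dispatch result is typically sent back to the model in a follow-up message so it can produce a natural-language answer.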

Error Handling

The API uses standard HTTP status codes to indicate the success or failure of requests. For more information about error codes and how to handle errors, see the Error Handling documentation.
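Since status codes follow HTTP conventions, a thin client can convert non-2xx responses into exceptions. The error body field name (`error`) is an assumption here; see the Error Handling documentation for the actual shape.

```python
class APIError(Exception):
    """Raised for non-2xx responses from the API."""

    def __init__(self, status, message):
        super().__init__(f"HTTP {status}: {message}")
        self.status = status

def raise_for_status(status, body):
    """Raise APIError for any non-2xx status; no-op otherwise."""
    if 200 <= status < 300:
        return
    # "error" as the message field is an assumption about the error body.
    raise APIError(status, body.get("error", "request failed"))
```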

Rate Limiting

API requests are subject to rate limiting to ensure fair usage and system stability. For more information about rate limits and how to handle rate-limiting responses, see the Rate Limiting documentation.
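A common way to handle rate-limited requests is to retry with exponential backoff. This is a generic sketch, not the API's prescribed strategy; the Rate Limiting documentation may specify headers (such as `Retry-After`) that give a better wait time.

```python
import time

class RateLimitError(Exception):
    """Raised when a request is rejected for exceeding the rate limit."""

def with_backoff(call, max_retries=3, base_delay=0.5):
    """Retry `call` on rate-limit errors, doubling the delay each attempt."""
    for attempt in range(max_retries):
        try:
            return call()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise  # out of retries; surface the error
            time.sleep(base_delay * (2 ** attempt))
```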

Webhooks

You can configure webhooks to receive notifications about LLM events. For more information about webhooks, see the Webhooks documentation.
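Webhook endpoints generally verify that a delivery really came from the provider before acting on it. The sketch below assumes an HMAC-SHA256 hex signature over the raw request body, which is a common pattern but purely an assumption here; the Webhooks documentation defines the actual header name and signing scheme.

```python
import hashlib
import hmac

def verify_signature(secret, payload, signature):
    """Check a webhook payload against its signature.

    Assumes HMAC-SHA256 with a hex digest; confirm the real scheme
    in the Webhooks documentation before use.
    """
    expected = hmac.new(secret.encode(), payload, hashlib.sha256).hexdigest()
    # compare_digest avoids timing side channels in the comparison.
    return hmac.compare_digest(expected, signature)
```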

SDKs

We provide SDKs for popular programming languages to make it easier to integrate with the LLM API:
