POST /v1/chat/completions
cURL
curl https://api.routify.ru/v1/chat/completions \
  -H "Authorization: Bearer $ROUTIFY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4.1",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Hello!"}
    ]
  }'
{
  "id": "<string>",
  "object": "<string>",
  "created": 123,
  "model": "<string>",
  "choices": [
    {
      "index": 123,
      "message": {
        "content": "<string>"
      }
    }
  ],
  "usage": {
    "prompt_tokens": 1,
    "completion_tokens": 1,
    "total_tokens": 1,
    "prompt_tokens_details": {
      "cached_tokens": 1,
      "cache_write_tokens": 1
    }
  }
}

Examples

curl https://api.routify.ru/v1/chat/completions \
  -H "Authorization: Bearer $ROUTIFY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4.1",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Explain what an API is"}
    ]
  }'
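
For readers calling the endpoint from code rather than the shell, the cURL request above can be sketched in Python with only the standard library. The helper names build_request and send are hypothetical and not part of any Routify SDK; the URL, headers, and body fields are taken from the cURL example.

```python
import json
import urllib.request

API_URL = "https://api.routify.ru/v1/chat/completions"


def build_request(api_key: str, model: str, messages: list) -> urllib.request.Request:
    """Build (but do not send) a POST request mirroring the cURL example.

    Hypothetical helper for illustration only.
    """
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

A typical call would read the key from the ROUTIFY_API_KEY environment variable, pass the result of build_request to send, and read choices[0].message.content from the returned dictionary.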

Streaming

When stream: true is set, the response is delivered via server-sent events (SSE). Each chunk has the form:
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"delta":{"content":"hello"},"index":0}]}

data: [DONE]
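
A minimal sketch of how a client might consume this stream, assuming the lines have already been read from the HTTP response and decoded to text. The function name iter_deltas and the sample chunks are illustrative; only the data: prefix, the choices[].delta.content path, and the [DONE] terminator come from the format shown above.

```python
import json
from typing import Iterable, Iterator


def iter_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield the text fragments carried by SSE chunk lines until [DONE]."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and any non-data SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream marker
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


# Illustrative input: two chunks followed by the terminator.
sample = [
    'data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"delta":{"content":"Hel"},"index":0}]}',
    "",
    'data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"delta":{"content":"lo"},"index":0}]}',
    "",
    "data: [DONE]",
]
print("".join(iter_deltas(sample)))  # prints "Hello"
```

Chunks without a content field in delta are skipped here; whether the provider sends such chunks is not specified on this page.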
Unknown parameters are passed through to the provider unchanged. Errors for unsupported parameters (for example, tools, response_format) come from the provider, not from Routify.

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Authorization: Bearer $ROUTIFY_API_KEY.

Body

application/json
model
string
required

Model ID used to generate the response. Use GET /v1/models to list all available models.

messages
object[]
required

A list of messages comprising the conversation so far.

Minimum array length: 1
stream
boolean
default: false

If set to true, the response is streamed to the client as it is generated using server-sent events. The stream ends with data: [DONE].

max_tokens
integer

An upper bound for the number of tokens that can be generated in the completion.

Required range: x >= 1
reasoning_effort
enum<string>

Constrains effort on reasoning for reasoning models (e.g. o3, o4-mini). Supported values: low, medium, high, xhigh. Lower effort reduces latency and cost; higher effort improves accuracy on complex tasks.

Available options: low, medium, high, xhigh
verbosity
enum<string>

Controls verbosity of the model response.

Available options: low, medium, high
reasoningSummary
enum<string>

Controls the format of reasoning summaries in the response. Supported values: auto, detail, concise.

Available options: auto, detail, concise
temperature
number

What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We recommend altering this or top_p but not both.

Required range: 0 <= x <= 2
top_p
number

An alternative to sampling with temperature, called nucleus sampling, where the model considers only the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We recommend altering this or temperature but not both.

Required range: 0 <= x <= 1
stop

Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
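
The constraints listed for the body parameters can be checked on the client before a request is sent. The following is a rough sketch, not Routify's own validation logic: validate_body is a hypothetical helper, and it assumes that stop is either a single string or a list of strings, since the page does not state its type. Unknown keys are deliberately left alone because they are forwarded to the provider.

```python
REASONING_EFFORT = {"low", "medium", "high", "xhigh"}
VERBOSITY = {"low", "medium", "high"}
REASONING_SUMMARY = {"auto", "detail", "concise"}


def _is_number(value) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def validate_body(body: dict) -> list:
    """Return a list of problems; an empty list means no documented rule is violated."""
    errors = []
    if not isinstance(body.get("model"), str) or not body["model"]:
        errors.append("model: required non-empty string")
    messages = body.get("messages")
    if not isinstance(messages, list) or len(messages) < 1:
        errors.append("messages: required array with at least 1 item")
    if "stream" in body and not isinstance(body["stream"], bool):
        errors.append("stream: must be a boolean")
    if "max_tokens" in body:
        value = body["max_tokens"]
        if not isinstance(value, int) or isinstance(value, bool) or value < 1:
            errors.append("max_tokens: must be an integer >= 1")
    for name, low, high in (("temperature", 0, 2), ("top_p", 0, 1)):
        if name in body:
            value = body[name]
            if not _is_number(value) or not low <= value <= high:
                errors.append(f"{name}: must be a number in [{low}, {high}]")
    enums = (
        ("reasoning_effort", REASONING_EFFORT),
        ("verbosity", VERBOSITY),
        ("reasoningSummary", REASONING_SUMMARY),
    )
    for name, allowed in enums:
        if name in body:
            value = body[name]
            if not isinstance(value, str) or value not in allowed:
                errors.append(f"{name}: must be one of {sorted(allowed)}")
    if "stop" in body:
        stop = body["stop"]
        # Assumption: a single string or a list of strings is accepted.
        sequences = [stop] if isinstance(stop, str) else stop
        if (
            not isinstance(sequences, list)
            or len(sequences) > 4
            or not all(isinstance(s, str) for s in sequences)
        ):
            errors.append("stop: must be a string or a list of up to 4 strings")
    return errors
```

Passing a body with an extra key such as tools returns no error from this helper; any rejection of that key would come from the provider.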

Response

Successful response (JSON or SSE stream)

id
string
required

A unique identifier for the chat completion.

object
string
required

The object type. Always chat.completion.

Allowed value: "chat.completion"
created
integer
required

The Unix timestamp (in seconds) of when the chat completion was created.

model
string
required

The model used for the chat completion.

choices
object[]
required

A list of chat completion choices.

Minimum array length: 1
usage
object
required
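
As a sketch of how the response fields fit together, the snippet below pulls the generated text and the token counts out of a decoded non-streaming response. The function name summarize and all values in example_response are made up for illustration; the field names follow the response schema shown at the top of this page, and prompt_tokens_details is treated as optional, which is an assumption.

```python
def summarize(response: dict) -> dict:
    """Extract the first choice's text and the usage counters from a response."""
    usage = response["usage"]
    # Assumption: prompt_tokens_details may be absent for some providers.
    details = usage.get("prompt_tokens_details") or {}
    return {
        "text": response["choices"][0]["message"]["content"],
        "prompt_tokens": usage["prompt_tokens"],
        "completion_tokens": usage["completion_tokens"],
        "total_tokens": usage["total_tokens"],
        "cached_tokens": details.get("cached_tokens", 0),
        "cache_write_tokens": details.get("cache_write_tokens", 0),
    }


# Illustrative response with invented values.
example_response = {
    "id": "chatcmpl-abc123",
    "object": "chat.completion",
    "created": 1700000000,
    "model": "gpt-4.1",
    "choices": [{"index": 0, "message": {"content": "Hello!"}}],
    "usage": {
        "prompt_tokens": 12,
        "completion_tokens": 3,
        "total_tokens": 15,
        "prompt_tokens_details": {"cached_tokens": 8, "cache_write_tokens": 0},
    },
}
summary = summarize(example_response)
print(summary["text"], summary["total_tokens"])  # prints "Hello! 15"
```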