# API Endpoints
Full request and response reference for the Mantler API.
Base URL: `https://api.mantler.ai`

All endpoints that accept a request body expect `Content-Type: application/json`.
## GET /health

Gateway health check. No authentication required.

Response:

```json
{ "status": "ok" }
```
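As an illustration, a small Python sketch (the helper name is my own, not part of any official client) that checks whether a `/health` response body reports a healthy gateway:

```python
import json

def is_healthy(body: str) -> bool:
    """Return True when a /health response body reports status "ok"."""
    try:
        return json.loads(body).get("status") == "ok"
    except json.JSONDecodeError:
        return False
```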
## GET /v1/models

List the models available to your API key.

Request:

```http
GET /v1/models
Authorization: Bearer mk_live_YOUR_KEY
```

Response:

```json
{
  "object": "list",
  "data": [
    {
      "id": "llama3.1:8b",
      "object": "model",
      "created": 1713000000,
      "owned_by": "mantler"
    }
  ]
}
```
Each `id` is the model identifier to use in `/v1/chat/completions` requests. Model IDs correspond to the model layer of a deployed mantle.
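For example, a minimal Python sketch (function name is an assumption, not an official SDK) that pulls the usable model IDs out of a `/v1/models` response body:

```python
import json

def list_model_ids(body: str) -> list[str]:
    """Extract model identifiers from a /v1/models response body."""
    payload = json.loads(body)
    return [model["id"] for model in payload.get("data", [])]
```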
## POST /v1/chat/completions

Generate a chat completion. Follows the OpenAI chat completions spec.

Request:

```http
POST /v1/chat/completions
Authorization: Bearer mk_live_YOUR_KEY
Content-Type: application/json

{
  "model": "llama3.1:8b",
  "messages": [
    { "role": "system", "content": "You are a helpful assistant." },
    { "role": "user", "content": "What is the capital of France?" }
  ],
  "stream": false,
  "temperature": 0.7,
  "max_tokens": 512
}
```
| Parameter | Type | Required | Description |
|---|---|---|---|
| `model` | string | Yes | Model ID from `/v1/models` |
| `messages` | array | Yes | Conversation history |
| `stream` | boolean | No | Stream tokens as SSE. Default `false`. |
| `temperature` | number | No | Sampling temperature (0–2). Default 1. |
| `max_tokens` | integer | No | Maximum number of tokens to generate. |
| `top_p` | number | No | Nucleus sampling. Default 1. |
| `stop` | string \| array | No | Stop sequences. |
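A request like the one above can be assembled with the standard library alone. The following Python sketch (structure and names are my own, not an official SDK) builds the HTTP request from the parameters in the table; any optional parameter such as `temperature` or `max_tokens` is passed through as a keyword argument:

```python
import json
import urllib.request

API_BASE = "https://api.mantler.ai"  # base URL from this reference

def build_chat_request(api_key: str, model: str, messages: list[dict],
                       **options) -> urllib.request.Request:
    """Assemble a POST /v1/chat/completions request.

    `options` carries the optional parameters from the table above
    (stream, temperature, max_tokens, top_p, stop).
    """
    body = json.dumps({"model": model, "messages": messages, **options})
    return urllib.request.Request(
        f"{API_BASE}/v1/chat/completions",
        data=body.encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the resulting object to `urllib.request.urlopen` would send the request.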
Non-streaming response:

```json
{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "created": 1713000000,
  "model": "llama3.1:8b",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The capital of France is Paris."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 24,
    "completion_tokens": 9,
    "total_tokens": 33
  }
}
```
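To illustrate reading a non-streaming response, here is a small Python sketch (helper name is my own) that extracts the assistant's reply and the total token usage:

```python
import json

def completion_text(body: str) -> tuple[str, int]:
    """Return (assistant content, total tokens) from a non-streaming
    chat.completion response body."""
    payload = json.loads(body)
    content = payload["choices"][0]["message"]["content"]
    return content, payload["usage"]["total_tokens"]
```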
Streaming response (`"stream": true`):

Returns server-sent events (SSE). Each event is a `data:` line containing a partial completion object. The stream ends with `data: [DONE]`.

```text
data: {"id":"chatcmpl-...","object":"chat.completion.chunk","choices":[{"delta":{"role":"assistant","content":"The"},"index":0}]}

data: {"id":"chatcmpl-...","object":"chat.completion.chunk","choices":[{"delta":{"content":" capital"},"index":0}]}

data: [DONE]
```
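Consuming the stream means accumulating the `delta.content` fragments until the `[DONE]` sentinel. A minimal Python sketch (generator name is my own, and it takes an iterable of already-decoded lines rather than a live socket):

```python
import json

def stream_content(lines):
    """Yield content deltas from SSE 'data:' lines until [DONE]."""
    for line in lines:
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines between events
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            return
        chunk = json.loads(data)
        delta = chunk["choices"][0].get("delta", {})
        if "content" in delta:
            yield delta["content"]
```

Joining the yielded fragments reconstructs the full assistant message.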
## Error responses

The API returns standard HTTP status codes with a JSON error body:

```json
{
  "error": {
    "message": "Invalid API key",
    "type": "authentication_error",
    "code": "invalid_api_key"
  }
}
```
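For example, a small Python sketch (helper name is my own) that pulls the error type and message out of such a body, e.g. for logging or for mapping to application-level exceptions:

```python
import json

def error_details(body: str) -> tuple[str, str]:
    """Return (type, message) from a JSON error response body."""
    err = json.loads(body)["error"]
    return err["type"], err["message"]
```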
| Status | Meaning |
|---|---|
| 401 | Invalid or missing API key |
| 403 | Key does not have access to the requested model |
| 404 | Model not found or not deployed |
| 429 | Rate limit exceeded |
| 503 | Machine offline or runtime unavailable |
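Since 429 and 503 are transient, callers typically retry them. This reference does not specify retry semantics, so the backoff policy below is an assumption; `send` is a hypothetical caller-supplied function returning `(status, body)`:

```python
import time

RETRYABLE = {429, 503}  # rate limited / machine offline

def call_with_retry(send, max_attempts: int = 3, base_delay: float = 1.0):
    """Call send() -> (status, body), retrying retryable statuses
    with exponential backoff (base_delay, 2*base_delay, ...)."""
    for attempt in range(max_attempts):
        status, body = send()
        if status not in RETRYABLE:
            return status, body
        if attempt < max_attempts - 1:
            time.sleep(base_delay * (2 ** attempt))
    return status, body
```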