# OpenAI-Compatible API

Tako exposes its **Answer** and **Agent** products through an **OpenAI-compatible gateway**. Point the official OpenAI SDK at Tako by setting a `base_url`, your Tako API key, and a `model`. There is no SDK swap and no new client, and you get answers grounded in real-time, trusted data alongside the Tako cards and web results that back them.

| Endpoint            | OpenAI interface                          | Models                      | Use case                                                |
| ------------------- | ----------------------------------------- | --------------------------- | ------------------------------------------------------- |
| `/chat/completions` | [Chat Completions API](#chat-completions) | `tako-answer`, `tako-agent` | Single-shot, stateless requests.                        |
| `/responses`        | [Responses API](#responses-api)           | `tako-answer`, `tako-agent` | Event-based; supports stateful `tako-agent` follow-ups. |

<Info>
  This is a wire-compatible gateway over [Answer](/documentation/integrating-tako/guides/answer) and the [Agent](/documentation/integrating-tako/guides/agent), not a chat model. A `tako-answer` request is a stateless, single-shot query; `tako-agent` runs deep research and, on the Responses API, supports stateful follow-ups. A few OpenAI behaviors don't carry over — see [Compatibility](#compatibility).
</Info>

## Overview

Set two things on your OpenAI client:

* **`base_url`** → `https://tako.com/api/openai/v1`
* **`api_key`** → your [Tako API key](https://developer.tako.com/console/api-keys) (sent as the `Authorization: Bearer` header)

Then choose a `model`:

| Model         | Backed by                                               | Best for                                                                         |
| ------------- | ------------------------------------------------------- | -------------------------------------------------------------------------------- |
| `tako-answer` | [Answer](/documentation/integrating-tako/guides/answer) | A fast, synchronous grounded answer.                                             |
| `tako-agent`  | [Agent](/documentation/integrating-tako/guides/agent)   | Deep research over Tako's knowledge graph and the live web (slower — stream it). |

<Tabs>
  <Tab title="Python">
    ```bash theme={null}
    pip install openai
    ```

    ```python theme={null}
    import os
    from openai import OpenAI

    client = OpenAI(
        base_url="https://tako.com/api/openai/v1",
        api_key=os.environ["TAKO_API_KEY"],
    )
    ```
  </Tab>

  <Tab title="cURL">
    ```bash theme={null}
    # Set your key once for the examples below.
    export TAKO_API_KEY=<your tako key>
    ```
  </Tab>
</Tabs>

## Choosing sources

By default the gateway grounds answers in **both Tako's curated knowledge and the live web**. Pass `source_indexes` to restrict that — an array containing `"data"`, `"web"`, or both. It applies to **both APIs and both models**. Since it isn't a native OpenAI field, pass it through `extra_body` with the SDK (or as a top-level field in a raw request).

<CodeGroup>
  ```python Python theme={null}
  completion = client.chat.completions.create(
      model="tako-answer",
      messages=[{"role": "user", "content": "What is the price of silver?"}],
      extra_body={"source_indexes": ["data"]},   # Tako curated data only; omit to use both
  )
  ```

  ```bash cURL theme={null}
  curl https://tako.com/api/openai/v1/chat/completions \
    -H "Authorization: Bearer $TAKO_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
      "model": "tako-answer",
      "messages": [{"role": "user", "content": "What is the price of silver?"}],
      "source_indexes": ["data"]
    }'
  ```
</CodeGroup>

Allowed values are `"data"` and `"web"`. The legacy value `"tako"` is accepted as a synonym for `"data"`. The list must be non-empty — an empty or invalid list returns `400`. On the Responses API, pass `source_indexes` the same way.
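
The rules above can be checked client-side before a request is sent, which avoids a round trip that ends in `400`. The following is a minimal sketch, not part of any Tako SDK: the helper name `normalize_source_indexes` is hypothetical, and de-duplicating the list is a choice made here rather than documented gateway behavior.

```python
ALLOWED_SOURCE_INDEXES = {"data", "web"}
LEGACY_ALIASES = {"tako": "data"}  # legacy synonym described above


def normalize_source_indexes(values):
    """Return a de-duplicated list of valid source indexes, or raise ValueError."""
    if not isinstance(values, (list, tuple)) or not values:
        raise ValueError("source_indexes must be a non-empty list")
    normalized = []
    for value in values:
        # Map the legacy alias to its canonical name; non-strings never match.
        canonical = LEGACY_ALIASES.get(value, value) if isinstance(value, str) else None
        if canonical not in ALLOWED_SOURCE_INDEXES:
            raise ValueError(f"unsupported source index: {value!r}")
        if canonical not in normalized:
            normalized.append(canonical)
    return normalized
```

The result can then be passed through `extra_body`, for example `extra_body={"source_indexes": normalize_source_indexes(["data"])}`.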

## Chat Completions

### Answer

`tako-answer` returns a synthesized answer in one synchronous call — the OpenAI equivalent of [Answer](/documentation/integrating-tako/guides/answer). The prose lands in `message.content`; the Tako cards and web results that back it ride a namespaced [`tako` extension](#the-tako-extension) on the message.

<CodeGroup>
  ```python Python theme={null}
  completion = client.chat.completions.create(
      model="tako-answer",
      messages=[{"role": "user", "content": "What is the price of silver?"}],
  )

  message = completion.choices[0].message
  print(message.content)                 # the grounded answer
  for card in message.tako["cards"]:      # the backing Tako cards
      print(card["title"], card["embed_url"])
  ```

  ```bash cURL theme={null}
  curl https://tako.com/api/openai/v1/chat/completions \
    -H "Authorization: Bearer $TAKO_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
      "model": "tako-answer",
      "messages": [{"role": "user", "content": "What is the price of silver?"}]
    }'
  ```
</CodeGroup>

<Accordion title="Example response">
  A standard `chat.completion`. The answer is in `choices[0].message.content`; the backing cards and web results are in `choices[0].message.tako`.

  ```json theme={null}
  {
    "id": "chatcmpl-2b7a4f0a9c2e7b1a2c3d4e5f",
    "object": "chat.completion",
    "created": 1750000000,
    "model": "tako-answer",
    "choices": [
      {
        "index": 0,
        "message": {
          "role": "assistant",
          "content": "Silver is trading at about $34.80 per troy ounce ...",
          "tako": {
            "cards": [
              {
                "card_id": "KfYeym50vtsF93LMsIFW",
                "title": "Silver Price",
                "description": "Spot price of silver per troy ounce in USD.",
                "embed_url": "https://tako.com/embed/KfYeym50vtsF93LMsIFW/",
                "webpage_url": "https://tako.com/card/KfYeym50vtsF93LMsIFW/",
                "image_url": "https://tako.com/api/v1/get_image/KfYeym50vtsF93LMsIFW/",
                "sources": [
                  {"source_name": "Tako Markets", "source_index": "tako"}
                ]
              }
            ],
            "web_results": [],
            "request_id": "f20f965b-6bbd-40df-b50b-a8861f34df24"
          }
        },
        "finish_reason": "stop"
      }
    ],
    "usage": {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0}
  }
  ```
</Accordion>

### Agent

`tako-agent` runs the [Agent](/documentation/integrating-tako/guides/agent): deep research over Tako's curated knowledge and the live web. A run can take **minutes**, so prefer [streaming](#streaming). A non-streaming call works too, but Tako holds the connection open until the run finishes — and returns `504` (`run_timeout`, carrying the `run_id`) if the run overruns \~290 seconds.

<CodeGroup>
  ```python Python theme={null}
  completion = client.chat.completions.create(
      model="tako-agent",
      messages=[{"role": "user", "content": "Compare Nvidia and AMD revenue since 2015"}],
  )
  print(completion.choices[0].message.content)
  ```

  ```bash cURL theme={null}
  curl https://tako.com/api/openai/v1/chat/completions \
    -H "Authorization: Bearer $TAKO_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
      "model": "tako-agent",
      "messages": [{"role": "user", "content": "Compare Nvidia and AMD revenue since 2015"}]
    }'
  ```
</CodeGroup>

### Streaming

Set `stream=True` to receive the answer as it's produced — recommended for `tako-agent`. The text arrives in `delta.content` chunks; the Tako cards and web results arrive in a single dedicated `delta.tako` chunk near the end of the stream.

<CodeGroup>
  ```python Python theme={null}
  stream = client.chat.completions.create(
      model="tako-agent",
      messages=[{"role": "user", "content": "Compare Nvidia and AMD revenue since 2015"}],
      stream=True,
  )

  cards = []
  for chunk in stream:
      delta = chunk.choices[0].delta
      if delta.content:
          print(delta.content, end="", flush=True)
      tako = getattr(delta, "tako", None)   # the dedicated cards chunk
      if tako:
          cards = tako["cards"]
  ```

  ```bash cURL theme={null}
  curl https://tako.com/api/openai/v1/chat/completions \
    -H "Authorization: Bearer $TAKO_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
      "model": "tako-agent",
      "messages": [{"role": "user", "content": "Compare Nvidia and AMD revenue since 2015"}],
      "stream": true
    }'
  ```
</CodeGroup>

For token-style usage accounting, pass `stream_options={"include_usage": True}` to emit a final usage frame before `[DONE]` (its values are zero — see [below](#compatibility)).
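
With `include_usage` enabled, OpenAI's own final usage chunk carries an empty `choices` array. Assuming Tako's usage frame mirrors that shape (an assumption, not something this page confirms), a loop that indexes `chunk.choices[0]` unconditionally would fail on it. The sketch below guards against that; `consume_chat_stream` and the stand-in chunks are hypothetical and exist only for illustration.

```python
from types import SimpleNamespace


def consume_chat_stream(chunks):
    """Collect text, the tako payload, and usage from Chat Completions chunks."""
    text_parts, tako, usage = [], None, None
    for chunk in chunks:
        if getattr(chunk, "usage", None) is not None:
            usage = chunk.usage
        if not chunk.choices:  # usage-only frame: no delta to read
            continue
        delta = chunk.choices[0].delta
        if getattr(delta, "content", None):
            text_parts.append(delta.content)
        extension = getattr(delta, "tako", None)
        if extension:
            tako = extension
    return "".join(text_parts), tako, usage


# Stand-in chunks; a real stream would come from
# client.chat.completions.create(..., stream=True,
#                                stream_options={"include_usage": True}).
def _chunk(content=None, tako=None, usage=None, with_choice=True):
    delta = SimpleNamespace(content=content)
    if tako is not None:
        delta.tako = tako
    choices = [SimpleNamespace(delta=delta)] if with_choice else []
    return SimpleNamespace(choices=choices, usage=usage)


fake_stream = [
    _chunk(content="Nvidia "),
    _chunk(content="leads."),
    _chunk(tako={"cards": [], "web_results": []}),
    _chunk(usage=SimpleNamespace(total_tokens=0), with_choice=False),
]
text, tako, usage = consume_chat_stream(fake_stream)
```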

<Note>
  If a run fails mid-stream, the gateway emits an error chunk and closes the stream **without** a `[DONE]` marker. Treat a stream that ends without `[DONE]` as a failed run.
</Note>
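
For clients that read the SSE stream directly rather than through the SDK, the rule in the note can be applied by tracking whether `[DONE]` was seen. This is a minimal sketch under stated assumptions: `read_chat_sse` is a hypothetical helper, it expects already-decoded text lines, and it does not inspect the error chunk because this page does not specify its shape.

```python
import json


def read_chat_sse(lines):
    """Parse Chat Completions SSE lines; return (chunks, saw_done)."""
    chunks, saw_done = [], False
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and SSE comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            saw_done = True
            break
        chunks.append(json.loads(payload))
    return chunks, saw_done


# Illustrative inputs: one stream that finishes, one that is cut short.
complete = ['data: {"choices": [{"delta": {"content": "Hi"}}]}', "", "data: [DONE]"]
truncated = ['data: {"choices": [{"delta": {"content": "Hi"}}]}']

_, finished = read_chat_sse(complete)
_, cut_short = read_chat_sse(truncated)
```

A caller would treat `saw_done == False` as a failed run and discard or flag any partial text it had already displayed.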

## Responses API

[OpenAI's Responses API](https://platform.openai.com/docs/api-reference/responses) is the newer, event-based interface — `POST /responses`. Same models, auth, and [`tako` extension](#the-tako-extension); the differences are the request field (**`input`** instead of `messages`), the response shape (an **`output`** array), and — for `tako-agent` — **stateful follow-ups**, **background** runs, and a richer streaming event set.

`input` accepts either a plain string or the OpenAI items array; Tako uses the **last `user` turn** as the query. `instructions` is accepted but not applied.
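
To see which text a given `input` would contribute as the query, the selection can be approximated client-side. The sketch below is only an approximation of the behavior described above, not the gateway's implementation: `last_user_text` is a hypothetical helper, and the handling of `input_text` content parts follows the OpenAI items format rather than anything Tako documents here.

```python
def last_user_text(input_value):
    """Return the text of the last user turn, or None if there is none."""
    if isinstance(input_value, str):
        return input_value  # a plain string is the query itself
    for item in reversed(input_value):
        if item.get("role") != "user":
            continue
        content = item.get("content")
        if isinstance(content, str):
            return content
        # Content-part form: keep only the text parts.
        return "".join(
            part.get("text", "")
            for part in (content or [])
            if part.get("type") == "input_text"
        )
    return None
```

For example, given two user turns separated by an assistant turn, only the second user turn's text is returned; earlier turns do not affect the result.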

### Answer

`tako-answer` is synchronous — one request, one grounded answer. The text is in `output[].content[].text` (the SDK aggregates it into `response.output_text`); cards and web results ride the top-level `response.tako`.

<CodeGroup>
  ```python Python theme={null}
  response = client.responses.create(
      model="tako-answer",
      input="What is the price of silver?",
  )

  print(response.output_text)             # the grounded answer
  for card in response.tako["cards"]:      # the backing Tako cards
      print(card["title"], card["embed_url"])
  ```

  ```bash cURL theme={null}
  curl https://tako.com/api/openai/v1/responses \
    -H "Authorization: Bearer $TAKO_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
      "model": "tako-answer",
      "input": "What is the price of silver?"
    }'
  ```
</CodeGroup>

<Accordion title="Example response">
  A standard `response` object. The text is in `output[0].content[0].text`; the backing cards and web results are in the top-level `tako` key.

  ```json theme={null}
  {
    "id": "resp_8f0a9c2e7b1a2c3d",
    "object": "response",
    "created_at": 1750000000,
    "model": "tako-answer",
    "status": "completed",
    "output": [
      {
        "type": "message",
        "id": "msg_8f0a9c2e7b1a2c3d",
        "role": "assistant",
        "status": "completed",
        "content": [
          {"type": "output_text", "text": "Silver is trading at about $34.80 per troy ounce ...", "annotations": []}
        ]
      }
    ],
    "usage": {"input_tokens": 0, "output_tokens": 0, "total_tokens": 0},
    "tako": {
      "cards": [
        {
          "card_id": "KfYeym50vtsF93LMsIFW",
          "title": "Silver Price",
          "embed_url": "https://tako.com/embed/KfYeym50vtsF93LMsIFW/",
          "webpage_url": "https://tako.com/card/KfYeym50vtsF93LMsIFW/",
          "image_url": "https://tako.com/api/v1/get_image/KfYeym50vtsF93LMsIFW/"
        }
      ],
      "web_results": [],
      "request_id": "f20f965b-6bbd-40df-b50b-a8861f34df24"
    }
  }
  ```
</Accordion>

### Agent

`tako-agent` runs deep research, which can take minutes. You have three ways to run it:

* **Stream** it (`stream=True`) — recommended; see [streaming events](#streaming-events) below.
* Run it in the **background** (`background=True`) and poll `GET /responses/{id}`.
* Call it **synchronously** — Tako drains the run inline and returns `504` (`run_timeout`, carrying the response id) past \~290 seconds.

`tako-agent` responses are **stateful**: pass a prior response's id as `previous_response_id` to continue the thread. (Both `previous_response_id` and `background` are rejected with `400` for `tako-answer`.)

<CodeGroup>
  ```python Python theme={null}
  import time

  # Background run, then poll until it leaves the queued/in_progress states.
  response = client.responses.create(
      model="tako-agent",
      input="Compare Nvidia and AMD revenue since 2015",
      background=True,
  )
  while response.status in ("queued", "in_progress"):
      time.sleep(2)   # polling interval is arbitrary
      response = client.responses.retrieve(response.id)

  # Stateful follow-up.
  followup = client.responses.create(
      model="tako-agent",
      input="Now add Intel to that comparison",
      previous_response_id=response.id,
  )
  ```

  ```bash cURL theme={null}
  curl https://tako.com/api/openai/v1/responses \
    -H "Authorization: Bearer $TAKO_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
      "model": "tako-agent",
      "input": "Compare Nvidia and AMD revenue since 2015",
      "background": true
    }'
  ```
</CodeGroup>

### Retrieving a response

`GET /responses/{id}` fetches a stored `tako-agent` response, which is how you poll a background run. Only agent response ids are retrievable: `tako-answer` responses are stateless and aren't stored. Access is owner-scoped. Add `stream=true` (or an SSE `Accept` header) to **replay** the run's event stream from the start.

```python Python theme={null}
response = client.responses.retrieve("resp_...")
print(response.status, response.output_text)
```

### Streaming events

The Responses stream uses **named SSE events** and — unlike Chat Completions — has **no `[DONE]` sentinel**. Terminate on `response.completed` or `response.failed`. Answer text arrives in `response.output_text.delta` events; `tako-agent` narration streams first as `response.reasoning_summary_text.delta` events. The **Tako cards and web results arrive once, on the final `response.completed` event** (in `response.tako`) — they are not streamed incrementally.

<CodeGroup>
  ```python Python theme={null}
  stream = client.responses.create(
      model="tako-agent",
      input="Compare Nvidia and AMD revenue since 2015",
      stream=True,
  )

  cards = []
  for event in stream:
      if event.type == "response.output_text.delta":
          print(event.delta, end="", flush=True)
      elif event.type == "response.completed":
          cards = event.response.tako["cards"]   # cards arrive once, at the end
  ```

  ```bash cURL theme={null}
  curl https://tako.com/api/openai/v1/responses \
    -H "Authorization: Bearer $TAKO_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
      "model": "tako-agent",
      "input": "Compare Nvidia and AMD revenue since 2015",
      "stream": true
    }'
  ```
</CodeGroup>

<Note>
  The event order is `response.created` → `response.output_item.added` → `response.content_part.added` → `response.output_text.delta` (repeated) → `response.output_text.done` → `response.content_part.done` → `response.output_item.done` → `response.completed`. A failed run emits a single `response.failed` event (carrying `error`) and **no** `response.completed`.
</Note>

## The `tako` extension

Tako's value-add — the **cards** (interactive charts with their underlying data), the **web results**, and the **`request_id`** — rides on a namespaced `tako` object, never inside the text:

* **Chat Completions:** `choices[0].message.tako` (non-streaming), or a dedicated `choices[0].delta.tako` chunk (streaming).
* **Responses:** the top-level `response.tako` (non-streaming), or the `response.completed` event's `response.tako` (streaming).

The official OpenAI SDK preserves unknown response fields, so `tako` is available with no custom client. Each card carries `card_id`, `title`, `description`, `embed_url`, `image_url`, `webpage_url`, and `sources`, which is everything you need to embed the interactive chart or cite the data. See [Knowledge Cards](/documentation/getting-started/what-is-tako/knowledge-cards).

```json theme={null}
{
  "cards": [
    {"title": "Silver Price", "embed_url": "https://tako.com/embed/...", "webpage_url": "https://tako.com/card/...", "sources": []}
  ],
  "web_results": [
    {"title": "...", "url": "https://..."}
  ],
  "request_id": "f20f965b-6bbd-40df-b50b-a8861f34df24"
}
```
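
One way to use this payload is to turn it into a citation list shown under the answer text. The sketch below assumes the `tako` object has been read as a plain dict with the fields shown above; `cards_to_markdown` is a hypothetical helper, and the Markdown layout is an arbitrary choice.

```python
def cards_to_markdown(tako):
    """Render cards and web results from a tako payload as Markdown bullets."""
    payload = tako or {}
    lines = []
    for card in payload.get("cards", []):
        title = card.get("title", "Untitled card")
        link = card.get("webpage_url") or card.get("embed_url")
        names = [
            source.get("source_name")
            for source in card.get("sources") or []
            if source.get("source_name")
        ]
        suffix = f" (source: {', '.join(names)})" if names else ""
        if link:
            lines.append(f"- [{title}]({link}){suffix}")
        else:
            lines.append(f"- {title}{suffix}")
    for result in payload.get("web_results", []):
        url = result.get("url")
        lines.append(f"- [{result.get('title', url)}]({url})")
    return "\n".join(lines)
```

With the SDK examples earlier on this page, the input would be `message.tako` for Chat Completions or `response.tako` for the Responses API.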

## Listing models

`GET /models` lists the available Tako models:

<CodeGroup>
  ```python Python theme={null}
  for model in client.models.list():
      print(model.id)   # tako-answer, tako-agent
  ```

  ```bash cURL theme={null}
  curl https://tako.com/api/openai/v1/models \
    -H "Authorization: Bearer $TAKO_API_KEY"
  ```
</CodeGroup>

## Compatibility

The gateway maps OpenAI's APIs onto Tako's query pipeline, so a few OpenAI behaviors don't carry over:

* **Chat Completions is single-shot and stateless.** Only the **last `user` message** becomes the Tako query; system messages and prior turns are ignored. For stateful, multi-turn agent conversations, use the **Responses API** (`previous_response_id`, `tako-agent` only) or the native [Agent API](/documentation/integrating-tako/guides/agent).
* **`source_indexes` is respected** — see [Choosing sources](#choosing-sources). Other request knobs are accepted for SDK compatibility and **ignored**: `temperature`, `top_p`, `tools`, `tool_choice`, `response_format`, and (on Responses) `instructions` and `store`. `n > 1` is rejected.
* **`usage` is always zero.** Tako bills per request, not per token — see [pricing](https://tako.com/pricing).
* **The Responses stream has no `[DONE]` marker.** Terminate on `response.completed` or `response.failed`.
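
When migrating existing OpenAI code, it can help to flag request fields that the gateway will silently ignore. The following sketch encodes only the fields named in the list above; `check_request` is a hypothetical helper, and the field sets would need updating if the gateway's behavior changes.

```python
IGNORED_EVERYWHERE = {"temperature", "top_p", "tools", "tool_choice", "response_format"}
IGNORED_ON_RESPONSES = {"instructions", "store"}


def check_request(params, api="chat"):
    """Return sorted names of fields the gateway ignores; raise if n > 1."""
    if (params.get("n") or 1) > 1:
        raise ValueError("n > 1 is rejected by the gateway")
    ignored = set(IGNORED_EVERYWHERE)
    if api == "responses":
        ignored |= IGNORED_ON_RESPONSES
    return sorted(name for name in params if name in ignored)
```

Calling `check_request({"model": "tako-answer", "temperature": 0.2})` returns `["temperature"]`, which a migration script could log as a warning instead of failing.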

<Tip>
  Want Tako *inside* an existing OpenAI agent instead of as the model itself? See [OpenAI Tool Calling](/documentation/integrations/openai-tool-calling) to register Tako as a function the model can call across turns.
</Tip>
