Tako exposes its Answer and Agent products through an OpenAI-compatible gateway. Point the official OpenAI SDK at Tako with just a base_url and a model — no SDK swap, no new client — and get answers grounded in real-time, trusted data alongside the Tako cards and web results that back them.
Endpoint | OpenAI interface | Models | Use case
/chat/completions | Chat Completions API | tako-answer, tako-agent | Single-shot, stateless requests.
/responses | Responses API | tako-answer, tako-agent | Event-based; supports stateful tako-agent follow-ups.
This is a wire-compatible gateway over Answer and the Agent, not a chat model. A tako-answer request is a stateless, single-shot query; tako-agent runs deep research and, on the Responses API, supports stateful follow-ups. A few OpenAI behaviors don’t carry over — see Compatibility.

Overview

Set two things on your OpenAI client:
  • base_url → https://tako.com/api/openai/v1
  • api_key → your Tako API key (sent as the Authorization: Bearer header)
Then choose a model:
Model | Backed by | Best for
tako-answer | Answer | A fast, synchronous grounded answer.
tako-agent | Agent | Deep research over Tako’s knowledge graph and the live web (slower; stream it).
pip install openai
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://tako.com/api/openai/v1",
    api_key=os.environ["TAKO_API_KEY"],
)

Choosing sources

By default the gateway grounds answers in both Tako’s curated knowledge and the live web. Pass source_indexes to restrict that — an array containing "data", "web", or both. It applies to both APIs and both models. Since it isn’t a native OpenAI field, pass it through extra_body with the SDK (or as a top-level field in a raw request).
completion = client.chat.completions.create(
    model="tako-answer",
    messages=[{"role": "user", "content": "What is the price of silver?"}],
    extra_body={"source_indexes": ["data"]},   # Tako curated data only; omit to use both
)
Allowed values are "data" and "web". The legacy value "tako" is accepted as a synonym for "data". The list must be non-empty — an empty or invalid list returns 400. On the Responses API, pass source_indexes the same way.
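The validation rules above can be mirrored on the client so that a bad list fails before a request is sent. The following is a minimal sketch, not part of any Tako SDK: normalize_source_indexes and build_chat_payload are hypothetical helper names, and de-duplicating the list is a choice made here rather than documented gateway behavior. The gateway remains the authority on what it accepts.

```python
ALLOWED_SOURCES = {"data", "web"}
LEGACY_SYNONYMS = {"tako": "data"}  # legacy value, accepted as a synonym for "data"


def normalize_source_indexes(values):
    """Return a validated, de-duplicated list of source indexes.

    Raises ValueError in the cases the gateway is described as rejecting
    with 400: an empty list or an unknown value.
    """
    if not isinstance(values, (list, tuple)) or not values:
        raise ValueError("source_indexes must be a non-empty list")
    normalized = []
    for value in values:
        if not isinstance(value, str):
            raise ValueError(f"invalid source index: {value!r}")
        value = LEGACY_SYNONYMS.get(value, value)
        if value not in ALLOWED_SOURCES:
            raise ValueError(f"invalid source index: {value!r}")
        if value not in normalized:  # de-duplication is a local choice
            normalized.append(value)
    return normalized


def build_chat_payload(question, source_indexes=None, model="tako-answer"):
    """Build a raw /chat/completions JSON body.

    In a raw request, source_indexes is a top-level field; omitting it
    leaves the gateway default (both sources) in place.
    """
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": question}],
    }
    if source_indexes is not None:
        payload["source_indexes"] = normalize_source_indexes(source_indexes)
    return payload
```

With the SDK, the same validated list can be passed as extra_body={"source_indexes": normalize_source_indexes(["web"])}.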

Chat Completions

Answer

tako-answer returns a synthesized answer in one synchronous call — the OpenAI equivalent of Answer. The prose lands in message.content; the Tako cards and web results that back it ride a namespaced tako extension on the message.
completion = client.chat.completions.create(
    model="tako-answer",
    messages=[{"role": "user", "content": "What is the price of silver?"}],
)

message = completion.choices[0].message
print(message.content)                 # the grounded answer
for card in message.tako["cards"]:      # the backing Tako cards
    print(card["title"], card["embed_url"])
A standard chat.completion. The answer is in choices[0].message.content; the backing cards and web results are in choices[0].message.tako.
{
  "id": "chatcmpl-2b7a4f0a9c2e7b1a2c3d4e5f",
  "object": "chat.completion",
  "created": 1750000000,
  "model": "tako-answer",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Silver is trading at about $34.80 per troy ounce ...",
        "tako": {
          "cards": [
            {
              "card_id": "KfYeym50vtsF93LMsIFW",
              "title": "Silver Price",
              "description": "Spot price of silver per troy ounce in USD.",
              "embed_url": "https://tako.com/embed/KfYeym50vtsF93LMsIFW/",
              "webpage_url": "https://tako.com/card/KfYeym50vtsF93LMsIFW/",
              "image_url": "https://tako.com/api/v1/get_image/KfYeym50vtsF93LMsIFW/",
              "sources": [
                {"source_name": "Tako Markets", "source_index": "tako"}
              ]
            }
          ],
          "web_results": [],
          "request_id": "f20f965b-6bbd-40df-b50b-a8861f34df24"
        }
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0}
}

Agent

tako-agent runs the Agent: deep research over Tako’s curated knowledge and the live web. A run can take minutes, so prefer streaming. A non-streaming call works too, but Tako holds the connection open until the run finishes — and returns 504 (run_timeout, carrying the run_id) if the run overruns ~290 seconds.
completion = client.chat.completions.create(
    model="tako-agent",
    messages=[{"role": "user", "content": "Compare Nvidia and AMD revenue since 2015"}],
)
print(completion.choices[0].message.content)
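One thing to watch with the non-streaming call: the official OpenAI Python SDK retries 5xx responses by default, so a 504 run_timeout could silently start the research run again. Below is a hedged sketch of a single-attempt call. run_agent_once is a hypothetical helper, the 300-second client timeout is an assumption chosen to sit just above the ~290-second server limit, and the error is matched on its status_code attribute rather than on a specific exception class.

```python
def run_agent_once(client, question, timeout_seconds=300.0):
    """Call tako-agent once, without SDK retries.

    Returns (content, None) on success, or (None, error) if the gateway
    answered 504. Any other error is re-raised unchanged.
    """
    # with_options returns a copy of the client with per-call overrides.
    one_shot = client.with_options(max_retries=0, timeout=timeout_seconds)
    try:
        completion = one_shot.chat.completions.create(
            model="tako-agent",
            messages=[{"role": "user", "content": question}],
        )
    except Exception as err:
        # The SDK's HTTP status errors expose the code as .status_code.
        if getattr(err, "status_code", None) == 504:
            return None, err
        raise
    return completion.choices[0].message.content, None
```

The returned error is handed back as-is so the caller can inspect its body for the run_id; the exact layout of that body is not specified here, so this sketch does not parse it.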

Streaming

Set stream=True to receive the answer as it’s produced — recommended for tako-agent. The text arrives in delta.content chunks; the Tako cards and web results arrive in a single dedicated delta.tako chunk near the end of the stream.
stream = client.chat.completions.create(
    model="tako-agent",
    messages=[{"role": "user", "content": "Compare Nvidia and AMD revenue since 2015"}],
    stream=True,
)

cards = []
for chunk in stream:
    delta = chunk.choices[0].delta
    if delta.content:
        print(delta.content, end="", flush=True)
    tako = getattr(delta, "tako", None)   # the dedicated cards chunk
    if tako:
        cards = tako["cards"]
For token-style usage accounting, pass stream_options={"include_usage": True} to emit a final usage frame before [DONE] (its values are zero — see below).
If a run fails mid-stream, the gateway emits an error chunk and closes the stream without a [DONE] marker. Treat a stream that ends without [DONE] as a failed run.
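When reading the stream as raw server-sent events rather than through the SDK, the [DONE] rule can be checked directly. This is a minimal sketch under stated assumptions: read_chat_stream is a hypothetical helper, each event is assumed to arrive as a single "data:" line holding a JSON chunk, and the shape of the error chunk is not relied on. A stream is reported as complete only if [DONE] was seen.

```python
import json


def read_chat_stream(lines):
    """Consume raw SSE lines from a streaming /chat/completions call.

    Returns (text, tako, completed). completed is True only if the
    [DONE] marker was seen; otherwise treat the run as failed.
    """
    text_parts = []
    tako = None
    completed = False
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            completed = True
            break
        chunk = json.loads(data)
        # Usage frames and error chunks may carry no choices; skip them.
        for choice in chunk.get("choices") or []:
            delta = choice.get("delta") or {}
            if delta.get("content"):
                text_parts.append(delta["content"])
            if delta.get("tako"):
                tako = delta["tako"]  # the single dedicated cards chunk
    return "".join(text_parts), tako, completed
```

Any iterable of lines works as input, for example the line iterator of an HTTP client's streaming response.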

Responses API

OpenAI’s Responses API is the newer, event-based interface: POST /responses. It uses the same models, auth, and tako extension. The differences are the request field (input instead of messages), the response shape (an output array), and, for tako-agent, stateful follow-ups, background runs, and a richer streaming event set. input accepts either a plain string or the OpenAI items array; Tako uses the last user turn as the query. instructions is accepted but not applied.

Answer

tako-answer is synchronous — one request, one grounded answer. The text is in output[].content[].text (the SDK aggregates it into response.output_text); cards and web results ride the top-level response.tako.
response = client.responses.create(
    model="tako-answer",
    input="What is the price of silver?",
)

print(response.output_text)             # the grounded answer
for card in response.tako["cards"]:      # the backing Tako cards
    print(card["title"], card["embed_url"])
A standard response object. The text is in output[0].content[0].text; the backing cards and web results are in the top-level tako key.
{
  "id": "resp_8f0a9c2e7b1a2c3d",
  "object": "response",
  "created_at": 1750000000,
  "model": "tako-answer",
  "status": "completed",
  "output": [
    {
      "type": "message",
      "id": "msg_8f0a9c2e7b1a2c3d",
      "role": "assistant",
      "status": "completed",
      "content": [
        {"type": "output_text", "text": "Silver is trading at about $34.80 per troy ounce ...", "annotations": []}
      ]
    }
  ],
  "usage": {"input_tokens": 0, "output_tokens": 0, "total_tokens": 0},
  "tako": {
    "cards": [
      {
        "card_id": "KfYeym50vtsF93LMsIFW",
        "title": "Silver Price",
        "embed_url": "https://tako.com/embed/KfYeym50vtsF93LMsIFW/",
        "webpage_url": "https://tako.com/card/KfYeym50vtsF93LMsIFW/",
        "image_url": "https://tako.com/api/v1/get_image/KfYeym50vtsF93LMsIFW/"
      }
    ],
    "web_results": [],
    "request_id": "f20f965b-6bbd-40df-b50b-a8861f34df24"
  }
}
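If you call the endpoint over raw HTTP instead of through the SDK, there is no output_text convenience property, so the text has to be collected from the output array. The sketch below shows one way to do that on a parsed JSON body. collect_output_text is a hypothetical helper name, and it assumes only the message and output_text shapes shown in the sample above.

```python
def collect_output_text(response):
    """Join every output_text part across the message items of a response dict."""
    parts = []
    for item in response.get("output") or []:
        if item.get("type") != "message":
            continue  # ignore any non-message output items
        for part in item.get("content") or []:
            if part.get("type") == "output_text":
                parts.append(part.get("text", ""))
    return "".join(parts)
```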

Agent

tako-agent runs deep research, which can take minutes. You have three ways to run it:
  • Stream it (stream=True) — recommended; see streaming events below.
  • Run it in the background (background=True) and poll GET /responses/{id}.
  • Call it synchronously — Tako drains the run inline and returns 504 (run_timeout, carrying the response id) past ~290 seconds.
tako-agent responses are stateful: pass a prior response’s id as previous_response_id to continue the thread. (Both previous_response_id and background are rejected with 400 for tako-answer.)
# Background run, then poll.
response = client.responses.create(
    model="tako-agent",
    input="Compare Nvidia and AMD revenue since 2015",
    background=True,
)
response = client.responses.retrieve(response.id)   # queued -> in_progress -> completed

# Stateful follow-up.
followup = client.responses.create(
    model="tako-agent",
    input="Now add Intel to that comparison",
    previous_response_id=response.id,
)
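A single retrieve call only returns the status at that moment, so a background run generally needs a polling loop. Here is a minimal sketch: wait_for_response is a hypothetical helper, the 2-second interval and 600-second budget are arbitrary assumptions, and any status other than queued or in_progress is treated as terminal.

```python
import time

PENDING_STATUSES = {"queued", "in_progress"}


def wait_for_response(retrieve, response_id, interval=2.0, timeout=600.0,
                      sleep=time.sleep):
    """Poll retrieve(response_id) until the run leaves the pending states.

    retrieve is any callable returning an object with a .status attribute,
    for example client.responses.retrieve. Raises TimeoutError if the run
    is still pending once the time budget is spent.
    """
    deadline = time.monotonic() + timeout
    while True:
        response = retrieve(response_id)
        if response.status not in PENDING_STATUSES:
            return response  # completed, failed, or another terminal state
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{response_id} still {response.status}")
        sleep(interval)
```

Typical use would be final = wait_for_response(client.responses.retrieve, response.id), followed by a check of final.status before reading final.output_text.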

Retrieving a response

GET /responses/{id} fetches a stored tako-agent response — useful for polling a background run (answers are stateless and aren’t stored, so only agent ids are retrievable; access is owner-scoped). Add stream=true (or an SSE Accept header) to replay the run’s event stream from the start.
response = client.responses.retrieve("resp_...")
print(response.status, response.output_text)

Streaming events

The Responses stream uses named SSE events and — unlike Chat Completions — has no [DONE] sentinel. Terminate on response.completed or response.failed. Answer text arrives in response.output_text.delta events; tako-agent narration streams first as response.reasoning_summary_text.delta events. The Tako cards and web results arrive once, on the final response.completed event (in response.tako) — they are not streamed incrementally.
stream = client.responses.create(
    model="tako-agent",
    input="Compare Nvidia and AMD revenue since 2015",
    stream=True,
)

cards = []
for event in stream:
    if event.type == "response.output_text.delta":
        print(event.delta, end="", flush=True)
    elif event.type == "response.completed":
        cards = event.response.tako["cards"]   # cards arrive once, at the end
The event order is response.created → response.output_item.added → response.content_part.added → response.output_text.delta (repeated) → response.output_text.done → response.content_part.done → response.output_item.done → response.completed. A failed run emits a single response.failed event (carrying error) and no response.completed.
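Because this stream has no [DONE] sentinel, a consumer has to stop on one of the two terminal events and decide what to do if neither arrives. The following sketch handles all three cases. consume_response_stream is a hypothetical helper, and where the error lives on a response.failed event (on the event itself or on its response) is an assumption, so both places are checked.

```python
def consume_response_stream(events):
    """Drain a Responses event stream into a summary dict.

    Stops on response.completed or response.failed. A stream that ends
    without either is reported as a failure.
    """
    text_parts, narration_parts = [], []

    def summary(ok, tako=None, error=None):
        return {
            "ok": ok,
            "text": "".join(text_parts),
            "narration": "".join(narration_parts),
            "tako": tako or {},
            "error": error,
        }

    for event in events:
        if event.type == "response.output_text.delta":
            text_parts.append(event.delta)
        elif event.type == "response.reasoning_summary_text.delta":
            narration_parts.append(event.delta)  # tako-agent narration
        elif event.type == "response.completed":
            # Cards and web results arrive once, on this final event.
            return summary(True, tako=getattr(event.response, "tako", None))
        elif event.type == "response.failed":
            error = getattr(event, "error", None) or getattr(
                getattr(event, "response", None), "error", None
            )
            return summary(False, error=error)
    return summary(False, error="stream ended without a terminal event")
```

Passing the object returned by client.responses.create(..., stream=True) as events should work, since the loop only reads the type, delta, and response attributes.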

The tako extension

Tako’s value-add — the cards (interactive charts with their underlying data), the web results, and the request_id — rides on a namespaced tako object, never inside the text:
  • Chat Completions: choices[0].message.tako (non-streaming), or a dedicated choices[0].delta.tako chunk (streaming).
  • Responses: the top-level response.tako (non-streaming), or the response.completed event’s response.tako (streaming).
The official OpenAI SDK preserves unknown response fields, so tako is available with no custom client. Each card carries title, description, embed_url, image_url, webpage_url, and sources — everything you need to embed the interactive chart or cite the data. See Knowledge Cards.
{
  "cards": [
    {"title": "Silver Price", "embed_url": "https://tako.com/embed/...", "webpage_url": "https://tako.com/card/...", "sources": []}
  ],
  "web_results": [
    {"title": "...", "url": "https://..."}
  ],
  "request_id": "f20f965b-6bbd-40df-b50b-a8861f34df24"
}
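Since the tako object sits in a different place on each API, a small accessor can hide that difference from the rest of an application. This is a sketch operating on plain dicts (a parsed JSON body, or an SDK object converted with model_dump()). extract_tako and citation_lines are hypothetical helper names, and the "title: url" citation format is only an illustration.

```python
def extract_tako(payload):
    """Return the tako extension from a chat.completion or response dict."""
    if payload.get("object") == "chat.completion":
        choices = payload.get("choices") or [{}]
        message = choices[0].get("message") or {}
        return message.get("tako") or {}
    # Responses API: tako is a top-level key.
    return payload.get("tako") or {}


def citation_lines(tako):
    """Format cards and web results as simple 'title: url' lines."""
    lines = []
    for card in tako.get("cards") or []:
        lines.append(f"{card.get('title', 'Untitled')}: {card.get('webpage_url', '')}")
    for result in tako.get("web_results") or []:
        lines.append(f"{result.get('title', 'Untitled')}: {result.get('url', '')}")
    return lines
```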

Listing models

GET /models lists the available Tako models:
for model in client.models.list():
    print(model.id)   # tako-answer, tako-agent

Compatibility

The gateway maps OpenAI’s APIs onto Tako’s query pipeline, so a few OpenAI behaviors don’t carry over:
  • Chat Completions is single-shot and stateless. Only the last user message becomes the Tako query; system messages and prior turns are ignored. For stateful, multi-turn agent conversations, use the Responses API (previous_response_id, tako-agent only) or the native Agent API.
  • source_indexes is respected — see Choosing sources. Other request knobs are accepted for SDK compatibility and ignored: temperature, top_p, tools, tool_choice, response_format, and (on Responses) instructions and store. n > 1 is rejected.
  • usage is always zero. Tako bills per request, not per token — see pricing.
  • The Responses stream has no [DONE] marker. Terminate on response.completed or response.failed.
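To see what the first rule means in practice, it can help to compute locally which message would become the query. The sketch below picks the last user message from a messages list. effective_query is a hypothetical helper that only illustrates the documented rule; it is not the gateway's code, and it returns the content unchanged without interpreting structured content parts.

```python
def effective_query(messages):
    """Return the content of the last user message in a Chat Completions list.

    System messages, assistant messages, and earlier user turns are
    skipped, matching the single-shot rule described above.
    """
    for message in reversed(messages):
        if message.get("role") == "user":
            return message.get("content")
    raise ValueError("messages contains no user message")
```

For example, a list holding a system prompt, an earlier user turn about gold, an assistant reply, and a final user turn about silver yields only the silver question, so any context the query depends on has to be written into that final message.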
Want Tako inside an existing OpenAI agent instead of as the model itself? See OpenAI Tool Calling to register Tako as a function the model can call across turns.