Tracked LLM Calls

Use kitaru.llm() with model aliases, transported runtime config, and optional secret-backed credentials

kitaru.llm() lets you make a single tracked model call with automatic:

  • prompt artifact capture
  • response artifact capture
  • usage/cost/latency metadata logging

If you want the full setup path from stored credentials to an actual flow run, start with Secrets + Model Registration.

Model selection order

When you call kitaru.llm(), Kitaru resolves the model in this order:

  1. the explicit model= argument
  2. KITARU_DEFAULT_MODEL
  3. the default alias from the effective model registry in the current environment

If KITARU_DEFAULT_MODEL matches a registered alias, Kitaru resolves that alias. Otherwise it treats the value as a raw LiteLLM model string.

When you submit or replay a flow, Kitaru automatically transports your local model registry into the execution environment. That means remote runs can still resolve aliases in kitaru.llm() and list them with kitaru model list. If KITARU_MODEL_REGISTRY is already set in the runtime environment, its aliases and default alias take precedence over matching local entries.
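The precedence between the transported local registry and a runtime KITARU_MODEL_REGISTRY behaves like a dict merge in which runtime entries win. A hypothetical sketch of that precedence, not Kitaru's internals:

```python
def effective_registry(local: dict[str, str],
                       runtime: dict[str, str]) -> dict[str, str]:
    """Sketch: runtime registry entries override matching local entries."""
    merged = dict(local)    # transported local snapshot as the base
    merged.update(runtime)  # KITARU_MODEL_REGISTRY entries take precedence
    return merged

local = {"fast": "openai/gpt-4o-mini", "smart": "openai/gpt-4o"}
runtime = {"fast": "anthropic/claude-3-5-haiku-latest"}
print(effective_registry(local, runtime))
```

Aliases that exist only locally survive the merge; aliases defined in both places resolve to the runtime value.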

Register a model alias

kitaru model register fast --model openai/gpt-4o-mini --secret openai-creds

You can also register an alias without a linked secret:

kitaru model register fast --model openai/gpt-4o-mini

List aliases with:

kitaru model list

kitaru model register writes aliases to local Kitaru config, but submitted and replayed runs automatically receive that registry as a transported runtime snapshot. KITARU_MODEL_REGISTRY is available as an advanced manual override for adding aliases or overriding matching ones.

Credential resolution order

For known providers such as OpenAI, Anthropic, and Gemini, Kitaru resolves credentials in this order:

  1. provider credentials already present in the environment
  2. the secret linked to the resolved alias

If neither is available, the call fails with a setup error.

That means environment variables win over a linked secret for known providers.

Environment-backed setup

export OPENAI_API_KEY=sk-...

Secret-backed setup

Store provider keys in a Kitaru secret:

kitaru secrets set openai-creds --OPENAI_API_KEY=sk-...

When an alias includes --secret openai-creds, kitaru.llm() loads that secret at runtime if the required environment variable is not already set.

Call kitaru.llm() inside a flow

from kitaru import flow
import kitaru

@flow
def writer(topic: str) -> str:
    outline = kitaru.llm(
        f"Create a 3-bullet outline about {topic}.",
        model="fast",
        name="outline_call",
    )
    return kitaru.llm(
        f"Write a short paragraph using this outline:\n{outline}",
        model="fast",
        name="draft_call",
    )

Advanced options

kitaru.llm() also accepts system=, temperature=, and max_tokens=:

reply = kitaru.llm(
    "Summarize this document in 3 bullets.",
    model="fast",
    system="You are a concise technical editor.",
    temperature=0.2,
    max_tokens=200,
    name="summary_call",
)

Chat-style message lists

Instead of a plain string, you can pass a chat-style message list:

reply = kitaru.llm(
    [
        {"role": "user", "content": "Draft a release note headline."},
        {"role": "assistant", "content": "Kitaru adds durable replay controls."},
        {"role": "user", "content": "Now make it shorter."},
    ],
    model="fast",
    name="headline_refine",
)

Each message must include role and content keys. If system= is provided alongside a message list, Kitaru prepends a system message automatically.
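How system= combines with either input form can be pictured as a normalization step. A hypothetical sketch of the documented behavior, not Kitaru's code:

```python
def normalize_messages(prompt, system=None):
    """Sketch: turn a string or message list into the final chat messages."""
    if isinstance(prompt, str):
        # A plain string becomes a single user message
        messages = [{"role": "user", "content": prompt}]
    else:
        messages = list(prompt)
    if system is not None:
        # system= is prepended as a system message
        messages = [{"role": "system", "content": system}] + messages
    return messages
```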

When to use kitaru.llm() vs your own client

kitaru.llm() is designed for simple text-in/text-out model calls. It handles credential resolution, prompt/response capture, and cost tracking automatically.

For advanced patterns — tool calling, structured outputs, streaming, vision inputs, or multi-turn conversation management — use your LLM client directly inside a @checkpoint. You still get durable checkpointing and replay; you just manage the model interaction yourself:

from litellm import completion
from kitaru import checkpoint

@checkpoint
def agent_step(messages: list[dict]) -> str:
    resp = completion(
        model="openai/gpt-4o-mini",
        messages=messages,
        tools=[...],  # tool calling, structured output, etc.
    )
    return resp.choices[0].message.content

For a full example of a tool-calling agent built this way, see examples/coding_agent/.

Tool calling and structured output support for kitaru.llm() is on the roadmap. For now, use your preferred LLM client inside checkpoints for these patterns.

Runtime behavior by context

  • Inside a flow (outside checkpoints): kitaru.llm() runs as a synthetic durable call boundary.
  • Inside a checkpoint: it is tracked as a child event; the enclosing checkpoint remains the replay boundary.

What Kitaru records

Each kitaru.llm() call records:

  • prompt artifacts
  • response artifacts
  • token usage
  • latency
  • cost metadata when available
  • credential source metadata (environment or secret)

Example in this repository

uv sync --extra local

# Register an alias (with or without a linked secret) before running the example.
uv run kitaru model register fast --model openai/gpt-4o-mini
uv run examples/llm/flow_with_llm.py
uv run pytest tests/test_phase12_llm_example.py

If you want the full credential-backed setup path first, start with Secrets + Model Registration.

For the broader catalog, see Examples.
