Tracked LLM Calls
Use kitaru.llm() with model aliases, transported runtime config, and optional secret-backed credentials
kitaru.llm() lets you make a single tracked model call with automatic:
- prompt artifact capture
- response artifact capture
- usage/cost/latency metadata logging
If you want the full setup path from stored credentials to an actual flow run, start with Secrets + Model Registration.
Model selection order
When you call kitaru.llm(), Kitaru resolves the model in this order:
- the explicit model= argument
- the KITARU_DEFAULT_MODEL environment variable
- the default alias from the effective model registry in the current environment
If KITARU_DEFAULT_MODEL matches a registered alias, Kitaru resolves that
alias. Otherwise it treats the value as a raw LiteLLM model string.
When you submit or replay a flow, Kitaru automatically transports your local
model registry into the execution environment. That means remote runs can still
resolve aliases with kitaru.llm() and kitaru model list. If
KITARU_MODEL_REGISTRY is already set in the runtime environment, its aliases
and default alias take precedence over matching local entries.
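The precedence described above amounts to a simple merge, sketched here with a hypothetical helper (the name and dict-based registry shape are assumptions for illustration):

```python
def effective_registry(transported: dict[str, str],
                       runtime: dict[str, str]) -> dict[str, str]:
    """Hypothetical merge of the transported local registry with a
    runtime KITARU_MODEL_REGISTRY: runtime entries override matching
    local ones, and non-conflicting local aliases survive."""
    merged = dict(transported)
    merged.update(runtime)
    return merged
```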
Register a model alias
```shell
kitaru model register fast --model openai/gpt-4o-mini --secret openai-creds
```
You can also register an alias without a linked secret:
```shell
kitaru model register fast --model openai/gpt-4o-mini
```
List aliases with:
```shell
kitaru model list
```
kitaru model register writes aliases to local Kitaru config, but submitted
and replayed runs automatically receive that registry as a transported runtime
snapshot. KITARU_MODEL_REGISTRY is available as an advanced manual override
for adding aliases or overriding matching ones.
Credential resolution order
For known providers such as OpenAI, Anthropic, and Gemini, Kitaru resolves credentials in this order:
- provider credentials already present in the environment
- the secret linked to the resolved alias
- otherwise, fail with a setup error
That means environment variables win over a linked secret for known providers.
Environment-backed setup
```shell
export OPENAI_API_KEY=sk-...
```
Secret-backed setup
Store provider keys in a Kitaru secret:
```shell
kitaru secrets set openai-creds --OPENAI_API_KEY=sk-...
```
When an alias includes --secret openai-creds, kitaru.llm() loads that
secret at runtime if the required environment variable is not already set.
Call kitaru.llm() inside a flow
```python
from kitaru import flow
import kitaru


@flow
def writer(topic: str) -> str:
    outline = kitaru.llm(
        f"Create a 3-bullet outline about {topic}.",
        model="fast",
        name="outline_call",
    )
    return kitaru.llm(
        f"Write a short paragraph using this outline:\n{outline}",
        model="fast",
        name="draft_call",
    )
```
Advanced options
kitaru.llm() also accepts system=, temperature=, and max_tokens=:
```python
reply = kitaru.llm(
    "Summarize this document in 3 bullets.",
    model="fast",
    system="You are a concise technical editor.",
    temperature=0.2,
    max_tokens=200,
    name="summary_call",
)
```
Chat-style message lists
Instead of a plain string, you can pass a chat-style message list:
```python
reply = kitaru.llm(
    [
        {"role": "user", "content": "Draft a release note headline."},
        {"role": "assistant", "content": "Kitaru adds durable replay controls."},
        {"role": "user", "content": "Now make it shorter."},
    ],
    model="fast",
    name="headline_refine",
)
```
Each message must include role and content keys. If system= is provided
alongside a message list, Kitaru prepends a system message automatically.
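The normalization described above can be sketched as a small standalone function. The helper name is hypothetical; the behavior (string becomes a single user message, messages are validated for role/content keys, system= is prepended) follows the documentation:

```python
def build_messages(prompt, system=None):
    """Hypothetical sketch of how a prompt becomes a message list."""
    if isinstance(prompt, str):
        # A plain string becomes a single user message.
        messages = [{"role": "user", "content": prompt}]
    else:
        messages = list(prompt)
    # Every message must carry role and content keys.
    for m in messages:
        if not {"role", "content"} <= set(m):
            raise ValueError("each message needs 'role' and 'content' keys")
    # A system= argument is prepended as a system message.
    if system is not None:
        messages = [{"role": "system", "content": system}] + messages
    return messages
```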
When to use kitaru.llm() vs your own client
kitaru.llm() is designed for simple text-in/text-out model calls. It handles
credential resolution, prompt/response capture, and cost tracking automatically.
For advanced patterns — tool calling, structured outputs, streaming, vision
inputs, or multi-turn conversation management — use your LLM client directly
inside a @checkpoint. You still get durable checkpointing and replay; you
just manage the model interaction yourself:
```python
from litellm import completion
from kitaru import checkpoint


@checkpoint
def agent_step(messages: list[dict]) -> str:
    resp = completion(
        model="openai/gpt-4o-mini",
        messages=messages,
        tools=[...],  # tool calling, structured output, etc.
    )
    return resp.choices[0].message.content
```
For a full example of a tool-calling agent built this way, see
examples/coding_agent/.
Tool calling and structured output support for kitaru.llm() is on the
roadmap. For now, use your preferred LLM client inside checkpoints for
these patterns.
Runtime behavior by context
- Inside a flow (outside checkpoints): kitaru.llm() runs as a synthetic durable call boundary.
- Inside a checkpoint: it is tracked as a child event; the enclosing checkpoint remains the replay boundary.
What Kitaru records
Each kitaru.llm() call records:
- prompt artifacts
- response artifacts
- token usage
- latency
- cost metadata when available
- credential source metadata (environment or secret)
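Taken together, the recorded fields might be modeled roughly like the dataclass below. This is an illustrative shape only — the class name and field names are assumptions, not Kitaru's actual record schema:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LLMCallRecord:
    """Hypothetical shape of the metadata recorded per kitaru.llm() call."""
    name: str                  # e.g. "outline_call"
    prompt_artifact: str
    response_artifact: str
    prompt_tokens: int
    completion_tokens: int
    latency_ms: float
    cost_usd: Optional[float]  # cost metadata when available
    credential_source: str     # "environment" or "secret"
```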
Example in this repository
```shell
uv sync --extra local
# Register an alias (with or without a linked secret) before running the example.
uv run kitaru model register fast --model openai/gpt-4o-mini
uv run examples/llm/flow_with_llm.py
uv run pytest tests/test_phase12_llm_example.py
```
If you want the full credential-backed setup path first, start with Secrets + Model Registration.
For the broader catalog, see Examples.