Kitaru

Overview

The runtime layer underneath your agent stack

Kitaru is the runtime layer underneath your agent stack. It gives you durable execution for Python agents — checkpoints, replay, resume, wait(), versioned deployments — while the harness you already picked (Pydantic AI, Deep Agents, LangGraph, Claude Agent SDK, raw Python) keeps owning how the agent thinks, and your existing platform keeps owning auth, observability, and policy.

Kitaru is self-host-first: a single-service server on your own Kubernetes, artifacts in your own S3/GCS/Azure Blob. No mandatory SaaS control plane in the path of your agent’s data. See Harness, Runtime, Platform for the full picture of where Kitaru fits.

Create a durable agent

import kitaru
from kitaru import checkpoint, flow

@checkpoint  # durable step: output is persisted and reused on replay
def research(topic: str) -> str:
    return kitaru.llm(f"Summarize {topic} in two sentences.")

@checkpoint
def draft_report(summary: str) -> str:
    return kitaru.llm(f"Write a short report based on: {summary}")

@flow  # durable run: plain Python control flow over checkpointed steps
def research_agent(topic: str) -> str:
    summary = research(topic)
    return draft_report(summary)

if __name__ == "__main__":
    research_agent.run(topic="Why do AI agents need durable execution?")

Each @checkpoint is a durable unit of work — its output is persisted automatically. If the flow fails at draft_report, replaying it skips research and reuses its recorded result. kitaru.llm() logs model calls with prompt, response, tokens, and latency per call.
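The skip-on-replay behavior can be sketched in plain Python with a toy decorator that persists step outputs to a local JSON file. This is an illustration of the semantics only, not Kitaru's implementation: Kitaru writes checkpoint outputs to your object store, and the names here (`toy_checkpoint`, `STORE`) are made up for the sketch.

```python
import json
import os
from functools import wraps

STORE = "checkpoints.json"  # toy stand-in for Kitaru's artifact store

if os.path.exists(STORE):
    os.remove(STORE)  # start the sketch from a clean slate

def _load():
    return json.load(open(STORE)) if os.path.exists(STORE) else {}

def toy_checkpoint(fn):
    """Persist fn's result keyed by (name, args); skip re-running on replay."""
    @wraps(fn)
    def wrapper(*args):
        key = f"{fn.__name__}:{args!r}"
        store = _load()
        if key in store:          # step already ran: reuse the recorded result
            return store[key]
        result = fn(*args)        # first run: execute and record
        store[key] = result
        json.dump(store, open(STORE, "w"))
        return result
    return wrapper

calls = []

@toy_checkpoint
def research(topic):
    calls.append("research")
    return f"summary of {topic}"

# The first call records the result; a "replay" reuses it without re-executing.
research("durable execution")
research("durable execution")
```

A flow that fails partway through can be re-run the same way: completed steps return their recorded results immediately, and only the failed step executes again.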

See the Quickstart to install and run this yourself.

What your agent can do with Kitaru

These are the runtime primitives Kitaru adds on top of your existing Python agent code. You keep your harness and your control flow; Kitaru makes the run durable.

  • Durable execution: Wrap steps in @checkpoint and your agent picks up where it left off without re-running expensive work
  • Replay from failure: Re-run only the failed part of a flow by replaying from a checkpoint instead of starting from scratch
  • Wait and resume: Add kitaru.wait() to pause a run for a human, another system, or later input; once the polling timeout passes, compute is released, and the run resumes when the input arrives
  • Durable memory: kitaru.memory stores scoped, versioned key-value state you can seed, inspect, compact, and reuse across executions
  • Artifact lineage: Every checkpoint output is written to your object store as a typed, versioned artifact — step through runs, diff outputs across runs, and trace a bad final output back to the exact step that produced it
  • Execution management: KitaruClient lets you inspect, replay, retry, resume, and cancel executions from code or CLI
  • Tracked LLM calls: Use kitaru.llm() and every call gets automatic secret resolution, prompt/response capture, and token/latency logging
  • Persistent data: kitaru.save() / kitaru.load() let agents store and retrieve files, objects, and results across executions
  • Structured observability: kitaru.log() attaches key-value metadata to any checkpoint or flow for debugging and the UI
  • Runtime configuration: kitaru.configure() sets your model, log store, and stack defaults in one call
  • Framework and infrastructure portability: Keep your Python control flow, use your preferred framework, and run locally or on remote stacks — Kubernetes, Vertex AI, SageMaker, AzureML
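The wait-and-resume primitive above can be sketched in plain Python: a pause is a signal that persists the run's position and frees the worker, and resuming re-enters the flow with the recorded state plus the new input. This is a sketch of the semantics only, not Kitaru's implementation; `Waiting`, `approval_flow`, `resume`, and `PENDING` are illustrative names, not Kitaru's API.

```python
import json
import os

PENDING = "pending_run.json"  # toy stand-in for Kitaru's persisted run state

class Waiting(Exception):
    """Raised to pause a run; the process can exit and compute is released."""

def approval_flow(draft, human_input=None):
    if human_input is None:
        # Persist where the run stopped, then yield the worker.
        json.dump({"step": "awaiting_review", "draft": draft}, open(PENDING, "w"))
        raise Waiting("paused for human review")
    return f"{draft} (approved: {human_input})"

def resume(human_input):
    # A later process (or machine) picks up the persisted state and continues.
    state = json.load(open(PENDING))
    os.remove(PENDING)
    return approval_flow(state["draft"], human_input)

# First call pauses the run; a later call resumes it with the human's input.
try:
    approval_flow("Q3 report")
except Waiting:
    pass
result = resume("LGTM")
```

Because the paused state lives in storage rather than in a running process, the resume can happen minutes or days later, on a different worker, without holding compute in the meantime.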
