From the makers of ZenML · Open source · Apache 2.0

Everything around the agent loop.

Deploy, track, and govern autonomous agents on your cloud. Any framework. Any model.

pip install kitaru
Watch the demo
INSIDE VS OUTSIDE

The agent loop is solved. The platform around it isn't.

Agent SDKs solve the inner loop: tool calls, LLM calls, sandboxes. That's the easy part now.
Everything in the outer loop (how the agent is deployed, what state it remembers, how it's dispatched, how runs are tracked) gets rebuilt from scratch, company by company.

Durable agent memory
Memory

State your agents can come back to.

Versioned, scoped memory that persists across executions. Write it from Python, read it from the CLI or MCP. Agents remember conventions, context, and prior work across runs.

One agent, three runtimes
Orchestration

Dispatch from anywhere.

CLI, Python script, or HTTP. Define the agent once and call it from any entry point. Kitaru's runtime absorbs the differences so the same flow runs the same way.
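One way to picture "define once, dispatch from anywhere" is a single flow function with a thin adapter per entry point. A plain-Python sketch (the function names are illustrative, not Kitaru's API):

```python
import argparse
import json


def triage_flow(issue: str) -> dict:
    """The flow is defined once; every entry point below calls this."""
    return {"issue": issue, "label": "bug" if "error" in issue.lower() else "feature"}


def from_python(issue: str) -> dict:
    # Entry point 1: call directly from a Python script.
    return triage_flow(issue)


def from_cli(argv: list[str]) -> dict:
    # Entry point 2: parse CLI arguments, then call the same flow.
    parser = argparse.ArgumentParser()
    parser.add_argument("issue")
    args = parser.parse_args(argv)
    return triage_flow(args.issue)


def from_http(body: bytes) -> dict:
    # Entry point 3: decode an HTTP request body, then call the same flow.
    payload = json.loads(body)
    return triage_flow(payload["issue"])


print(from_cli(["Error on login page"]))
```

Kitaru's runtime plays the role of the adapters, so the flow body never knows which entry point invoked it.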

Runs you can actually compare
Tracking

Inspect every run.

Every execution is versioned, every checkpoint persisted, every artifact tracked. Open the dashboard and see exactly what happened. Built in, not bolted on.
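A toy sketch of what "runs you can actually compare" means in practice, using invented names rather than Kitaru's actual record format:

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class RunRecord:
    """Toy run ledger: every execution gets an id, a version, checkpoints,
    and artifacts, so two runs can be compared side by side."""
    flow: str
    version: int
    run_id: str = field(default_factory=lambda: uuid.uuid4().hex[:8])
    checkpoints: dict = field(default_factory=dict)
    artifacts: dict = field(default_factory=dict)


def diff_runs(a: RunRecord, b: RunRecord) -> dict:
    # Which checkpoints produced different outputs between two runs?
    keys = set(a.checkpoints) | set(b.checkpoints)
    return {k: (a.checkpoints.get(k), b.checkpoints.get(k))
            for k in keys if a.checkpoints.get(k) != b.checkpoints.get(k)}


run1 = RunRecord("report_agent", version=1, checkpoints={"research": 5, "draft": "v1"})
run2 = RunRecord("report_agent", version=2, checkpoints={"research": 5, "draft": "v2"})
print(diff_runs(run1, run2))  # only the step that changed shows up
```

The dashboard does this over persisted records; the sketch only shows why per-checkpoint versioning makes run-to-run diffs cheap.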

WORKS WITH YOUR AGENT SDK

One import. Any agent SDK.

Keep your agent, your sandbox, your manifest exactly as they are. Swap one import and get checkpointed execution, artifact tracking, and observability, without changing a line of your agent code.

Before: agent.py
from agents import Runner
from agents.sandbox import SandboxAgent, Manifest

agent = SandboxAgent(
    name="Compliance Reviewer",
    model="gpt-5.4",
    default_manifest=manifest,
)

result = await Runner.run(agent, task)
With Kitaru: agent.py
from kitaru.integrations.openai import KitaruRunner
from agents.sandbox import SandboxAgent, Manifest

agent = SandboxAgent(
    name="Compliance Reviewer",
    model="gpt-5.4",
    default_manifest=manifest,
)

result = await KitaruRunner.run(agent, task)
SEE THE CODE

Core primitives. Full durability.

agent.py
import kitaru
from kitaru import flow, checkpoint
 
kitaru.configure(stack="kubernetes")
 
@checkpoint
def research(topic: str) -> dict:
    results = search_web(topic)
    kitaru.save("sources", results)
    return summarize(results)
 
@checkpoint
def write_draft(context: str, prev_id: str) -> str:
    prior = kitaru.load(prev_id, "sources")
    return kitaru.llm(
        "Draft a report on: " + context
        + "\nPrior sources: " + str(prior),
        model="gpt-4o",
    )
 
@flow
def report_agent(topic: str, prev_id: str) -> str:
    data = research(topic)
    draft = write_draft(str(data), prev_id)
    kitaru.log(topic=topic, words=len(draft.split()))
 
    approved = kitaru.wait(
        schema=bool, question="Publish?"
    )
    if approved:
        publish(draft)
    return draft
@flow

Top-level orchestration boundary. Marks a function as a durable workflow.

@checkpoint

Persists output. Crash at step 3? Steps 1-2 never re-run.

kitaru.wait()

Suspends the process. Resume when a human responds, 30s or 3 days later.

kitaru.llm()

Resolves the model alias and injects the API key.

kitaru.log()

Structured metadata on every execution. Query it in the dashboard.

kitaru.save()

Persist any artifact by name inside a checkpoint.

kitaru.load()

Retrieve saved artifacts from any prior execution by ID.

kitaru.configure()

Set stack, project, and runtime defaults. Zero config locally.
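To make one of these concrete: the alias resolution behind kitaru.llm() can be pictured as a lookup table plus an environment read. A toy sketch with an invented alias table (not Kitaru's real configuration):

```python
import os

# Hypothetical alias table: map a short model alias to provider details and
# the env var that holds its API key. The entries are illustrative only.
MODEL_ALIASES = {
    "gpt-4o": {"provider": "openai", "key_env": "OPENAI_API_KEY"},
    "claude": {"provider": "anthropic", "key_env": "ANTHROPIC_API_KEY"},
}


def resolve_model(alias: str) -> dict:
    """Resolve an alias to provider config and inject the API key from the env."""
    try:
        entry = MODEL_ALIASES[alias]
    except KeyError:
        raise ValueError(f"unknown model alias: {alias}") from None
    return {"provider": entry["provider"],
            "api_key": os.environ.get(entry["key_env"], "")}


print(resolve_model("gpt-4o")["provider"])  # openai
```

Centralizing this table is what lets agent code say `model="gpt-4o"` without ever touching credentials.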

CORE RUNTIME PRIMITIVES

The primitives long-running agents keep needing.

These are the runtime basics teams keep rebuilding once agents leave the laptop.

01 · Wait & Resume

Pause. Get input. Continue later.

Suspends at decision points, releases compute, and resumes when input arrives from a human, another agent, or a webhook, even hours or days later.
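The control-flow shape of wait-and-resume can be sketched with a Python generator: `yield` suspends at the decision point, `send()` resumes with the answer. A real durable wait also persists state across processes and machines; this in-process toy only shows the shape:

```python
def approval_flow(draft: str):
    """A flow that pauses at a decision point. `yield` hands the question to
    the outside world; `send()` resumes with the answer, minutes or days later."""
    approved = yield f"Publish this draft? ({len(draft)} chars)"
    if approved:
        return "published"
    return "discarded"


flow = approval_flow("Quarterly compliance report...")
question = next(flow)   # runs until the decision point, then suspends
print(question)
try:
    flow.send(True)     # resumes with the human's answer
except StopIteration as done:
    print(done.value)   # published
```

Kitaru's version additionally releases compute while suspended, which is why a three-day wait costs nothing.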

02 · Replay from Failure

Crash at step 6? Resume from step 6.

Every step is checkpointed. Fix the issue and replay from the point of failure instead of re-burning tokens.
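The mechanism can be sketched as memoization against a persistent store: a step that already succeeded returns its saved result instead of re-executing. A toy, in-memory version (not Kitaru's implementation):

```python
import functools

STORE: dict[str, object] = {}  # stands in for a persistent checkpoint store


def checkpoint(fn):
    """Toy checkpoint: if this step already ran, return the saved result
    instead of re-executing (and re-paying for) it."""
    @functools.wraps(fn)
    def wrapper(*args):
        key = f"{fn.__name__}:{args!r}"
        if key not in STORE:
            STORE[key] = fn(*args)
        return STORE[key]
    return wrapper


calls = []


@checkpoint
def step(n: int) -> int:
    calls.append(n)
    if n == 3 and len(calls) == 3:
        raise RuntimeError("crash at step 3")
    return n * 10


def pipeline():
    return [step(1), step(2), step(3)]


try:
    pipeline()        # first run crashes at step 3
except RuntimeError:
    pass
print(pipeline())     # replay: steps 1 and 2 come from the store
print(calls)          # [1, 2, 3, 3]: only step 3 re-ran
```

With LLM calls behind each step, the saved results are what keep a replay from re-burning tokens.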

03 · Framework Portability

Keep your framework. Add durable execution.

OpenAI Agents SDK, Anthropic Agent SDK, PydanticAI, LangGraph, or raw Python. Wrap it with Kitaru and get checkpointed execution without rewriting your agent.

04 · Parallel Recovery

Fan out work without losing recovery.

checkpoint.submit() dispatches branches concurrently. Each keeps its own checkpoint history, so you can replay only the failed branch.
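checkpoint.submit() is Kitaru's API; the underlying idea can be sketched in plain Python with a thread pool and a per-branch store, so a retry re-runs only the branches that failed:

```python
from concurrent.futures import ThreadPoolExecutor

BRANCH_STORE: dict[str, int] = {}   # one checkpoint entry per branch
attempts: dict[str, int] = {}


def branch(name: str) -> int:
    """A branch that is flaky on its first attempt for 'branch-b'."""
    attempts[name] = attempts.get(name, 0) + 1
    if name == "branch-b" and attempts[name] == 1:
        raise RuntimeError(f"{name} failed")
    return len(name)


def run_branch(name: str) -> int:
    # Checkpointed: a branch that already succeeded is never re-run.
    if name not in BRANCH_STORE:
        BRANCH_STORE[name] = branch(name)
    return BRANCH_STORE[name]


def fan_out(names: list[str]) -> dict:
    results, failed = {}, []
    with ThreadPoolExecutor() as pool:
        futures = {pool.submit(run_branch, n): n for n in names}
        for fut, n in futures.items():
            try:
                results[n] = fut.result()
            except RuntimeError:
                failed.append(n)
    return {"results": results, "failed": failed}


first = fan_out(["branch-a", "branch-b", "branch-c"])
print(first["failed"])               # ['branch-b']
retry = fan_out(first["failed"])     # replay only the failed branch
print(attempts)                      # branch-a and branch-c ran exactly once
```

Because each branch has its own store entry, the retry never touches the branches that already succeeded.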

THE PLATFORM LAYER

Your framework. Your model. Your sandbox.
Shipped, governed, observable.

Agent SDKs give you the loop. Sandboxes give you compute. Kitaru adds the layer in between (checkpoints, artifacts, replay, and memory) so your agent can run in production without rewriting it.

Your Agent Code: OpenAI Agents SDK · Anthropic Agent SDK · PydanticAI · LangGraph · Raw Python (any framework, any model)
Kitaru SDK: @flow · @checkpoint · wait() · llm() · log() · save() · load() · memory.* · configure() (core primitives, durable execution + memory)
Kitaru Engine: Checkpointer · Artifact Store · Deployer · Replay · Governance (deploy, track, govern)
Infrastructure: Kubernetes · AWS / GCP / Azure · S3 / GCS · SQL Database (your cloud)
agent.py
import kitaru
from kitaru import flow, checkpoint

@flow
def coding_agent(issue: str) -> str:
    plan = analyze_issue(issue)
    patch = write_code(plan)

    # Pauses. Resumes when input arrives.
    approved = kitaru.wait(
        schema=bool, question="Merge this PR?"
    )
    if approved:
        merge(patch)
    return patch
OPEN SOURCE

From laptop to enterprise. Same OSS stack.

A CLI, a dashboard, and a scalable execution server. All open source. Runs on your laptop and scales to your cloud.

uv add kitaru && kitaru login
Dashboard, replay, and logs ship with the server.

Your agent crashed at step 5.
Stop re-running steps 1 through 4.

pip install kitaru

Open source (Apache 2.0). pip install and go.