Pydantic AI and Kitaru solve different problems at different layers. Pydantic AI is an agent harness: it’s the best-in-class Pythonic library for writing type-safe agent logic — tool calling, structured outputs, dependency injection, streaming. Kitaru is a durable runtime: checkpoints, replay, resume, wait states, versioned deployments, artifact lineage, and self-hosted execution.
They don’t compete. They compose. Kitaru ships a first-class Pydantic AI adapter that tracks model requests and tool calls as child events under the enclosing checkpoint.
## Use Kitaru if you are
- Running Pydantic AI agents as long-lived services that must survive crashes and pod evictions
- Processing enough volume that replaying a failed step is cheaper than re-running the whole agent
- Deploying into your own cloud (Kubernetes, AWS, GCP, Azure) for security or compliance
- Letting different teams pick different harnesses while standardizing durable execution underneath
- Versioning and rolling deployments of agent flows with tag-based routing
## Use Pydantic AI if you are
- Writing agent logic in Python and want type-safe inputs, outputs, and tools
- Building short-lived or interactive agents, or using Pydantic AI's own durable execution integrations (Temporal, DBOS, Prefect, Restate) where they fit your stack
- Prototyping an agent before deciding whether it needs a production runtime around it
- Happy with your existing orchestration and only need better agent ergonomics
Pydantic AI gives you a great way to write an agent. Kitaru gives you a great way to run one.
## Different questions
Pydantic AI is asking: how do I write a typed, ergonomic agent loop with first-class tools and structured outputs? Kitaru is asking: once this agent exists, how does it survive crashes, resume after a human approves, replay from failure, version its deployments, and plug into my infra?
In practice that means Pydantic AI lives inside a Kitaru checkpoint. Kitaru wraps the outer boundaries — where durability, replay, and execution placement matter.
## Compose, don’t replace
Kitaru doesn’t ask you to give up the Pydantic AI agent you already wrote. Wrap it in a checkpoint and you’re done.
The adapter tracks every model request and tool call as a child event under the enclosing checkpoint, so replay, artifact lineage, and logs work at the grain of the agent’s internal steps — not just the outer call.
```python
from kitaru import flow, checkpoint, wait
from kitaru.adapters.pydantic_ai import KitaruAgent
from pydantic_ai import Agent

agent = KitaruAgent(
    Agent("openai:gpt-5.4", system_prompt="You're a compliance reviewer."),
)

@flow
def review(case: dict) -> str:
    first_pass = agent.run_sync(case).output
    approved = wait(name="approve", question=first_pass, schema=bool)
    return first_pass if approved else "rejected"
```
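Running it uses the same `.run()` entry point shown in the code comparison further down; the case payload here is illustrative:

```python
# Kick off a durable run of the flow above (payload is illustrative).
review.run(case={"id": "C-001"})
```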
## What Kitaru adds on top
Pydantic AI is primarily an agent harness; it now documents durable execution through integrations with Temporal, DBOS, Prefect, and Restate. Kitaru’s difference is that it ships the durable runtime layer natively:
- **Durable execution.** A crash, pod eviction, or timeout doesn’t send the run back to zero. Fix the bug, replay, and the completed checkpoints return cached output instead of re-burning tokens.
- **Pause and resume.** `kitaru.wait()` suspends a flow, releases compute, and resumes minutes, hours, or days later when input lands from a human, another agent, a webhook, or a CLI call.
- **Versioned deployments.** `flow.deploy()` captures an immutable snapshot. Consumers invoke by flow name; tag routing and rollback are a `kitaru flow tag` away.
- **Artifact lineage.** Every checkpoint writes a typed, versioned artifact. Diff artifacts across runs, trace a bad output back to the specific step, and build audit trails without grepping logs.
- **Execution placement.** `@checkpoint(runtime="isolated")` runs a specific step in its own pod or job on Kubernetes, AWS, GCP, or Azure. Heavy or risky steps stay isolated; orchestration stays inline (see the sketch after this list).
- **Self-hosted infrastructure.** The Kitaru server is a single Helm-deployed service. Artifacts and state live in your own S3 / GCS / Azure Blob bucket. No mandatory SaaS control plane.
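A minimal sketch of how these primitives combine, assuming the `@flow`, `@checkpoint(runtime=...)`, and `wait(...)` signatures quoted above; the flow and step names are illustrative, not taken from Kitaru's docs:

```python
from kitaru import flow, checkpoint, wait

@checkpoint(runtime="isolated")  # this step gets its own pod or job
def extract(doc: dict) -> dict:
    # Heavy or risky work stays isolated from the orchestration process.
    return {"summary": f"findings for {doc['id']}"}

@flow
def audit(doc: dict) -> str:
    findings = extract(doc)  # returns cached output on replay after a crash
    # Suspend here and release compute; resume when an answer arrives
    # from a human, another agent, a webhook, or a CLI call.
    ok = wait(name="sign-off", question=str(findings), schema=bool)
    return "published" if ok else "rejected"
```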
## What makes Kitaru unique
| Feature | Kitaru | Pydantic AI |
|---|---|---|
| Typed agent inputs, outputs, tools | Not supported | Yes |
| Structured model outputs | Not supported | Yes |
| Dependency injection for tools | Not supported | Yes |
| Durable execution + replay from checkpoint | Yes | Not supported |
| Pause/resume with compute released | Yes | Not supported |
| Versioned, invocable deployments with tag routing | Yes | Not supported |
| Artifact lineage across runs | Yes | Not supported |
| Per-checkpoint isolated runtime on your stack | Yes | Not supported |
| First-class Kubernetes, AWS, GCP, Azure deployment | Yes | Not supported |
| Durable memory scopes (Python, CLI, MCP) | Yes | Not supported |
| MCP server for AI-assistant introspection | Yes | Not supported |
| First-class adapter for the other | Yes | Not supported |
## How the two surfaces map
| Concept | Pydantic AI | Kitaru |
|---|---|---|
| Layer | Agent harness (how the agent thinks) | Durable runtime (how it runs over time and infra) |
| Durable unit | Agent run; durable execution via Temporal / DBOS / Prefect / Restate integrations | `@flow` + `@checkpoint` (Kitaru-native) |
| Composition | Standalone Pythonic agent | Pydantic AI agent wrapped by the `KitaruAgent` adapter inside a `@checkpoint` |
| Crash recovery | — | Replay from the last good checkpoint; cached work reused |
| Long wait on a human | Deferred tools + human-in-the-loop approval patterns; durable pause/resume depends on the chosen runtime integration | `kitaru.wait()` (compute released, survives crashes) |
| Artifacts | — | Typed, versioned artifacts per checkpoint |
| Versioning | — | Named versioned deployments with tag routing |
| Deployment | Whatever Python service you wrap the agent in | Stack-based deploy to Kubernetes, AWS, GCP, Azure |
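The wait row above implies an answering side that this page never shows. As a purely hypothetical sketch of those mechanics: the `kitaru.answer` function, its parameters, and the run identifier below are invented for illustration, not documented API.

```python
import kitaru

# HYPOTHETICAL API -- invented for illustration; Kitaru's real answering
# surface (Python, CLI, webhook) may look entirely different.
kitaru.answer(
    flow="review_flow",   # the paused flow
    run_id="r-123",       # which run to resume (hypothetical identifier)
    name="approve",       # matches wait(name="approve", ...)
    value=True,           # checked against schema=bool before resuming
)
```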
## Code comparison
**With Kitaru:**

```python
from kitaru import flow, checkpoint, wait
from kitaru.adapters.pydantic_ai import KitaruAgent
from pydantic_ai import Agent

reviewer = KitaruAgent(
    Agent("openai:gpt-5.4", system_prompt="You're a compliance reviewer."),
)

@checkpoint
def review_case(case: dict) -> str:
    # One Pydantic AI run == one durable checkpoint.
    # Model calls + tool calls are tracked as child events.
    return reviewer.run_sync(case).output

@flow
def review_flow(case: dict) -> str:
    draft = review_case(case)
    # Load the text for the human-facing wait question;
    # the raw checkpoint output still flows to the next step.
    draft_text = draft.load()
    ok = wait(name="approve", question=draft_text, schema=bool)
    return draft if ok else "rejected"

# Durable ad-hoc run
review_flow.run(case={"id": "C-001"})

# Or deploy as a versioned snapshot and invoke by name
review_flow.deploy(case={"id": "C-001"})
review_flow.invoke(case={"id": "C-001"})
```

**Pydantic AI alone:**

```python
from pydantic_ai import Agent

reviewer = Agent(
    "openai:gpt-5.4",
    system_prompt="You're a compliance reviewer.",
)

def review_flow(case: dict) -> str:
    draft = reviewer.run_sync(str(case)).output
    # Blocking input(). If the container dies,
    # the draft is lost.
    ok = input(f"Approve?\n{draft}\n[y/n]: ") == "y"
    return draft if ok else "rejected"

review_flow(case={"id": "C-001"})
```

## Put a runtime under your Pydantic AI agents
If the Pydantic AI agent you wrote is still a notebook script or a short-lived interactive tool, Pydantic AI on its own is the right answer. If it’s becoming a production workload (long-running, crash-surviving, approved by a human hours later, deployed on your own cloud), that durability layer is exactly what Kitaru ships in the box.
```bash
pip install kitaru
```