Compare

Type-safe, Pythonic agent framework from the Pydantic team.

Kitaru vs Pydantic AI: harness and runtime, composed

Pydantic AI is how the agent thinks. Kitaru is how it runs reliably over time and infrastructure. Use both — there's a first-class adapter.

pip install kitaru
Book a demo · Read the docs

Pydantic AI and Kitaru solve different problems at different layers. Pydantic AI is an agent harness: it’s the best-in-class Pythonic library for writing type-safe agent logic — tool calling, structured outputs, dependency injection, streaming. Kitaru is a durable runtime: checkpoints, replay, resume, wait states, versioned deployments, artifact lineage, and self-hosted execution.
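
On the harness side, that looks like plain Pydantic AI. A minimal sketch of the typed-output piece, assuming a recent pydantic_ai release where structured outputs are declared with output_type and read back from result.output (the model string matches the examples below; Verdict is a made-up schema for illustration):

from pydantic import BaseModel
from pydantic_ai import Agent

class Verdict(BaseModel):
  approved: bool
  reason: str

# The model's reply is validated into Verdict before your code sees it.
reviewer = Agent(
  "openai:gpt-5.4",
  system_prompt="You're a compliance reviewer.",
  output_type=Verdict,
)

result = reviewer.run_sync("Review case C-001: wire transfer over the limit.")
print(result.output.approved, result.output.reason)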

They don’t compete. They compose. Kitaru ships a first-class Pydantic AI adapter that tracks model requests and tool calls as child events under the enclosing checkpoint.

Kitaru

Use Kitaru if you are

  • Running Pydantic AI agents as long-lived services that must survive crashes and pod evictions
  • Processing enough volume that replaying a failed step is cheaper than re-running the whole agent
  • Deploying into your own cloud (Kubernetes, AWS, GCP, Azure) for security or compliance
  • Letting different teams pick different harnesses while standardizing durable execution underneath
  • Versioning and rolling deployments of agent flows with tag-based routing
Pydantic AI

Use Pydantic AI if you are

  • Writing agent logic in Python and want type-safe inputs, outputs, and tools
  • Building short-lived or interactive agents, or using Pydantic AI's own durable execution integrations (Temporal, DBOS, Prefect, Restate) where they fit your stack
  • Prototyping an agent before deciding whether it needs a production runtime around it
  • Happy with your existing orchestration and only need better agent ergonomics
Pydantic AI gives you a great way to write an agent. Kitaru gives you a great way to run one.

Different questions

Pydantic AI is asking: how do I write a typed, ergonomic agent loop with first-class tools and structured outputs? Kitaru is asking: once this agent exists, how does it survive crashes, resume after a human approves, replay from failure, version its deployments, and plug into my infra?

In practice that means Pydantic AI lives inside a Kitaru checkpoint. Kitaru wraps the outer boundaries — where durability, replay, and execution placement matter.

Pydantic AI · harness: how the agent thinks
"How do I write a typed, ergonomic agent loop with first-class tools and structured outputs?"
typed I/O · tool calls · structured outputs · streaming
Scope: a single agent invocation.

Kitaru · runtime: how the agent runs over time and infra
"Once this agent exists, how does it survive crashes, resume after a human approves, replay from failure, version its deployments, and plug into my infra?"
durability · replay · wait / resume · versioning · placement
Scope: a long-lived production workload.

Compose, don’t replace

Kitaru doesn’t ask you to give up the Pydantic AI agent you already wrote. Wrap it in a checkpoint and you’re done.

The adapter tracks every model request and tool call as a child event under the enclosing checkpoint, so replay, artifact lineage, and logs work at the grain of the agent’s internal steps — not just the outer call.

from kitaru import flow, checkpoint, wait
from kitaru.adapters.pydantic_ai import KitaruAgent
from pydantic_ai import Agent

agent = KitaruAgent(
  Agent("openai:gpt-5.4", system_prompt="You're a compliance reviewer."),
)

@checkpoint
def first_pass(case: dict) -> str:
  # One Pydantic AI run == one durable checkpoint; model requests
  # and tool calls are tracked as child events underneath it.
  return agent.run_sync(case).output

@flow
def review(case: dict) -> str:
  draft = first_pass(case)
  approved = wait(name="approve", question=draft.load(), schema=bool)
  return draft if approved else "rejected"
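
Kicking it off uses the same entry point as the fuller example further down this page:

# Durable ad-hoc run. The flow suspends at wait(), releases compute,
# and resumes once the "approve" question is answered.
review.run(case={"id": "C-001"})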

What Kitaru adds on top

Pydantic AI is primarily an agent harness; it documents durable execution through integrations with Temporal, DBOS, Prefect, and Restate. Kitaru's difference is that it ships the durable runtime layer natively, with no third-party workflow engine required (sketched in code after the list below):

Kitaru · runtime: adds on top, no agent rewrite
  • durable execution: crash → replay · cached checkpoints, no re-burn
  • kitaru.wait(): pause, release compute, resume hours later
  • flow.deploy(): immutable versioned snapshots · tag routing
  • artifact lineage: typed artifacts per checkpoint · cross-run diff
  • @checkpoint(runtime="isolated"): per-step pod / job on K8s, AWS, GCP, Azure
  • self-hosted: Helm service · artifacts in your own bucket

Pydantic AI · harness: your existing agent, unchanged
  • KitaruAgent(Agent(...)): first-class adapter · model + tool calls as child events
  • Durable execution. A crash, pod eviction, or timeout doesn’t send the run back to zero. Fix the bug, replay, and the completed checkpoints return cached output instead of re-burning tokens.
  • Pause and resume. kitaru.wait() suspends a flow, releases compute, and resumes minutes, hours, or days later when input lands from a human, another agent, a webhook, or a CLI call.
  • Versioned deployments. flow.deploy() captures an immutable snapshot. Consumers invoke by flow name; tag routing and rollback are a kitaru flow tag away.
  • Artifact lineage. Every checkpoint writes a typed, versioned artifact. Diff artifacts across runs, trace a bad output back to the specific step, and build audit trails without grepping logs.
  • Execution placement. @checkpoint(runtime="isolated") runs a specific step in its own pod or job on Kubernetes, AWS, GCP, Azure. Heavy or risky steps stay isolated; orchestration stays inline.
  • Self-hosted infrastructure. The Kitaru server is a single Helm-deployed service. Artifacts and state live in your own S3 / GCS / Azure Blob bucket. No mandatory SaaS control plane.
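
To make those concrete, here is a minimal sketch combining three of them. The decorators, flow.deploy(), and .invoke() are the surfaces named above; the step and flow names are hypothetical, and the kitaru flow tag spelling in the trailing comment is illustrative rather than a documented command line:

from kitaru import flow, checkpoint

@checkpoint(runtime="isolated")
def extract_entities(doc: dict) -> dict:
  # Runs in its own pod / job on K8s, AWS, GCP, or Azure. On crash
  # and replay, completed checkpoints return cached artifacts.
  return {"entities": doc.get("text", "").split()}

@flow
def pipeline(doc: dict) -> dict:
  # Orchestration stays inline; only the heavy step is isolated.
  return extract_entities(doc)

# Capture an immutable, versioned snapshot, then invoke by name:
pipeline.deploy()
pipeline.invoke(doc={"text": "acme corp wire transfer"})

# Tag routing and rollback live in the CLI, along the lines of
# `kitaru flow tag ...` (see the docs for exact flags).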

Feature comparison

Feature | Kitaru | Pydantic AI
Typed agent inputs, outputs, tools | Not supported | Yes
Structured model outputs | Not supported | Yes
Dependency injection for tools | Not supported | Yes
Durable execution + replay from checkpoint | Yes (native) | Via integrations (Temporal, DBOS, Prefect, Restate)
Pause/resume with compute released | Yes | Depends on runtime integration
Versioned, invocable deployments with tag routing | Yes | Not supported
Artifact lineage across runs | Yes | Not supported
Per-checkpoint isolated runtime on your stack | Yes | Not supported
First-class Kubernetes, AWS, GCP, Azure deployment | Yes | Not supported
Durable memory scopes (Python, CLI, MCP) | Yes | Not supported
MCP server for AI-assistant introspection | Yes | Not supported
First-class adapter for the other | Yes | Not supported

How the two surfaces map

Concept | Pydantic AI | Kitaru
Layer | Agent harness (how the agent thinks) | Durable runtime (how it runs over time and infra)
Durable unit | Agent run; durable execution via Temporal / DBOS / Prefect / Restate integrations | @flow + @checkpoint (Kitaru-native)
Composition | Standalone Pythonic agent | Pydantic AI agent wrapped by the KitaruAgent adapter inside a @checkpoint
Crash recovery | — | Replay from the last good checkpoint, cached work reused
Long wait on a human | Deferred tools + HITL approval patterns; durable pause/resume depends on the chosen runtime integration | kitaru.wait() (compute released, survives crashes)
Artifacts | — | Typed, versioned artifacts per checkpoint
Versioning | — | Named versioned deployments with tag routing
Deployment | Whatever Python service you wrap the agent in | Stack-based deploy to Kubernetes, AWS, GCP, Azure

Code comparison

Pydantic AI + Kitaru (recommended)
from kitaru import flow, checkpoint, wait
from kitaru.adapters.pydantic_ai import KitaruAgent
from pydantic_ai import Agent

reviewer = KitaruAgent(
  Agent("openai:gpt-5.4", system_prompt="You're a compliance reviewer."),
)

@checkpoint
def review_case(case: dict) -> str:
  # One Pydantic AI run == one durable checkpoint.
  # Model calls + tool calls tracked as child events.
  return reviewer.run_sync(case).output

@flow
def review_flow(case: dict) -> str:
  draft = review_case(case)
  # Load the text for the human-facing wait question;
  # the raw checkpoint output still flows to the next step.
  draft_text = draft.load()
  ok = wait(name="approve", question=draft_text, schema=bool)
  return draft if ok else "rejected"

# Durable ad-hoc run
review_flow.run(case={"id": "C-001"})

# Or deploy an immutable versioned snapshot, then invoke by name
review_flow.deploy()
review_flow.invoke(case={"id": "C-001"})
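
Note the split: the model call sits inside the checkpoint, so replay reuses its cached output, while wait() sits in the flow, so a suspended run holds no compute while the approval is pending.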
Pydantic AI alone
from pydantic_ai import Agent

reviewer = Agent(
  "openai:gpt-5.4",
  system_prompt="You're a compliance reviewer.",
)

def review_flow(case: dict) -> str:
  draft = reviewer.run_sync(f"Review this case: {case}").output
  # Blocking input(). If the container dies,
  # the draft is lost.
  ok = input(f"Approve?\n{draft}\n[y/n]: ") == "y"
  return draft if ok else "rejected"

review_flow(case={"id": "C-001"})

Put a runtime under your Pydantic AI agents

If the Pydantic AI agent you wrote is still a notebook script or a short-lived interactive tool, Pydantic AI on its own is the right answer. If it is becoming a production workload — long-running, crash-surviving, approved by a human hours later, deployed on your own cloud — that durability layer is exactly what Kitaru ships in the box.

pip install kitaru
Book a demo