Pydantic AI and Kitaru solve different problems at different layers. Pydantic AI is an agent harness: it’s the best-in-class Pythonic library for writing type-safe agent logic — tool calling, structured outputs, dependency injection, streaming. Kitaru is a durable runtime: checkpoints, replay, resume, wait states, versioned deployments, artifact lineage, and self-hosted execution.
They don’t compete. They compose. Kitaru ships a first-class Pydantic AI adapter that tracks model requests and tool calls as child events under the enclosing checkpoint.
## Use Kitaru if you are
- Running Pydantic AI agents as long-lived services that must survive crashes and pod evictions
- Processing enough volume that replaying a failed step is cheaper than re-running the whole agent
- Deploying into your own cloud (Kubernetes, AWS, GCP, Azure) for security or compliance
- Letting different teams pick different harnesses while standardizing durable execution underneath
- Versioning and rolling deployments of agent flows with tag-based routing
## Use Pydantic AI if you are
- Writing agent logic in Python and want type-safe inputs, outputs, and tools
- Building short-lived or interactive agents, or using Pydantic AI's own durable execution integrations (Temporal, DBOS, Prefect, Restate) where they fit your stack
- Prototyping an agent before deciding whether it needs a production runtime around it
- Happy with your existing orchestration and only need better agent ergonomics
Pydantic AI gives you a great way to write an agent. Kitaru gives you a great way to run one.
## Different questions
Pydantic AI is asking: how do I write a typed, ergonomic agent loop with first-class tools and structured outputs? Kitaru is asking: once this agent exists, how does it survive crashes, resume after a human approves, replay from failure, version its deployments, and plug into my infra?
In practice that means Pydantic AI lives inside a Kitaru checkpoint. Kitaru wraps the outer boundaries — where durability, replay, and execution placement matter.
## Compose, don’t replace
Kitaru doesn’t ask you to give up the Pydantic AI agent you already wrote. Wrap it in a checkpoint and you’re done.
The adapter tracks every model request and tool call as a child event under the enclosing checkpoint, so replay, artifact lineage, and logs work at the grain of the agent’s internal steps — not just the outer call.
```python
from kitaru import flow, checkpoint, wait
from kitaru.adapters.pydantic_ai import KitaruAgent
from pydantic_ai import Agent

agent = KitaruAgent(
    Agent("openai:gpt-5.4", system_prompt="You're a compliance reviewer."),
)

@flow
def review(case: dict) -> str:
    first_pass = agent.run_sync(case).output
    approved = wait(name="approve", question=first_pass, schema=bool)
    return first_pass if approved else "rejected"
```
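Running it uses the same `.run()` entry point shown in the code comparison further down; the case payload here is illustrative:

```python
# Kick off a durable run of the flow above (payload is illustrative).
review.run(case={"id": "C-001"})
```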
## What Kitaru adds on top
Pydantic AI is primarily an agent harness; it now documents durable execution through integrations with Temporal, DBOS, Prefect, and Restate. Kitaru’s difference is that it ships the durable runtime layer natively:
- **Durable execution.** A crash, pod eviction, or timeout doesn’t send the run back to zero. Fix the bug, replay, and the completed checkpoints return cached output instead of re-burning tokens.
- **Pause and resume.** `kitaru.wait()` suspends a flow, releases compute, and resumes minutes, hours, or days later when input lands from a human, another agent, a webhook, or a CLI call.
- **Versioned deployments.** `flow.deploy()` captures an immutable snapshot. Consumers invoke by flow name; tag routing and rollback are a `kitaru flow tag` away.
- **Artifact lineage.** Every checkpoint writes a typed, versioned artifact. Diff artifacts across runs, trace a bad output back to the specific step, and build audit trails without grepping logs.
- **Execution placement.** `@checkpoint(runtime="isolated")` runs a specific step in its own pod or job on Kubernetes, AWS, GCP, or Azure. Heavy or risky steps stay isolated; orchestration stays inline (see the sketch after this list).
- **Self-hosted infrastructure.** The Kitaru server is a single Helm-deployed service. Artifacts and state live in your own S3 / GCS / Azure Blob bucket. No mandatory SaaS control plane.
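A minimal sketch of how these primitives combine, assuming the `@flow`, `@checkpoint(runtime=...)`, and `wait(...)` signatures quoted above; the flow and step names are illustrative, not taken from Kitaru's docs:

```python
from kitaru import flow, checkpoint, wait

@checkpoint(runtime="isolated")  # this step gets its own pod or job
def extract(doc: dict) -> dict:
    # Heavy or risky work stays isolated from the orchestration process.
    return {"summary": f"findings for {doc['id']}"}

@flow
def audit(doc: dict) -> str:
    findings = extract(doc)  # returns cached output on replay after a crash
    # Suspend here and release compute; resume when an answer arrives
    # from a human, another agent, a webhook, or a CLI call.
    ok = wait(name="sign-off", question=str(findings), schema=bool)
    return "published" if ok else "rejected"
```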
## What makes Kitaru unique
| Feature | Kitaru | Pydantic AI |
|---|---|---|
| Typed agent inputs, outputs, tools | Not supported | Yes |
| Structured model outputs | Not supported | Yes |
| Dependency injection for tools | Not supported | Yes |
| Durable execution + replay from checkpoint | Yes | Not supported |
| Pause/resume with compute released | Yes | Not supported |
| Versioned, invocable deployments with tag routing | Yes | Not supported |
| Artifact lineage across runs | Yes | Not supported |
| Per-checkpoint isolated runtime on your stack | Yes | Not supported |
| First-class Kubernetes, AWS, GCP, Azure deployment | Yes | Not supported |
| Durable memory scopes (Python, CLI, MCP) | Yes | Not supported |
| MCP server for AI-assistant introspection | Yes | Not supported |
| First-class adapter for the other | Yes | Not supported |
## How the two surfaces map
| Concept | Pydantic AI | Kitaru |
|---|---|---|
| Layer | Agent harness (how the agent thinks) | Durable runtime (how it runs over time and infra) |
| Durable unit | Agent run; durable execution via Temporal / DBOS / Prefect / Restate integrations | `@flow` + `@checkpoint` (Kitaru-native) |
| Composition | Standalone Pythonic agent | Pydantic AI agent wrapped by the `KitaruAgent` adapter inside a `@checkpoint` |
| Crash recovery | — | Replay from the last good checkpoint; cached work reused |
| Long wait on a human | Deferred tools + human-in-the-loop approval patterns; durable pause/resume depends on the chosen runtime integration | `kitaru.wait()` (compute released, survives crashes) |
| Artifacts | — | Typed, versioned artifacts per checkpoint |
| Versioning | — | Named versioned deployments with tag routing |
| Deployment | Whatever Python service you wrap the agent in | Stack-based deploy to Kubernetes, AWS, GCP, Azure |
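The wait row above implies an answering side that this page never shows. As a purely hypothetical sketch of those mechanics: the `kitaru.answer` function, its parameters, and the run identifier below are invented for illustration, not documented API.

```python
import kitaru

# HYPOTHETICAL API -- invented for illustration; Kitaru's real answering
# surface (Python, CLI, webhook) may look entirely different.
kitaru.answer(
    flow="review_flow",   # the paused flow
    run_id="r-123",       # which run to resume (hypothetical identifier)
    name="approve",       # matches wait(name="approve", ...)
    value=True,           # checked against schema=bool before resuming
)
```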
## Code comparison
**With Kitaru:**

```python
from kitaru import flow, checkpoint, wait
from kitaru.adapters.pydantic_ai import KitaruAgent
from pydantic_ai import Agent

reviewer = KitaruAgent(
    Agent("openai:gpt-5.4", system_prompt="You're a compliance reviewer."),
)

@checkpoint
def review_case(case: dict) -> str:
    # One Pydantic AI run == one durable checkpoint.
    # Model calls + tool calls are tracked as child events.
    return reviewer.run_sync(case).output

@flow
def review_flow(case: dict) -> str:
    draft = review_case(case)
    # Load the text for the human-facing wait question;
    # the raw checkpoint output still flows to the next step.
    draft_text = draft.load()
    ok = wait(name="approve", question=draft_text, schema=bool)
    return draft if ok else "rejected"

# Durable ad-hoc run
review_flow.run(case={"id": "C-001"})

# Or deploy as a versioned snapshot and invoke by name
review_flow.deploy(case={"id": "C-001"})
review_flow.invoke(case={"id": "C-001"})
```

**Pydantic AI alone:**

```python
from pydantic_ai import Agent

reviewer = Agent(
    "openai:gpt-5.4",
    system_prompt="You're a compliance reviewer.",
)

def review_flow(case: dict) -> str:
    draft = reviewer.run_sync(str(case)).output
    # Blocking input(). If the container dies,
    # the draft is lost.
    ok = input(f"Approve?\n{draft}\n[y/n]: ") == "y"
    return draft if ok else "rejected"

review_flow(case={"id": "C-001"})
```

## Put a runtime under your Pydantic AI agents
If the Pydantic AI agent you wrote is still a notebook script or a short-lived interactive tool, Pydantic AI on its own is the right answer. If it’s becoming a production workload (long-running, crash-surviving, approved by a human hours later, deployed on your own cloud), that durability layer is exactly what Kitaru ships in the box.
```bash
pip install kitaru
```