DBOS and Kitaru solve the same first-order problem: code crashes, and you shouldn't have to start over. Both are durable-runtime libraries, both are Apache-2.0, both are self-hostable, and both checkpoint, replay, and resume. The honest differences start further down the stack.
DBOS is a polyglot durable workflow library on Postgres. It was shaped for the workloads every backend team ships: order processing, payment sagas, document approval, transactional side effects. Workflow state sits alongside your OLTP data in the same database, and Conductor gives ops a UI for scheduled workflows and durable queues.
We built Kitaru around a different shape of work. An agent loop has expensive LLM calls, tool runs measured in minutes, humans approving drafts hours later, and artifact outputs you want to inspect after the fact. So we put those in the box: `kitaru.llm()` with alias-resolved secrets and per-checkpoint cost, `wait()` as a typed primitive, versioned artifact lineage across runs, and stack-based deploy to Kubernetes, SageMaker, Vertex AI, or AzureML without a rewrite.
Use Kitaru if you are
- Building Python agents and want `@flow`, `@checkpoint`, `kitaru.llm()`, `wait()`, and replay as the base primitives, not patterns you hand-roll on top of steps
- Replaying a failed run from the last good checkpoint without paying for the LLM calls that already succeeded upstream
- Retargeting the same agent across Kubernetes, SageMaker, Vertex AI, or AzureML with one stack swap, not a rewrite
- Wrapping an existing Python agent in durability, or using Kitaru's documented PydanticAI adapter, without rewriting the core agent logic
- Writing agent loops with runtime branches, tool calls, and human-in-the-loop waits measured in hours or days
Use DBOS if you are
- Running durable workflows across Python, TypeScript, Go, and Java services with one contract
- Leaning on durable queues, cron schedules, and the Conductor UI for ops
- Needing DBOS Conductor's managed workflow console, and on paid plans, SOC 2/HIPAA-compliant tooling
- Already deep into Postgres and wanting durable state to live in the same database as your app
DBOS was shaped for app workflows on Postgres. Kitaru was shaped for Python agent loops. Both are durable runtimes. The ergonomics diverge at the primitives each one ships.
LLM calls as a first-class primitive
In DBOS, an LLM call is something you make inside your own `@DBOS.step`. DBOS can emit OpenTelemetry-compatible logs and traces for workflows and applications, so LLM visibility lands in Conductor if you wire up provider instrumentation yourself. That gets you visibility. It doesn't get you a primitive. We built `kitaru.llm()` so the call itself is part of the runtime.
- Alias-resolved secrets: `kitaru.llm(prompt, model="fast")` resolves the model alias and injects the provider key from the active stack. The same call retargets from OpenAI to Anthropic to a self-hosted model without touching code.
- Per-checkpoint cost roll-up: Latency, token counts, and cost land under the enclosing `@checkpoint` automatically. Run inspection shows you the cost of step 9 without a separate LLM logging pipeline.
- Replay: On replay, the captured response is loaded from the checkpoint. The provider isn't hit again unless the input changed.
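For contrast, here is the wiring the opening paragraph describes on the DBOS side. A hedged sketch: `OpenAIInstrumentor` comes from the third-party `opentelemetry-instrumentation-openai` package, one common way to get provider spans into an OTel pipeline, and is an assumption about your tracing stack; DBOS ships neither the instrumentor nor the key handling.

```python
# "Wire up provider instrumentation yourself" under DBOS. The instrumentor
# is a third-party package (an assumption about your stack, not DBOS API).
import openai
from dbos import DBOS
from opentelemetry.instrumentation.openai import OpenAIInstrumentor

OpenAIInstrumentor().instrument()  # provider spans join your OTel traces

client = openai.OpenAI()  # you own the API key and the model names

@DBOS.step()
def summarize(text: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": f"Summarize:\n{text}"}],
    )
    return resp.choices[0].message.content
```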
Artifact versioning across every checkpoint
Every `@checkpoint` in Kitaru persists its inputs and outputs as versioned artifacts. Not just cached for resumption. Referenced, diffed, loaded into the next run, inspected through the UI.
- Cross-run lineage: Outputs from yesterday’s run are artifacts you can reference or diff against today’s. DBOS step I/O is persisted in Postgres to resume a workflow. It isn’t surfaced as versioned, cross-run artifacts with a dedicated diff view.
- Payload shape: Artifacts live in your own S3, GCS, or Azure Blob. When a checkpoint output is a 50MB model completion or a document bundle, that’s the right storage. DBOS keeps durable state next to your OLTP data, which is the right call when durable state is transactional rows.
- Same UI as the run: Artifacts, checkpoints, logs, and costs show up in one place. No `psql` console required to reconstruct what the agent produced.
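For scale, here is the glue this replaces: hand-rolled artifact persistence on a general-purpose engine. A minimal sketch assuming boto3; the bucket name and key layout are illustrative, and lineage (which run produced what) is still yours to track.

```python
# Hand-rolled artifact persistence: you choose the bucket, the key scheme,
# the serialization, and how runs reference each other's outputs.
import json
import boto3

s3 = boto3.client("s3")

def save_artifact(run_id: str, step: str, payload: dict) -> str:
    key = f"runs/{run_id}/{step}.json"  # illustrative versioning scheme
    s3.put_object(
        Bucket="my-agent-artifacts",  # illustrative bucket name
        Key=key,
        Body=json.dumps(payload).encode("utf-8"),
    )
    return key  # store this reference yourself to link runs together
```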
Stack-based deploy to your own cloud
DBOS is a library. You deploy the Python app however you already deploy Python apps, and state lives in whatever Postgres you point it at. That’s simple and portable. Kitaru treats “where does this run” as a stack, and a stack is the unit of retarget.
- One config swap, four clouds: `kitaru deploy --stack sagemaker` moves the same flow from your laptop to AWS SageMaker. Swap `sagemaker` for `kubernetes`, `vertex`, or `azureml` and the flow retargets without a rewrite.
- Cloud integration for free: IAM, GPU quotas, autoscaling, and the right object store are inherited from the target stack. Under DBOS, that's wiring you do yourself in the deployment tool of your choice.
- Build once, serve once, deploy anywhere: `kitaru build` packages the flow as a container. `kitaru serve` runs it locally with the full durability contract. `kitaru deploy` ships it to the remote stack. Same code path each time.
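End to end, that is three commands, with the stack names exactly as they appear in the bullets above:

```bash
kitaru build                      # package the flow as a container
kitaru serve                      # run it locally, full durability contract
kitaru deploy --stack sagemaker   # ship the same flow to AWS SageMaker
kitaru deploy --stack vertex      # retarget with one stack swap
```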
Named deployments you invoke by name
DBOS gives you scheduled workflows and durable queues with concurrency limits and deduplication. That’s the operational shape: queue a job, assign a workflow ID from your code, recover on failure. Kitaru ships a different shape: immutable versioned deployments you invoke by name.
- Named, versioned, tag-routed: A flow ships as `review_flow@v3`. Tags (`default`, `staging`, whatever you want) route traffic. Roll a new version forward by moving a tag, not by redeploying every caller.
- Invoke from anywhere: CLI, Python SDK, MCP server, or the generated curl snippet (see the sketch after this list). The dashboard gives you a handle the rest of your stack can call. DBOS workflows start from application code with a workflow ID you assign.
- Workspace-scoped auth: No per-deployment tokens to rotate when you ship a new version. Keys are scoped to the workspace, not the individual flow.
- Scheduling is honest: DBOS ships `@DBOS.scheduled()` with cron and `Queue` with concurrency + dedup. Kitaru stacks lean on the underlying cloud's scheduler or a periodic invocation at the call site. If cron and queues are your happy path, DBOS has a thicker story there (example below).
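To make invoke-by-name concrete, here is the Python SDK shape. Treat it as a sketch: the surfaces above are the documented part, while the exact call below (`kitaru.invoke`, its `args` dict, the `tag` keyword) is illustrative rather than a frozen signature.

```python
import kitaru

# Illustrative SDK shape: invoke-by-name is the documented behavior; the
# function name and parameters here are assumptions, not a frozen API.
pinned = kitaru.invoke(
    "review_flow@v3",                  # pin an immutable version...
    args={"topic": "Durable agents"},
)
routed = kitaru.invoke(
    "review_flow",
    args={"topic": "Durable agents"},
    tag="staging",                     # ...or route through a tag
)
```

And the thicker DBOS scheduling story the last bullet concedes, using DBOS's real Python APIs; the cron spec, queue name, and concurrency cap are illustrative values:

```python
from datetime import datetime
from dbos import DBOS, Queue

reports = Queue("reports", concurrency=2)  # durable queue, 2 tasks at a time

@DBOS.workflow()
def build_report(day: str) -> None:
    ...  # durable work for one day's report

@DBOS.scheduled("0 6 * * *")  # cron: every day at 06:00
@DBOS.workflow()
def nightly(scheduled_time: datetime, actual_time: datetime) -> None:
    reports.enqueue(build_report, scheduled_time.date().isoformat())
```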
What makes Kitaru unique
| Feature | Kitaru | DBOS |
|---|---|---|
| Durable execution with checkpoint replay | Yes | Yes |
| Replay skips completed steps without re-running LLM calls | Yes | Yes |
| `kitaru.llm()` primitive with alias-resolved keys and per-checkpoint cost roll-up | Yes | Not supported |
| Artifact versioning across runs (diffable across executions) | Yes | Not supported |
| Typed `wait()` with schema-validated input | Yes | Not supported |
| Isolated checkpoints (`runtime="isolated"`) for container-per-step on remote stacks | Yes | Not supported |
| Stack retarget to Kubernetes, SageMaker, Vertex AI, AzureML | Yes | Not supported |
| Named, versioned deployments with tag routing | Yes | Not supported |
| Documented framework adapter for PydanticAI (more coming) | Yes | Not supported |
| Polyglot SDKs (Python, TypeScript, Go, Java) | Not supported | Yes |
| Native cron scheduling and durable queues | Not supported | Yes |
| SOC 2 / HIPAA compliant managed control plane | Not supported | Yes |
How the two surfaces map
| Concept | DBOS | Kitaru |
|---|---|---|
| Workflow boundary | `@DBOS.workflow()` | `@flow` |
| Durable step | `@DBOS.step()` | `@checkpoint` (ordinary Python) |
| State backend | Postgres, alongside your OLTP data | Your own S3 / GCS / Azure Blob |
| LLM call | Inside a `@DBOS.step`; OTel-compatible traces and logs if you wire up provider instrumentation | `kitaru.llm()` with alias-resolved keys and per-checkpoint cost |
| Pause / resume | `DBOS.send` / `DBOS.recv` messaging | `kitaru.wait()` with schema-validated input |
| Cross-run state | Bring your own store | `kitaru.memory` with scopes |
| Scheduling | `@DBOS.scheduled()` (cron) + `Queue` with concurrency + dedup | Stack scheduler, or periodic invocation at the call site |
| Deployment | Library embedded in whatever Python app you already ship | Named, versioned deployments with tag routing |
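One caveat on the `kitaru.memory` row: this page names the primitive and its scopes but not the call shapes, so the sketch below is illustrative end to end; the constructor, the scope value, and the `get`/`set` methods are all assumptions.

```python
import kitaru

# Illustrative throughout: kitaru.memory and scopes are named on this page,
# but the exact constructor and method shapes below are assumptions.
mem = kitaru.memory(scope="workspace")         # cross-run, workspace-wide state
seen = mem.get("reviewed_topics", default=[])  # survives individual runs
mem.set("reviewed_topics", seen + ["Durable agents"])
```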
Code comparison
Kitaru:

```python
from kitaru import flow, checkpoint, wait
import kitaru

@checkpoint
def research(topic: str) -> str:
    return kitaru.llm(
        prompt=f"Research: {topic}. Return a brief.",
        model="fast",
    )

@checkpoint
def draft(brief: str) -> str:
    return kitaru.llm(
        prompt=f"Write a draft from this brief:\n{brief}",
        model="claude-sonnet-4-6",
    )

@flow
def review_flow(topic: str) -> str:
    brief = research(topic)
    text = draft(brief)
    approved = wait(
        name="approve",
        schema=bool,
        question=f"Approve this draft?\n\n{text[:500]}",
    )
    return text if approved else "Rejected"

review_flow.run("Durable agents")
```

DBOS:

```python
from dbos import DBOS
import openai

client = openai.OpenAI()

@DBOS.step()
def research(topic: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": f"Research: {topic}"}],
    )
    return resp.choices[0].message.content

@DBOS.step()
def draft(brief: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": f"Write a draft:\n{brief}"}],
    )
    return resp.choices[0].message.content

@DBOS.workflow()
def review_flow(topic: str) -> str:
    brief = research(topic)
    text = draft(brief)
    # Human approval arrives via DBOS.send / DBOS.recv messaging.
    approved = DBOS.recv(timeout_seconds=86400)
    return text if approved else "Rejected"

# Plus: start the DBOS runtime, and send the approval
# from another process via DBOS.send.
```

Durable execution, shaped for your Python agent
If your durability problem is polyglot services against Postgres with a SOC 2 or HIPAA control plane on day one, be honest about which side you're on: DBOS is better-shaped for that workload. If it's shaped like a Python agent loop (LLM calls, memory, artifacts, human/agent approval gates), Kitaru removes the glue layer you'd otherwise write on top of a general-purpose workflow engine.
pip install kitaru