Compare

Kitaru vs DBOS: Durable execution, shaped around Python agents

DBOS and Kitaru both make code durable. DBOS is polyglot, Postgres-backed, and built for general app workflows. Kitaru is Python-first and shaped for agent loops: LLM calls, wait/resume, artifact lineage, and stack retargeting are primitives.

pip install kitaru
Book a demo · Read the docs

DBOS and Kitaru solve the same first-order problem. Code crashes, and you shouldn’t start over. Both are durable-runtime libraries, both are Apache-2.0, both are self-hostable, both checkpoint, replay, and resume. The honest differences start further down the stack.

DBOS is a polyglot durable workflow library on Postgres. It was shaped for the workloads every backend team ships: order processing, payment sagas, document approval, transactional side effects. Workflow state sits alongside your OLTP data in the same database, and Conductor gives ops a UI for scheduled workflows and durable queues.

We built Kitaru around a different shape of work. An agent loop has expensive LLM calls, tool runs measured in minutes, humans approving drafts hours later, and artifact outputs you want to inspect after the fact. So we put those in the box: kitaru.llm() with alias-resolved secrets and per-checkpoint cost, wait() as a typed primitive, versioned artifact lineage across runs, and stack-based deploy to Kubernetes, SageMaker, Vertex AI, or AzureML without a rewrite.

Kitaru

Use Kitaru if you are

  • Building Python agents and want `@flow`, `@checkpoint`, `kitaru.llm()`, `wait()`, and replay as the base primitives, not patterns you hand-roll on top of steps
  • Replaying a failed run from the last good checkpoint without paying for the LLM calls that already succeeded upstream
  • Retargeting the same agent across Kubernetes, SageMaker, Vertex AI, or AzureML with one stack swap, not a rewrite
  • Wrapping an existing Python agent in durability, or using Kitaru's documented PydanticAI adapter, without rewriting the core agent logic
  • Writing agent loops with runtime branches, tool calls, and human-in-the-loop waits measured in hours or days
Alternative

Use DBOS if you are

  • Running durable workflows across Python, TypeScript, Go, and Java services with one contract
  • Leaning on durable queues, cron schedules, and the Conductor UI for ops
  • Relying on DBOS Conductor's managed workflow console and, on paid plans, its SOC 2/HIPAA-compliant tooling
  • Already deep into Postgres and keeping durable state in the same database as your app
DBOS was shaped for app workflows on Postgres. Kitaru was shaped for Python agent loops. Both are durable runtimes. The ergonomics diverge at the primitives each one ships.

LLM calls as a first-class primitive

In DBOS, an LLM call is something you make inside your own @DBOS.step. DBOS can emit OpenTelemetry-compatible logs and traces for workflows and applications, so LLM visibility lands in Conductor if you wire up provider instrumentation yourself. That gets you visibility. It doesn’t get you a primitive. We built kitaru.llm() so the call itself is part of the runtime.

DBOS · @DBOS.step
client.chat.completions.create(messages=…) · OTel traces via instrumentation. You own alias resolution and secrets.
Kitaru · kitaru.llm() checkpoint: research · exec_id 9f2a…
kitaru.llm(prompt, model="fast")
prompt
Research: Durable agents…
response
Durable execution keeps…
tokens 1,247
latency 2.8s
cost $0.028
Logs roll up under the enclosing checkpoint. No separate LLM pipeline.
  • Alias-resolved secrets: kitaru.llm(prompt, model="fast") resolves the model alias and injects the provider key from the active stack. The same call retargets from OpenAI to Anthropic to a self-hosted model without touching code.
  • Per-checkpoint cost roll-up: Latency, token counts, and cost land under the enclosing @checkpoint automatically. Run inspection shows you the cost of step 9 without a separate LLM logging pipeline.
  • Replay: On replay, the captured response is loaded from the checkpoint. The provider isn’t hit again unless the input changed.
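
Putting those pieces together, a minimal sketch of the shape, using only the primitives this page documents (the prompt text is illustrative):

from kitaru import checkpoint
import kitaru

@checkpoint
def research(topic: str) -> str:
    # "fast" is a stack-resolved alias: the active stack picks the concrete
    # provider + model and injects that provider's key. Repointing the alias
    # retargets this call from OpenAI to Anthropic to self-hosted, no edits.
    # Tokens, latency, and cost roll up under this checkpoint automatically.
    return kitaru.llm(
        prompt=f"Research: {topic}. Return a brief.",
        model="fast",
    )

# On replay, the captured response loads from the checkpoint; the provider
# is hit again only if the input changed.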

Artifact versioning across every checkpoint

Every @checkpoint in Kitaru persists its inputs and outputs as versioned artifacts. Not just cached for resumption. Referenced, diffed, loaded into the next run, inspected through the UI.

exec_id 9f2a today · 14:02
@checkpoint research $0.03
brief.json v3
@checkpoint draft $0.08
draft.md v3
diff across runs
exec_id 7c1b Mon · 11:47
@checkpoint research $0.04
brief.json v1
@checkpoint draft $0.09
draft.md v1
  • Cross-run lineage: Outputs from yesterday’s run are artifacts you can reference or diff against today’s. DBOS step I/O is persisted in Postgres to resume a workflow. It isn’t surfaced as versioned, cross-run artifacts with a dedicated diff view.
  • Payload shape: Artifacts live in your own S3, GCS, or Azure Blob. When a checkpoint output is a 50MB model completion or a document bundle, that’s the right storage. DBOS keeps durable state next to your OLTP data, which is the right call when durable state is transactional rows.
  • Same UI as the run: Artifacts, checkpoints, logs, and costs show up in one place. No psql console required to reconstruct what the agent produced.
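
Here is what that lineage unlocks, as a sketch. The run-lookup and artifact accessors below are hypothetical, named only for illustration; the documented surface is the run UI described above:

import kitaru

# Hypothetical handles, keyed by the exec_ids shown above.
today = kitaru.runs.get("9f2a")    # hypothetical lookup API
monday = kitaru.runs.get("7c1b")

brief_v3 = today.artifact("brief.json")    # research checkpoint output, v3
brief_v1 = monday.artifact("brief.json")   # same checkpoint, Monday's run, v1
# Diff the two to see how the research brief drifted between runs.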

Stack-based deploy to your own cloud

DBOS is a library. You deploy the Python app however you already deploy Python apps, and state lives in whatever Postgres you point it at. That’s simple and portable. Kitaru treats “where does this run” as a stack, and a stack is the unit of retarget.

DBOS Library embedded in your Python service
your_app.py · from dbos import DBOS
deploy anywhere Python deploys your container · your VM · your FaaS
your Postgres
Kitaru Stack-targeted deploy, one config swap
kitaru build package flow as container
kitaru deploy --stack sagemaker
K8s kubernetes
AWS SageMaker
GCP Vertex AI
Az AzureML
artifacts in your own S3 · GCS · Azure Blob
  • One config swap, four clouds: kitaru deploy --stack sagemaker moves the same flow from your laptop to AWS SageMaker. Swap sagemaker for kubernetes, vertex, or azureml and the flow retargets without a rewrite.
  • Cloud integration for free: IAM, GPU quotas, autoscaling, and the right object store are inherited from the target stack. Under DBOS, that’s wiring you do yourself in the deployment tool of your choice.
  • Build once, serve locally, deploy anywhere: kitaru build packages the flow as a container. kitaru serve runs it locally with the full durability contract. kitaru deploy ships it to the remote stack. Same code path each time.
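
Laid out end to end, the loop those bullets describe, using the commands named on this page:

kitaru build                       # package the flow as a container
kitaru serve                       # run it locally, full durability contract
kitaru deploy --stack sagemaker    # ship the same container to SageMaker
kitaru deploy --stack vertex      # one swap, retargeted to Vertex AI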

Named deployments you invoke by name

DBOS gives you scheduled workflows and durable queues with concurrency limits and deduplication. That’s the operational shape: queue a job, assign a workflow ID from your code, recover on failure. Kitaru ships a different shape: immutable versioned deployments you invoke by name.

DBOS Queue + schedule shape
@DBOS.scheduled("0 * * * *") cron on the workflow
Queue("work", concurrency=8) concurrency + dedup
workflow_id = uuid4() you assign it; app code starts the run
Kitaru Named, versioned deployment with tag routing
review_flow @v3
default · staging · preview
Invoke by name
CLI · SDK · MCP · curl
Workspace-scoped keys. No per-deployment tokens to rotate.
  • Named, versioned, tag-routed: A flow ships as review_flow@v3. Tags (default, staging, whatever you want) route traffic. Roll a new version forward by moving a tag, not redeploying every caller.
  • Invoke from anywhere: CLI, Python SDK, MCP server, or the generated curl snippet. The dashboard gives you a handle the rest of your stack can call. DBOS workflows start from application code with a workflow ID you assign.
  • Workspace-scoped auth: No per-deployment tokens to rotate when you ship a new version. Keys are scoped to the workspace, not the individual flow.
  • An honest note on scheduling: DBOS ships @DBOS.scheduled() with cron and Queue with concurrency + dedup. Kitaru stacks lean on the underlying cloud’s scheduler or a periodic invocation at the call site. If cron and queues are your happy path, DBOS has a thicker story there.
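
A sketch of the invocation surface. The deployment name and tag come from the card above; the SDK entry point itself is assumed, not documented:

import kitaru

# Hypothetical SDK call: invoke the named deployment.
# "review_flow@v3" pins a version; "review_flow@staging" routes by tag.
result = kitaru.invoke("review_flow@v3", topic="Durable agents")
# Auth is a workspace-scoped key, so shipping v4 rotates nothing here.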

What makes Kitaru unique

Feature | Kitaru | DBOS
Durable execution with checkpoint replay | Yes | Yes
Replay skips completed steps without re-running LLM calls | Yes | Yes
`kitaru.llm()` primitive with alias-resolved keys and per-checkpoint cost roll-up | Yes | Not supported
Artifact versioning across runs (diffable across executions) | Yes | Not supported
Typed `wait()` with schema-validated input | Yes | Not supported
Isolated checkpoints (`runtime="isolated"`) for container-per-step on remote stacks | Yes | Not supported
Stack retarget to Kubernetes, SageMaker, Vertex AI, AzureML | Yes | Not supported
Named, versioned deployments with tag routing | Yes | Not supported
Documented framework adapter for PydanticAI (more coming) | Yes | Not supported
Polyglot SDKs (Python, TypeScript, Go, Java) | Not supported | Yes
Native cron scheduling and durable queues | Not supported | Yes
SOC 2 / HIPAA compliant managed control plane | Not supported | Yes

How the two surfaces map

Concept | DBOS | Kitaru
Workflow boundary | @DBOS.workflow() | @flow
Durable step | @DBOS.step() | @checkpoint (ordinary Python)
State backend | Postgres, alongside your OLTP data | Your own S3 / GCS / Azure Blob
LLM call | Inside a @DBOS.step; OTel-compatible traces and logs if you wire up provider instrumentation | kitaru.llm() with alias-resolved keys and per-checkpoint cost
Pause / resume | DBOS.send / DBOS.recv messaging | kitaru.wait() with schema-validated input
Cross-run state | Bring your own store | kitaru.memory with scopes
Scheduling | @DBOS.scheduled() (cron) + Queue with concurrency + dedup | Stack scheduler, or periodic invocation at the call site
Deployment | Library embedded in whatever Python app you already ship | Named, versioned deployments with tag routing
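
One row worth expanding: kitaru.memory carries state across runs. A minimal sketch, with accessor names assumed for illustration (this page documents the primitive and its scopes, not this exact surface):

import kitaru

# Hypothetical accessors; "scope" is the documented concept.
kitaru.memory.set("style_guide", "concise, active voice", scope="workspace")
style = kitaru.memory.get("style_guide", scope="workspace")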

Code comparison

Kitaru Recommended
from kitaru import flow, checkpoint, wait
import kitaru

@checkpoint
def research(topic: str) -> str:
  return kitaru.llm(
      prompt=f"Research: {topic}. Return a brief.",
      model="fast",
  )

@checkpoint
def draft(brief: str) -> str:
  return kitaru.llm(
      prompt=f"Write a draft from this brief:\n{brief}",
      model="claude-sonnet-4-6",
  )

@flow
def review_flow(topic: str) -> str:
  brief = research(topic)
  text = draft(brief)

  approved = wait(
      name="approve",
      schema=bool,
      question=f"Approve this draft?\n\n{text[:500]}",
  )
  return text if approved else "Rejected"

review_flow.run("Durable agents")
DBOS (Python SDK)
from dbos import DBOS
import openai

client = openai.OpenAI()

@DBOS.step()
def research(topic: str) -> str:
  resp = client.chat.completions.create(
      model="gpt-4o-mini",
      messages=[{"role": "user",
                 "content": f"Research: {topic}"}],
  )
  return resp.choices[0].message.content

@DBOS.step()
def draft(brief: str) -> str:
  resp = client.chat.completions.create(
      model="gpt-4o",
      messages=[{"role": "user",
                 "content": f"Write a draft:\n{brief}"}],
  )
  return resp.choices[0].message.content

@DBOS.workflow()
def review_flow(topic: str) -> str:
  brief = research(topic)
  text = draft(brief)

  # Human approval via DBOS.send / recv messaging.
  approved = DBOS.recv(timeout_seconds=86400)
  return text if approved else "Rejected"

# Plus: start the DBOS runtime, send approval
# from another process via DBOS.send.
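
The approval side of that recv is one call from any other process, using DBOS’s documented messaging API; workflow_id is whatever ID you assigned when starting the run:

from dbos import DBOS

# workflow_id: the ID you assigned when starting review_flow.
DBOS.send(workflow_id, True)  # unblocks the DBOS.recv inside review_flow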

Durable execution, shaped for your Python agent

If your durability problem is polyglot services against Postgres with a SOC 2 or HIPAA control plane on day one, DBOS is better shaped for that workload, and you should be honest about which side you’re on. If it’s shaped like a Python agent loop (LLM calls, memory, artifacts, human/agent approval gates), Kitaru removes the glue layer you’d otherwise write on top of a general-purpose workflow engine.

pip install kitaru
Book a demo