
Kitaru vs Temporal: Durable execution, built for AI agents

Temporal is a general-purpose durable execution platform with seven official SDKs. Kitaru is an open-source runtime for durable Python agents, with built-in primitives for LLM calls, wait/resume, replay, memory, and artifact handling.

pip install kitaru
Book a demo · Read the docs

Temporal is a general-purpose durable execution platform with seven official SDKs (Go, Java, Python, TypeScript, Ruby, PHP, and .NET). It has been in production for a decade and has the battle scars to show for it. If your durability problem is polyglot or mission-critical and not agent-shaped, Temporal’s track record is the pragmatic choice.

Kitaru is the runtime layer shaped for Python agents: the same durable execution story, narrower in scope, with agent primitives in the box. The glue code every agent team ends up writing on top of a general-purpose workflow engine (a durable llm() call, a versioned memory store, an artifact graph, an opinionated stack abstraction for Kubernetes, AWS, GCP, and Azure) ships in Kitaru as first-class primitives.

Kitaru

Use Kitaru if you are

  • Running Python agents and want LLM calls, memory, and artifact lineage as primitives instead of glue code
  • Replaying a run from a specific checkpoint without paying for the LLM calls above it again
  • Deploying into your own cloud (Kubernetes, SageMaker, Vertex AI, AzureML) and want one runtime that targets all of them
  • Designing around dynamic agent loops (tool calls, conditional branches, hours-long waits for humans)
Alternative

Use Temporal if you are

  • Operating a polyglot fleet where Go, Java, and TypeScript services must share one durability contract
  • Running general workflows (billing, provisioning, ETL, saga patterns), not specifically agents
  • Leaning heavily on cron, namespacing, and a battle-tested service tier
Temporal makes failure irrelevant. Kitaru makes failure irrelevant for the specific shape of work an AI agent does.

A simpler ops model for agent workloads

Temporal’s durable execution relies on the Temporal Service plus Workers; you can self-host it or use Temporal Cloud. Kitaru uses a Kitaru server plus stack-backed storage and compute. Both provide durable recovery, but their operational models are different.

[Architecture diagram] Temporal cluster: Temporal Service (Frontend, History, Matching) + persistence DB (your Postgres / MySQL / Cassandra) + Workers running your code. Kitaru + stack: Kitaru server + stack storage (S3 · GCS · Blob), running @flow review_flow with @checkpoint draft.
  • Service tier: Temporal Server consists of Frontend, History, Matching, and Worker services, backed by a persistence database. Kitaru’s server stores execution metadata, checkpoint state, and logs, while stack backends use storage such as S3, GCS, or Azure Blob depending on the runtime.
  • Determinism: Temporal Workflow code must follow deterministic constraints, and non-deterministic work such as external calls belongs in Activities. Kitaru lets you wrap plain Python with @flow and @checkpoint and reuse checkpoint outputs on replay.
  • Replay cost: Temporal caches Activity results via Workflow Event History; the difference is that Kitaru’s caching boundary is an ordinary Python function, not a deterministic Workflow/Activity split. kitaru executions replay <exec_id> --from <checkpoint> re-runs the flow from the top: checkpoints before the replay point return cached output, while the named checkpoint and everything after it re-execute. You don’t re-pay for the LLM calls above the replay point.
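The replay-cost semantics above can be sketched as a toy checkpoint cache. This is not the Kitaru API; it is a minimal illustration of the behavior described — completed checkpoints before the replay point return cached output, while the named checkpoint and everything after it re-execute:

```python
# Toy sketch (not the Kitaru API): checkpoint-style caching on replay.
# Outputs of completed checkpoints are stored by name; on replay, checkpoints
# before the replay point return cached output, and the named checkpoint plus
# everything after it re-execute.
from functools import wraps

class ReplayStore:
    def __init__(self):
        self.results = {}        # checkpoint name -> cached output
        self.calls = []          # names actually executed (for illustration)
        self.replay_from = None  # checkpoint to invalidate from, if replaying
        self._invalidated = False

    def checkpoint(self, fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            name = fn.__name__
            if name == self.replay_from:
                self._invalidated = True   # this and later checkpoints re-run
            if not self._invalidated and name in self.results:
                return self.results[name]  # cached: no re-execution, no cost
            self.calls.append(name)
            out = fn(*args, **kwargs)
            self.results[name] = out
            return out
        return wrapper

store = ReplayStore()

@store.checkpoint
def research(topic):
    return f"brief about {topic}"

@store.checkpoint
def draft(brief):
    return f"draft from: {brief}"

def review_flow(topic):
    return draft(research(topic))

review_flow("agents")            # first run: both checkpoints execute
store.replay_from = "draft"
review_flow("agents")            # replay from draft: research stays cached
print(store.calls)               # ['research', 'draft', 'draft']
```

On the replay run, `research` never executes again: its cached output is returned, and only `draft` (the named checkpoint) re-runs.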

LLM calls as first-class checkpoints

The LLM call is often the unit of cost, latency, and failure in an agent. In Temporal, LLM calls typically live in Activities. In Kitaru, kitaru.llm() is a built-in primitive.

[Comparison card] Temporal Activity: await call_openai(prompt) is unstructured by default; you instrument it yourself. Kitaru @checkpoint (exec_id 9f2a…): kitaru.llm(prompt, model="fast") records the prompt ("Research: Durable agents…"), the response ("Durable execution keeps…"), tokens (1,247), latency (1.4s), and the resolved model (fast → claude-sonnet-4-6). On replay, the response is loaded from the checkpoint; the provider isn't hit again.
  • Key handling: In Temporal, API-key handling is part of your Activity or application code. In Kitaru, kitaru.llm() resolves model aliases and supports centralized secret handling.
  • Observability: Temporal has a mature Web UI for workflow and activity visibility. What it leaves to you is the LLM-specific slice — prompt capture, token counts, latency, model identity. Kitaru captures prompt/response as artifacts and logs token counts, latency, and the resolved model on every kitaru.llm() call automatically. Cost accounting is glue you still write on top, but the raw material arrives for free.
  • Replay: In Kitaru, the captured response is read from the checkpoint on replay. The provider isn’t hit again unless the input changed.
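The observability point above can be made concrete with a toy wrapper. This is not the Kitaru API — the alias table, the token estimate, and the fake provider are all illustrative — but it shows the shape of capturing prompt, response, latency, and resolved model around every call:

```python
# Toy sketch (not the Kitaru API): record prompt/response, a token count,
# latency, and the resolved model alias on every LLM call, so the raw
# material for cost accounting is captured automatically.
import time

MODEL_ALIASES = {"fast": "claude-sonnet-4-6"}  # illustrative alias table
call_log = []

def fake_provider(prompt, model):
    """Stand-in for a real provider client."""
    return f"response to: {prompt[:20]}"

def llm(prompt, model="fast", provider=fake_provider):
    resolved = MODEL_ALIASES.get(model, model)
    start = time.perf_counter()
    response = provider(prompt, resolved)
    call_log.append({
        "prompt": prompt,
        "response": response,
        "model": f"{model} -> {resolved}",
        "latency_s": round(time.perf_counter() - start, 4),
        "tokens": len(prompt.split()) + len(response.split()),  # crude estimate
    })
    return response

llm("Research: Durable agents and replay")
print(call_log[0]["model"])   # fast -> claude-sonnet-4-6
```

A real runtime would use provider-reported token counts rather than a whitespace estimate, but the logging boundary — one record per call, attached to the execution — is the point.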

Versioned durable memory, not workflow variables

Temporal is a general-purpose workflow engine, so cross-run memory is an application concern — you bring your own store and plumb it through workflows. Kitaru is agent-shaped, so it ships a scoped memory primitive in the runtime. Same problem, different shape.

[Diagram] kitaru.memory, namespace scope, versioned · persists across runs:
  • v3 preferences.tone = "formal, brief" (today · 14:02, exec #3)
  • v2 preferences.tone = "casual" (yesterday · 09:18, exec #2)
  • v1 preferences.tone = "neutral" (Mon · 11:47, exec #1)
Temporal: workflow variables are scoped to one execution, reconstituted from event history.
  • Scope: Temporal durable state lives inside a workflow execution. In Kitaru, kitaru.memory is persisted with explicit namespace, flow, or execution scopes, so some memory can outlive a single run.
  • Escape hatch: Anything that doesn’t fit a KV shape goes through kitaru.save() / kitaru.load(), backed by the same artifact store.
  • Inspection: Kitaru’s Python surface exposes set, get, list, history, and delete on kitaru.memory, plus a kitaru memory scope list CLI command — so “what did I know last Tuesday?” is a documented call, not a workflow-variable archaeology job.
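A minimal sketch of the versioned-memory idea, assuming nothing about Kitaru's real storage: every write appends a new version under a (scope, key) pair, the latest version answers `get`, and history stays queryable:

```python
# Toy sketch (not the Kitaru API): a versioned KV memory scoped by namespace.
# Writes append versions instead of overwriting, so "what did I know last
# Tuesday?" is a history lookup rather than archaeology.
from collections import defaultdict

class Memory:
    def __init__(self):
        self._store = defaultdict(list)  # (scope, key) -> versions, oldest first

    def set(self, scope, key, value):
        self._store[(scope, key)].append(value)

    def get(self, scope, key):
        versions = self._store[(scope, key)]
        return versions[-1] if versions else None

    def history(self, scope, key):
        return list(self._store[(scope, key)])

    def list(self, scope):
        return sorted(k for (s, k) in self._store if s == scope)

mem = Memory()
mem.set("user-42", "preferences.tone", "neutral")        # exec #1
mem.set("user-42", "preferences.tone", "casual")         # exec #2
mem.set("user-42", "preferences.tone", "formal, brief")  # exec #3
print(mem.get("user-42", "preferences.tone"))            # formal, brief
print(mem.history("user-42", "preferences.tone"))
```

Because the scope outlives any single run, three executions can each write a version and a fourth can read the full history.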

Artifact lineage across runs, not just event history

Temporal’s Workflow Event History is a replay log, not an artifact store. Inspecting what a past workflow produced means threading your own artifact references through Activity return values. Kitaru persists checkpoint outputs as artifacts automatically and attaches them to the execution record.

[Diagram] exec_id 7c1b (Mon · 11:47, total $0.15): @checkpoint research $0.04 → brief.json; @checkpoint draft $0.09 → draft.md; @checkpoint review $0.02 → review.json. exec_id 9f2a (today · 14:02, total $0.10): @checkpoint research loaded via kitaru.load() (brief.json from 7c1b); @checkpoint draft $0.08 → draft.md; @checkpoint review $0.02 → review.json.
  • Artifacts: Kitaru stores each @checkpoint return value, plus prompt/response pairs from every kitaru.llm() call, as browsable artifacts on the execution.
  • Cross-run load: A later run can pull an artifact from an earlier execution via kitaru.load(exec_id, name) from inside a @checkpoint. Artifact lineage across runs is a primitive, not something you build on top.
  • Inspection surface: Executions, checkpoints, logs, and artifacts are exposed through the server, CLI, and client APIs — and every kitaru.llm() call contributes token counts, latency, and resolved model to the same record.
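The cross-run load can be sketched with a toy artifact store. The `save`/`load`/`run_flow` names here are illustrative, not the real API; the point is that artifacts are keyed by execution id, so a later run can pull an earlier run's output instead of recomputing it:

```python
# Toy sketch (not the Kitaru API): checkpoint outputs stored as artifacts
# keyed by (exec_id, name), so a later execution can reuse an earlier one's
# brief instead of paying for the research step again.
artifacts = {}  # (exec_id, name) -> value

def save(exec_id, name, value):
    artifacts[(exec_id, name)] = value

def load(exec_id, name):
    return artifacts[(exec_id, name)]

def run_flow(exec_id, topic, reuse_brief_from=None):
    if reuse_brief_from:
        # cross-run reuse: no research cost in this execution
        brief = load(reuse_brief_from, "brief.json")
    else:
        brief = {"topic": topic, "points": ["durability", "replay"]}
        save(exec_id, "brief.json", brief)
    draft = f"Draft on {brief['topic']}"
    save(exec_id, "draft.md", draft)
    return draft

run_flow("7c1b", "Durable agents")
run_flow("9f2a", "Durable agents", reuse_brief_from="7c1b")
print(load("9f2a", "draft.md"))  # Draft on Durable agents
```

Note that the second execution never writes its own brief.json: its lineage points back at 7c1b, which is exactly the "brief.json · from 7c1b" relationship in the diagram above.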

What makes Kitaru unique

Feature | Kitaru | Temporal
Durable execution / recover after failure | Yes | Yes
Replay avoids re-executing completed work after failure | Yes | Yes
Prompt and token logging per kitaru.llm() call | Yes | Not supported
Versioned durable memory across runs | Yes | Not supported
Artifact lineage and run tracking across executions | Yes | Not supported
Opinionated stack abstraction for Kubernetes, AWS, GCP, Azure (configure once, every flow uses it) | Yes | Not supported
Durable human-in-the-loop waiting with no active compute | Yes | Yes
Polyglot SDKs (Go, Java, TypeScript, Ruby, PHP, .NET) | Not supported | Yes
Native cron scheduling and namespacing | Not supported | Yes
Execution inspection, logs, and lifecycle control | Yes | Yes

How the two surfaces map

Concept | Temporal | Kitaru
Workflow boundary | @workflow.defn | @flow
Durable step | Activity (non-deterministic) | @checkpoint (ordinary Python)
Determinism requirement | Workflow must be deterministic | No determinism requirement on flow body
Pause / resume | wait_condition + signal | kitaru.wait()
Invocation | Workflow ID + start via client | flow.run(), kitaru invoke, CLI, SDK, MCP, curl
Cross-run state | Bring your own store | kitaru.memory with scopes
Artifacts | Thread through Activity returns | Automatic per-checkpoint artifact capture

Code comparison

Kitaru (recommended)
import kitaru
from kitaru import checkpoint, flow

@checkpoint
def research(topic: str) -> str:
  return kitaru.llm(
      prompt=f"Research: {topic}. Return a brief.",
      model="fast",
  )

@checkpoint
def draft(brief: str) -> str:
  return kitaru.llm(
      prompt=f"Write a draft from this brief:\n{brief}",
      model="fast",
  )

@flow
def review_flow(topic: str) -> str:
  brief = research(topic)
  text = draft(brief)
  approved = kitaru.wait(
      name="approve_draft",
      question="Approve draft?",
      schema=bool,
  )
  return text if approved else "Rejected"

review_flow.run("Durable agents")
Temporal (Python SDK)
from datetime import timedelta
from temporalio import activity, workflow
from temporalio.client import Client
from temporalio.worker import Worker

@activity.defn
async def research(topic: str) -> str:
  # call_llm is your own LLM client helper; Temporal doesn't provide one
  return await call_llm(f"Research: {topic}")

@activity.defn
async def draft(brief: str) -> str:
  return await call_llm(f"Write a draft:\n{brief}")

@workflow.defn
class ReviewFlow:
  def __init__(self) -> None:
      self._approved: bool | None = None

  @workflow.signal
  def approve(self, ok: bool) -> None:
      self._approved = ok

  @workflow.run
  async def run(self, topic: str) -> str:
      brief = await workflow.execute_activity(
          research, topic, start_to_close_timeout=timedelta(minutes=5)
      )
      text = await workflow.execute_activity(
          draft, brief, start_to_close_timeout=timedelta(minutes=5)
      )
      await workflow.wait_condition(lambda: self._approved is not None)
      return text if self._approved else "Rejected"

# Run via: await client.execute_workflow(ReviewFlow.run, topic,
#   id=..., task_queue=...); approval arrives via
# client.get_workflow_handle(...).signal(ReviewFlow.approve, True).

The runtime layer underneath your Python agents

If your durability problem spans Go services, Java backends, and cron-scheduled ETL, Temporal is the tool. If it’s shaped like a Python agent (LLM calls, memory, tool outputs, humans in the loop), Kitaru removes the glue layer you’d otherwise write on top.

pip install kitaru
Book a demo