Research · 2026-03-15 · Synerthink

Why Every Run Needs an Artifact Bundle

Reliable AI systems need a portable record of execution, not just logs

Modern systems produce a lot of data about execution, but very little of it amounts to a record of the execution itself.

A run fails, and the evidence is split across logs, database rows, traces, and whatever state happened to persist before the crash. You can usually see pieces of the story. What you rarely get is one portable object that answers the basic questions: what ran, what it saw, what it did, and where it failed.

That gap matters more in AI systems. An agent run is not just a request with a response. It has inputs, intermediate decisions, tool calls, capability checks, state changes, and side effects. If you want reliable agents, you need more than telemetry. You need a stable record of the run itself.

The core problem

Most runtimes treat execution output as an afterthought.

Logs are useful, but they are only fragments. A trace may show timing and order, but not always the full input or final state. A database can store run history, but a few tables are not the same thing as a portable execution record.

This leads to a common failure mode: the runtime produces evidence, but not a unit of evidence. The costs are familiar:

Debugging failures takes reconstruction work.
Comparing two runs becomes harder than it should be.
Exporting a run to another machine is awkward.
Replay depends on best-effort logs instead of a stable contract.

Architectural insight

The key idea is simple: a run should emit an artifact, not just write records.

That artifact should be the first thing tooling reaches for. It should be the object you inspect, export, replay, or compare. It should exist whether the run succeeds or fails.

Mental model

intent -> execution -> artifact bundle
                       |
                       v
         inspect -> export -> replay -> compare

Once you adopt that model, several things get easier. Inspection is simpler because the operator knows what to look for. Replay becomes more practical because the input snapshot and trace are part of a stable output. Failures become easier to reason about because a failed run still leaves behind a partial artifact instead of scattered clues.

This fits naturally with Dotlanth. DotVM gives execution a deterministic boundary. DotDB gives the runtime a durable local store. Replayable compute needs stable traces and snapshots. The artifact bundle is the piece that turns those ideas into a usable interface.

The decision

In Dotlanth, every execution now produces an Artifact Bundle.

The bundle is the main interface for inspecting a run, exporting it, and replaying it. That rule applies to successful runs and failed runs. If a run stops halfway through, the bundle can be partial, but missing data must be explicit. The runtime should never silently skip part of the output and pretend the run is complete.

In practice, the bundle carries the parts of execution that matter most:

input snapshot
execution trace
outcome or error state
capability usage
versioned metadata for tooling

Simplified bundle shape

artifact-bundle/
  manifest.json
  input_snapshot.json
  trace.jsonl
  capability_report.json
  errors.json
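
As a rough illustration of the "partial but explicit" rule, here is a minimal sketch of a writer that produces this layout even when a run stops early. The file names follow the simplified shape above. The function `write_bundle` and the manifest fields `bundle_version`, `status`, `present`, and `missing` are assumptions made for this example, not Dotlanth's actual schema.

```python
import json
import tempfile
from pathlib import Path

# File names follow the simplified shape above; the manifest fields
# below are illustrative assumptions, not Dotlanth's real schema.
BUNDLE_FILES = {
    "input_snapshot": "input_snapshot.json",
    "trace": "trace.jsonl",
    "capability_report": "capability_report.json",
    "errors": "errors.json",
}


def write_bundle(root, run_id, status, parts):
    """Write the parts that exist and record the rest as explicitly missing."""
    root = Path(root)
    root.mkdir(parents=True, exist_ok=True)
    present, missing = [], []
    for name, filename in BUNDLE_FILES.items():
        data = parts.get(name)
        if data is None:
            missing.append(name)  # recorded in the manifest, never skipped silently
            continue
        path = root / filename
        if filename.endswith(".jsonl"):
            # The trace is stored as one JSON object per line.
            path.write_text("".join(json.dumps(event) + "\n" for event in data))
        else:
            path.write_text(json.dumps(data, indent=2))
        present.append(name)
    manifest = {
        "bundle_version": 1,
        "run_id": run_id,
        "status": status,
        "present": present,
        "missing": missing,
    }
    (root / "manifest.json").write_text(json.dumps(manifest, indent=2))
    return manifest


# A run that crashed before the capability report was produced.
with tempfile.TemporaryDirectory() as tmp:
    manifest = write_bundle(
        Path(tmp) / "artifact-bundle",
        run_id="run-001",
        status="failed",
        parts={
            "input_snapshot": {"prompt": "summarize report"},
            "trace": [{"step": 1, "event": "tool_call", "tool": "fetch"}],
            "errors": [{"step": 2, "message": "timeout"}],
        },
    )
    print(manifest["status"], manifest["missing"])
```

The point of the sketch is that the manifest is always written last and always lists what is absent, so a reader of a failed run can tell a missing capability report from one that was never expected.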

DotDB still matters here, but in a narrower role. It remains the local run index and stores references to bundles. The bundle is the portable execution artifact. DotDB is how the local machine finds it quickly.

Alternatives considered

Logs as the primary output

Logs help humans read a story after the fact, but they are not a strong interface for tools. They are too easy to make inconsistent across success and failure paths, and they do not give replay a stable contract.

Optional bundles

Optional artifacts sound flexible, but they quickly become unreliable. The moment bundles are best-effort, tooling has to assume they may not exist. That brings back the same ambiguity we are trying to remove.

Bundle-only storage

Bundle-only storage is cleaner in one sense, but local workflows still need indexing. Engineers want simple queries like “show me the last failed run” or “find the bundle for this run ID.” DotDB is still useful for that, so the current split is pragmatic: DotDB indexes runs locally, and the bundle is the thing you inspect or move.
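
The two queries quoted above could look roughly like this against a local index. This sketch uses SQLite from Python's standard library purely as a stand-in, since DotDB's interface is not shown in this article. The table name, column names, and sample rows are all assumptions.

```python
import sqlite3

# SQLite stands in for the local run index; the schema is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE runs (
        run_id      TEXT PRIMARY KEY,
        status      TEXT NOT NULL,   -- 'succeeded' or 'failed'
        finished_at TEXT NOT NULL,   -- ISO 8601 timestamp, sorts lexically
        bundle_path TEXT NOT NULL    -- reference to the bundle, not its content
    )
    """
)
conn.executemany(
    "INSERT INTO runs VALUES (?, ?, ?, ?)",
    [
        ("run-001", "succeeded", "2026-03-14T09:00:00Z", "bundles/run-001"),
        ("run-002", "failed", "2026-03-14T10:30:00Z", "bundles/run-002"),
        ("run-003", "failed", "2026-03-15T08:15:00Z", "bundles/run-003"),
    ],
)


def last_failed_run(conn):
    """Return (run_id, bundle_path) of the most recent failed run, or None."""
    return conn.execute(
        "SELECT run_id, bundle_path FROM runs "
        "WHERE status = 'failed' ORDER BY finished_at DESC LIMIT 1"
    ).fetchone()


def bundle_for(conn, run_id):
    """Return the bundle path for a run ID, or None if the run is unknown."""
    row = conn.execute(
        "SELECT bundle_path FROM runs WHERE run_id = ?", (run_id,)
    ).fetchone()
    return row[0] if row else None


print(last_failed_run(conn))
print(bundle_for(conn, "run-001"))
```

Note that the index row holds only a path and a few searchable fields. Everything needed to inspect or replay the run stays in the bundle, so the index can be rebuilt or left behind when a bundle moves to another machine.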

Consequences

The biggest benefit is clarity. Inspection, export, and replay now point at the same object. Tooling no longer has to depend directly on whatever internal tables or logs the runtime happened to produce.

Each run has a stable output.
Auditability improves because inputs and behavior travel together.
Replay has a stronger base than best-effort logs.

The tradeoffs are real too. A bundle contract forces schema decisions earlier. Once you publish a layout and version it, you need to maintain it carefully. It also raises the bar for failure handling because partial output has to be explicit and well-formed.
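
To make the maintenance cost concrete, here is a minimal sketch of the check a reader of bundles might run before trusting one. The field names `bundle_version` and `missing`, the version number, and the `UnsupportedBundle` exception are assumptions for illustration, not part of a published Dotlanth contract.

```python
SUPPORTED_BUNDLE_VERSIONS = {1}  # hypothetical version numbers


class UnsupportedBundle(Exception):
    """Raised when a manifest declares a version this tool cannot read."""


def check_manifest(manifest):
    """Validate a parsed manifest and return its list of explicitly missing parts."""
    version = manifest.get("bundle_version")
    if version not in SUPPORTED_BUNDLE_VERSIONS:
        raise UnsupportedBundle(f"unsupported bundle_version: {version!r}")
    if "missing" not in manifest:
        # An absent field is not the same as "nothing is missing".
        raise ValueError("manifest does not declare missing parts")
    return list(manifest["missing"])


print(check_manifest({"bundle_version": 1, "missing": ["capability_report"]}))
```

Failing loudly on an unknown version is one possible policy. It keeps older tools from misreading a newer layout, at the price of requiring a tool update whenever the version changes.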

There is also a security cost. Bundles may contain sensitive inputs, request metadata, and logs. They should be treated as private runtime artifacts unless they have been reviewed or scrubbed for sharing.
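
One way to approach scrubbing is a pass over the parsed JSON before a bundle is shared. The sketch below is a simplified assumption of what that could look like: the key names in `SENSITIVE_KEYS` and the `[REDACTED]` marker are hypothetical, and a real policy would come from review rather than a fixed list.

```python
import copy

# Hypothetical key names chosen for this example only.
SENSITIVE_KEYS = {"api_key", "authorization", "password", "token"}


def scrub(value):
    """Return a deep copy with values under sensitive keys replaced by a marker."""
    if isinstance(value, dict):
        return {
            key: "[REDACTED]" if key.lower() in SENSITIVE_KEYS else scrub(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [scrub(item) for item in value]
    return copy.deepcopy(value)


snapshot = {
    "prompt": "summarize report",
    "request": {"headers": {"Authorization": "Bearer abc123"}},
    "tools": [{"name": "fetch", "api_key": "secret"}],
}
print(scrub(snapshot))
```

Key-based redaction only catches secrets stored under known keys. It does not find sensitive values embedded in free text such as log lines or prompts, which is why review remains necessary before sharing.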

What this enables

This is about more than nicer debugging. It creates a better base for replayable execution, run comparison, and future tools that need a stable target instead of direct access to internal runtime storage.
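
Run comparison is a good example of what a stable target makes possible. A minimal sketch, assuming each bundle's trace is a JSON Lines file and that events can be compared for equality: find the first point where two runs diverge. The event fields shown are hypothetical, since this article does not specify the trace schema.

```python
import json
from itertools import zip_longest


def parse_trace(jsonl_text):
    """Parse trace.jsonl content into a list of event dicts."""
    return [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]


def first_divergence(trace_a, trace_b):
    """Return (index, event_a, event_b) at the first differing event, or None."""
    for index, (a, b) in enumerate(zip_longest(trace_a, trace_b)):
        if a != b:
            return index, a, b
    return None


# Hypothetical event fields for two runs of the same intent.
run_a = parse_trace(
    '{"step": 1, "event": "tool_call", "tool": "fetch"}\n'
    '{"step": 2, "event": "result", "ok": true}\n'
)
run_b = parse_trace(
    '{"step": 1, "event": "tool_call", "tool": "fetch"}\n'
    '{"step": 2, "event": "error", "message": "timeout"}\n'
    '{"step": 3, "event": "abort"}\n'
)

print(first_divergence(run_a, run_b))
```

When one trace is shorter, the missing side is reported as `None`, so a run that stopped early shows up as a divergence rather than a match.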

It also moves Dotlanth closer to something important for AI-native systems: execution that can explain itself after the fact. Reliable agents will need runs that can be inspected, exported, replayed, and audited without depending on live process state or one specific machine.

The artifact bundle gives each run that durable shape.

Closing

A runtime becomes easier to trust when every run leaves behind a usable artifact.

For Dotlanth, this is one of the first practical steps toward deterministic infrastructure for AI systems: every execution should leave behind enough structured evidence to be understood later.