Why Every Run Needs an Artifact Bundle
Reliable AI systems need a portable record of execution, not just logs
Modern systems produce a lot of data about execution, but very little of it amounts to a record of the execution itself.
A run fails, and the evidence is split across logs, database rows, traces, and whatever state happened to persist before the crash. You can usually see pieces of the story. What you rarely get is one portable object that answers the basic questions: what ran, what it saw, what it did, and where it failed.
That gap matters more in AI systems. An agent run is not just a request with a response. It has inputs, intermediate decisions, tool calls, capability checks, state changes, and side effects. If you want reliable agents, you need more than telemetry. You need a stable record of the run itself.
Most runtimes treat execution output as an afterthought.
Logs are useful, but they are only fragments. A trace may show timing and order, but not always the full input or final state. A database can store run history, but a few tables are not the same thing as a portable execution record.
This leads to a common failure mode: the runtime produces evidence, but not a unit of evidence.
The key idea is simple: a run should emit an artifact, not just write records.
That artifact should be the first thing tooling reaches for. It should be the object you inspect, export, replay, or compare. It should exist whether the run succeeds or fails.
Mental model
intent -> execution -> artifact bundle
                              |
                              v
        inspect -> export -> replay -> compare
Once you adopt that model, several things get easier. Inspection is simpler because the operator knows what to look for. Replay becomes more practical because the input snapshot and trace are part of a stable output. Failures become easier to reason about because a failed run still leaves behind a partial artifact instead of scattered clues.
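As a rough illustration of the compare step, here is a minimal sketch that finds the first point where two runs diverge. It assumes each trace is JSON Lines with one event per line. The event fields (`step`, `op`) and the function names are hypothetical, not Dotlanth's actual schema or API.

```python
import json
from itertools import zip_longest
from typing import Optional

# Sentinel used to pad the shorter trace, so a missing event never equals a real one.
_MISSING = object()


def parse_trace(jsonl_text: str) -> list:
    """Parse JSON Lines text into a list of events, skipping blank lines."""
    return [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]


def first_divergence(trace_a: list, trace_b: list) -> Optional[int]:
    """Return the index of the first differing event, or None if the traces match."""
    pairs = zip_longest(trace_a, trace_b, fillvalue=_MISSING)
    for index, (event_a, event_b) in enumerate(pairs):
        if event_a != event_b:
            return index
    return None


# Hypothetical events; the field names are illustrative only.
run_ok = parse_trace('{"step": 0, "op": "load"}\n{"step": 1, "op": "call_tool"}\n')
run_bad = parse_trace('{"step": 0, "op": "load"}\n{"step": 1, "op": "error"}\n')
```

Because both traces come from the same kind of artifact, the comparison needs no knowledge of logs or internal tables. A trace that simply ends early also counts as a divergence at the first missing event.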
This fits naturally with Dotlanth. DotVM gives execution a deterministic boundary. DotDB gives the runtime a durable local store. Replayable compute needs stable traces and snapshots. The artifact bundle is the piece that turns those ideas into a usable interface.
In Dotlanth, every execution now produces an Artifact Bundle.
The bundle is the main interface for inspecting a run, exporting it, and replaying it. That rule applies to successful runs and failed runs. If a run stops halfway through, the bundle can be partial, but missing data must be explicit. The runtime should never silently skip part of the output and pretend the run is complete.
In practice, the bundle carries the parts of execution that matter most:
Simplified bundle shape
artifact-bundle/
  manifest.json
  input_snapshot.json
  trace.jsonl
  capability_report.json
  errors.json
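To make the "partial but explicit" rule concrete, here is a minimal sketch of a writer that records every expected file in the manifest, whether or not the run produced it. The file names come from the simplified layout above. The manifest fields (`schema_version`, `run_id`, `status`, `files`, `state`) and the `write_bundle` helper are assumptions for illustration, not Dotlanth's real format.

```python
import json
import tempfile
from pathlib import Path

# File names follow the simplified layout; everything else here is hypothetical.
BUNDLE_FILES = [
    "input_snapshot.json",
    "trace.jsonl",
    "capability_report.json",
    "errors.json",
]


def write_bundle(bundle_dir: Path, run_id: str, status: str, parts: dict) -> dict:
    """Write the parts that exist and account for every expected file.

    `parts` maps a file name to its text content. A file the run never
    produced is listed as "missing" instead of being silently skipped.
    """
    bundle_dir.mkdir(parents=True, exist_ok=True)
    entries = {}
    for name in BUNDLE_FILES:
        if name in parts:
            (bundle_dir / name).write_text(parts[name], encoding="utf-8")
            entries[name] = {"state": "present"}
        else:
            entries[name] = {"state": "missing"}
    manifest = {
        "schema_version": 1,  # assumed version field
        "run_id": run_id,
        "status": status,
        "files": entries,
    }
    manifest_text = json.dumps(manifest, indent=2)
    (bundle_dir / "manifest.json").write_text(manifest_text, encoding="utf-8")
    return manifest


# A run that failed before it produced a capability report.
demo_dir = Path(tempfile.mkdtemp()) / "artifact-bundle"
demo_manifest = write_bundle(
    demo_dir,
    run_id="run-001",
    status="failed",
    parts={
        "input_snapshot.json": json.dumps({"prompt": "hello"}),
        "trace.jsonl": json.dumps({"step": 0, "op": "load"}) + "\n",
        "errors.json": json.dumps([{"message": "tool call timed out"}]),
    },
)
```

In this sketch the failed run still yields a well-formed bundle, and a reader of `manifest.json` can tell that `capability_report.json` is absent by design of the failure, not by accident.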
DotDB still matters here, but in a narrower role. It remains the local run index and stores references to bundles. The bundle is the portable execution artifact. DotDB is how the local machine finds it quickly.
Logs help humans read a story after the fact, but they are not a strong interface for tools. They are too easy to make inconsistent across success and failure paths, and they do not give replay a stable contract.
Optional artifacts sound flexible, but they quickly become unreliable. The moment bundles are best-effort, tooling has to assume they may not exist. That brings back the same ambiguity we are trying to remove.
Bundle-only storage is cleaner in one sense, but local workflows still need indexing. Engineers want simple queries like “show me the last failed run” or “find the bundle for this run ID.” DotDB is still useful for that, so the current split is pragmatic: DotDB indexes runs locally, and the bundle is the thing you inspect or move.
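The split between index and artifact can be sketched with a small local table. This example uses Python's built-in `sqlite3` purely as a stand-in, since DotDB's actual interface is not described here. The table, column names, and sample rows are hypothetical.

```python
import sqlite3

# sqlite3 stands in for DotDB; the schema below is an assumption for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE runs (
        run_id      TEXT PRIMARY KEY,
        status      TEXT NOT NULL,
        finished_at TEXT NOT NULL,  -- ISO 8601 UTC strings sort chronologically
        bundle_path TEXT NOT NULL   -- a reference to the bundle, not its contents
    )
    """
)
conn.executemany(
    "INSERT INTO runs VALUES (?, ?, ?, ?)",
    [
        ("run-001", "succeeded", "2024-01-01T10:00:00Z", "bundles/run-001"),
        ("run-002", "failed", "2024-01-01T11:00:00Z", "bundles/run-002"),
        ("run-003", "failed", "2024-01-01T12:00:00Z", "bundles/run-003"),
    ],
)


def last_failed_run(db: sqlite3.Connection):
    """Return (run_id, bundle_path) for the most recent failed run, or None."""
    return db.execute(
        "SELECT run_id, bundle_path FROM runs "
        "WHERE status = 'failed' ORDER BY finished_at DESC LIMIT 1"
    ).fetchone()


def bundle_for(db: sqlite3.Connection, run_id: str):
    """Return the bundle path recorded for a run ID, or None if unknown."""
    row = db.execute(
        "SELECT bundle_path FROM runs WHERE run_id = ?", (run_id,)
    ).fetchone()
    return row[0] if row else None
```

The index only answers "which bundle?". Everything worth inspecting or moving lives in the bundle that the path points to.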
The biggest benefit is clarity. Inspection, export, and replay now point at the same object. Tooling no longer has to depend directly on whatever internal tables or logs the runtime happened to produce.
The tradeoffs are real too. A bundle contract forces schema decisions earlier. Once you publish a layout and version it, you need to maintain it carefully. It also raises the bar for failure handling because partial output has to be explicit and well-formed.
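One way to picture what "explicit and well-formed" could demand from tooling is a small manifest check. The sketch below assumes a hypothetical manifest with a `schema_version` number and a `files` map in which every expected file carries a `state` of `present` or `missing`. None of these field names are confirmed Dotlanth conventions.

```python
# Hypothetical contract: the versions this tool understands and the files it expects.
SUPPORTED_SCHEMA_VERSIONS = {1}
EXPECTED_FILES = (
    "input_snapshot.json",
    "trace.jsonl",
    "capability_report.json",
    "errors.json",
)
VALID_STATES = {"present", "missing"}


def validate_manifest(manifest: dict) -> list:
    """Return a list of problems; an empty list means the manifest is well-formed."""
    problems = []
    version = manifest.get("schema_version")
    if version not in SUPPORTED_SCHEMA_VERSIONS:
        problems.append(f"unsupported schema_version: {version!r}")
    files = manifest.get("files", {})
    for name in EXPECTED_FILES:
        entry = files.get(name)
        if entry is None:
            # The file is neither present nor declared missing: a silent gap.
            problems.append(f"{name}: not accounted for in manifest")
        elif entry.get("state") not in VALID_STATES:
            problems.append(f"{name}: unknown state {entry.get('state')!r}")
    return problems


# A partial bundle that declares its gap passes the check.
partial_ok = {
    "schema_version": 1,
    "files": {
        "input_snapshot.json": {"state": "present"},
        "trace.jsonl": {"state": "present"},
        "capability_report.json": {"state": "missing"},
        "errors.json": {"state": "present"},
    },
}

# A bundle that silently omits a file does not.
silently_incomplete = {
    "schema_version": 1,
    "files": {
        "input_snapshot.json": {"state": "present"},
        "trace.jsonl": {"state": "present"},
        "errors.json": {"state": "present"},
    },
}
```

Under these assumptions, a partial bundle is valid as long as it says what is missing, while an undeclared gap or an unknown schema version is reported as a problem.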
There is also a security cost. Bundles may contain sensitive inputs, request metadata, and logs. They should be treated as private runtime artifacts unless they have been reviewed or scrubbed for sharing.
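As one possible scrubbing step before a bundle is shared, the sketch below redacts values by key name in JSON-like data. The key list and the `scrub` function are hypothetical, and key-based redaction does not catch secrets embedded in free text, so it complements review and does not replace it.

```python
# Hypothetical list of key names to treat as sensitive (compared case-insensitively).
SENSITIVE_KEYS = {"authorization", "api_key", "password", "token"}
REDACTED = "[REDACTED]"


def scrub(value):
    """Return a copy of JSON-like data with values under sensitive keys replaced."""
    if isinstance(value, dict):
        return {
            key: REDACTED if key.lower() in SENSITIVE_KEYS else scrub(inner)
            for key, inner in value.items()
        }
    if isinstance(value, list):
        return [scrub(item) for item in value]
    return value


# Illustrative input snapshot; the structure is made up for this example.
snapshot = {
    "prompt": "summarize the report",
    "request": {"headers": {"Authorization": "Bearer abc123", "Accept": "application/json"}},
    "tools": [{"name": "search", "api_key": "secret-value"}],
}
shareable = scrub(snapshot)
```

The function builds a new structure and leaves the original untouched, so the private bundle on disk can stay complete while only the scrubbed copy is exported.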
This is about more than nicer debugging. It creates a better base for replayable execution, run comparison, and future tools that need a stable target instead of direct access to internal runtime storage.
It also moves Dotlanth closer to something important for AI-native systems: execution that can explain itself after the fact. Reliable agents will need runs that can be inspected, exported, replayed, and audited without depending on live process state or one specific machine.
The artifact bundle gives each run that durable shape.
A runtime becomes easier to trust when every run leaves behind a usable artifact.
For Dotlanth, this is one of the first practical steps toward deterministic infrastructure for AI systems: every execution should leave behind enough structured evidence to be understood later.