Schema

2026-06-08

Dataset security policy contract. A workflow declares the security posture of the rows it projects, and each dataset stores its resolved policy in the manifest. Additive only; the TraceRecord wire shape is unchanged.

Read the full schema documentation for design rationale and usage guides, or see contributing to the schema to propose changes.

TraceRecord

Root record. One per session, one JSONL line.

field                 type              req  description
schema_version        string            req  e.g. "0.7.0"
trace_id              string            req  UUID for this trace
session_id            string            req  Agent's native session ID
content_hash          string                 SHA-256 hex of the serialized record, used for cross-contributor dedup at upload time. Unchanged by 0.3.0.
timestamp_start       string                 ISO 8601 start
timestamp_end         string                 ISO 8601 end
task                  Task                   Task metadata
agent                 Agent             req  Agent identity
environment           Environment            OS, shell, VCS, languages
system_prompts        dict                   Deduplicated prompts keyed by hash
tool_definitions      dict[]                 Available tool schemas
steps                 Step[]                 TAO-loop steps
outcome               Outcome                Session outcome
dependencies          string[]               Project dependencies
metrics               Metrics                Aggregated metrics
security              SecurityMetadata       Security tier and redactions
attribution           Attribution            Code attribution (experimental)
metadata              dict                   Extensible key-value pairs
execution_context     string | null          "devtime" (code-editing agent) or "runtime" (action-trajectory / RL agent). Null for pre-0.2 traces.
lifecycle             string                 "provisional" (pre-commit-correlation) or "final" (revision-anchored). Default "provisional".
git_links             GitLink[]              Evidence-graded links to commits/revisions this trace contributed to.
generation_index      int                    Monotonic per-session_id generation counter. Consumers resolving "latest" should group by session_id and take max(generation_index).
context_tree_summary  dict                   Summary of Context Tree capture: node_count, layer_count, active_path_leaf_id, capture_limitations.
patches               Patch[]                Authoritative dev-time output set. One Patch per tool-produced change/hunk.

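
Since each TraceRecord is one JSONL line and generation_index is monotonic per session_id, resolving the latest record per session can be sketched roughly as follows. The function names iter_traces and latest_per_session are hypothetical, and treating a missing generation_index as 0 is an assumption, not something this reference specifies.

```python
import json
from typing import Iterable, Iterator


def iter_traces(lines: Iterable[str]) -> Iterator[dict]:
    """Parse one TraceRecord per non-empty JSONL line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def latest_per_session(records: Iterable[dict]) -> dict[str, dict]:
    """Keep, for each session_id, the record with the highest generation_index."""
    latest: dict[str, dict] = {}
    for rec in records:
        sid = rec["session_id"]
        # Assumption: a missing or null generation_index counts as 0.
        gen = rec.get("generation_index") or 0
        best = latest.get(sid)
        if best is None or gen > (best.get("generation_index") or 0):
            latest[sid] = rec
    return latest
```

With a file on disk, this could be used as latest_per_session(iter_traces(open(path))), where path is whatever JSONL file holds the traces.
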
Task

Task metadata for filtering and grouping.

field           type    description
description     string  What the task is
source          string  user_prompt, cli_arg, skill, etc.
repository      string  owner/repo format
base_commit     string  Starting commit SHA
repository_url  string  Canonical remote URL, e.g. https://github.com/org/repo

Agent

Agent identity.

field    type    req  description
name     string  req  claude-code, cursor, codex, etc.
version  string       Agent version
model    string       provider/model-name

Environment

Runtime context.

field               type      description
os                  string    darwin, linux, etc.
shell               string    zsh, bash, etc.
vcs                 VCS       type, base_commit, branch, diff
language_ecosystem  string[]  python, typescript, etc.

Step

One LLM API call in the TAO loop.

field                    type           req  description
step_index               int            req  Sequential index
role                     string         req  system | user | agent
content                  string              Message content
reasoning_content        string              Chain-of-thought
model                    string              Model for this step
system_prompt_hash       string              Key into system_prompts
agent_role               string              main, explore, plan, etc.
parent_step              int                 Parent step index
call_type                string              main | subagent | warmup
subagent_trajectory_ref  string              Sub-agent session ID
tools_available          string[]            Available tool names
tool_calls               ToolCall[]          Tool invocations
observations             Observation[]       Tool results
snippets                 Snippet[]           Extracted code blocks
token_usage              TokenUsage          Token breakdown
timestamp                string              ISO 8601
context_node_id          string | null       Context Tree node id for the model view at this step.

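
Because system_prompts is deduplicated at the record level and each step only carries system_prompt_hash, a consumer has to join the two to recover the prompt text. A minimal sketch, assuming the dict values are the prompt strings (as in the Example on this page); the helper name system_prompt_for_step is hypothetical.

```python
from typing import Optional


def system_prompt_for_step(trace: dict, step: dict) -> Optional[str]:
    """Return the prompt text a step points to, or None if it cannot be resolved."""
    key = step.get("system_prompt_hash")
    if key is None:
        return None
    # system_prompts maps hash -> prompt text and is stored once per trace.
    return (trace.get("system_prompts") or {}).get(key)
```

Returning None rather than raising keeps the helper usable on user steps, which typically carry no system_prompt_hash.
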
ToolCall

A tool invocation within a step.

field         type    req  description
tool_call_id  string  req  ID for linking to observations
tool_name     string  req  Tool name
input         dict         Input parameters
duration_ms   int          Wall-clock time

Observation

Tool result linked to its ToolCall.

field           type    req  description
source_call_id  string  req  Links to ToolCall
content         string       Full output
output_summary  string       Lightweight preview
error           string       Error info if failed

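
The link between a ToolCall and its Observation is the pair tool_call_id / source_call_id. The sketch below joins them within a single step. It assumes the observation is stored on the same step as the call (as in the Example on this page) and that each call has at most one observation; neither is stated as a guarantee here. The function name is hypothetical.

```python
from typing import Optional


def pair_calls_with_observations(step: dict) -> list[tuple[dict, Optional[dict]]]:
    """Match each ToolCall in a step to the Observation that references it."""
    # Index observations by the tool call they point back to.
    by_call_id = {
        obs["source_call_id"]: obs for obs in step.get("observations") or []
    }
    # Calls without a matching observation are paired with None.
    return [
        (call, by_call_id.get(call["tool_call_id"]))
        for call in step.get("tool_calls") or []
    ]
```
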
TokenUsage

Per-step token breakdown.

field                type  description
input_tokens         int   Input tokens
output_tokens        int   Output tokens
cache_read_tokens    int   From cache
cache_write_tokens   int   Written to cache
prefix_reuse_tokens  int   Via prefix caching

Outcome

Session outcome for reward modeling.

field              type           description
success            boolean        Goal achieved
signal_source      string         Default: "deterministic"
signal_confidence  string         derived | inferred | annotated
description        string         Outcome description
committed          boolean        Changes committed to git
commit_sha         string         Commit SHA
terminal_state     string | null  "goal_reached", "interrupted", "error", or "abandoned". Meaningful for runtime agents.
reward             float | null   Numeric reward signal from an RL environment or evaluator.
reward_source      string | null  Canonical values: "rl_environment", "judge", "human_annotation", "orchestrator".

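
For reward modeling, a consumer usually needs one scalar per trace. The sketch below shows one possible convention, not one defined by the schema: prefer the explicit reward when present, otherwise fall back to 1.0 or 0.0 from success. Both the fallback and the name reward_label are assumptions made for illustration.

```python
from typing import Optional


def reward_label(trace: dict) -> Optional[float]:
    """Map a trace's Outcome to a scalar label, or None if no signal is present."""
    outcome = trace.get("outcome") or {}
    # Prefer an explicit numeric reward when one was recorded.
    if outcome.get("reward") is not None:
        return float(outcome["reward"])
    # Assumption: fall back to 1.0 / 0.0 from the boolean success flag.
    if outcome.get("success") is not None:
        return 1.0 if outcome["success"] else 0.0
    return None
```

A stricter pipeline might also filter on signal_confidence or reward_source before trusting the label.
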
Attribution

Code attribution (experimental).

field              type               description
experimental       boolean            Always true in v0.1.x
files              AttributionFile[]  Per-file line ranges
revision           dict               Pins this block to a revision. Keys: vcs_type ('git'|'jj'), revision.
unaccounted_files  string[]           Files changed at commit time with no tracked Edit/Write source (e.g. Bash sed edits). Low confidence.

Metrics

Session-level aggregates.

field                        type   description
total_steps                  int    Step count
total_input_tokens           int    Sum of input tokens
total_output_tokens          int    Sum of output tokens
total_duration_s             float  Wall-clock seconds
cache_hit_rate               float  0.0 to 1.0
estimated_cost_usd           float  Estimated cost
total_cache_read_tokens      int    Session-level prompt-cache read aggregate.
total_cache_creation_tokens  int    Session-level prompt-cache write aggregate.

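
How producers derive these aggregates is not spelled out on this page. One plausible reading is a plain sum over each step's token_usage, sketched below. The mapping of total_cache_creation_tokens to cache_write_tokens is an assumption based on the "write aggregate" description, and aggregate_token_metrics is a hypothetical name.

```python
def aggregate_token_metrics(steps: list[dict]) -> dict[str, int]:
    """Sum per-step TokenUsage into session-level Metrics counters."""
    # Assumed mapping from each Metrics field to the TokenUsage field it sums.
    mapping = {
        "total_input_tokens": "input_tokens",
        "total_output_tokens": "output_tokens",
        "total_cache_read_tokens": "cache_read_tokens",
        "total_cache_creation_tokens": "cache_write_tokens",
    }
    totals = {name: 0 for name in mapping}
    for step in steps:
        # Steps without token_usage (e.g. user turns) contribute nothing.
        usage = step.get("token_usage") or {}
        for total_name, usage_name in mapping.items():
            totals[total_name] += usage.get(usage_name) or 0
    totals["total_steps"] = len(steps)
    return totals
```
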
SecurityMetadata

Security scan summary. Detailed tool output lives under metadata.security.

field               type           description
scanned             boolean        Whether security processing was applied to this record.
flags_reviewed      int            Number of security flags reviewed.
redactions_applied  int            Number of redactions applied.
classifier_version  string | null  Classifier tool version when classifier ran.

AttributionRange

A range of lines attributed to an agent conversation.

field         type    req  description
start_line    int     req  First attributed line (1-indexed).
end_line      int     req  Last attributed line (inclusive).
content_hash  string       murmur3:<32-hex> for cross-refactor tracking.
confidence    string       high | medium | low.
change_type   string       "addition", "modification", or "deletion". Default "addition".
original      dict         Pre-divergence state when a formatter/human rewrote agent output. Keys: start_line, end_line, content_hash.
contributor   dict         Per-range contributor override (used when the enclosing conversation is 'mixed').

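
Since start_line is 1-indexed and end_line is inclusive, converting a range to a zero-based slice needs an adjustment on the start only. A small sketch, with lines_in_range as a hypothetical helper name:

```python
def lines_in_range(file_text: str, attr_range: dict) -> list[str]:
    """Return the lines of file_text covered by an AttributionRange."""
    lines = file_text.splitlines()
    start = attr_range["start_line"]
    end = attr_range["end_line"]
    # 1-indexed inclusive [start, end] maps to the 0-indexed slice [start - 1, end).
    return lines[start - 1 : end]
```
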
AttributionConversation

Links attributed code ranges to the conversation that produced them.

field        type                description
contributor  dict                e.g. {type: 'ai', model_id: 'anthropic/claude-sonnet-4'}
url          string              opentraces://trace_id/step_N
ids          dict                Provider-native conversation ids. e.g. {anthropic: 'msg_01xyz', openai: ['resp_1', 'resp_2']}
related      dict[]              Links to broader resources. Each entry: {type, url}. e.g. {type: 'plan', url: 'opentraces://t/plan_3'}
ranges       AttributionRange[]  Attributed line ranges.

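
The url field follows the pattern opentraces://trace_id/step_N. A rough parser is sketched below, assuming N is the decimal step_index and that trace_id contains no slash; URLs that point at other resources (such as the plan_3 form shown under related) are returned as None. The name parse_step_url is hypothetical.

```python
import re
from typing import Optional

# Assumed shape: opentraces://<trace_id>/step_<decimal step index>
_STEP_URL = re.compile(r"^opentraces://(?P<trace_id>[^/]+)/step_(?P<step>\d+)$")


def parse_step_url(url: str) -> Optional[tuple[str, int]]:
    """Split a step URL into (trace_id, step_index), or None if it does not match."""
    match = _STEP_URL.match(url)
    if match is None:
        return None
    return match.group("trace_id"), int(match.group("step"))
```
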
GitAnchor

Typed link from a Patch to its appearance in Git.

field              type           req  description
last_searched_at   string         req  ISO 8601 timestamp set after the first maturation search.
found              boolean        req  Whether a matching commit was found.
commit_sha         string | null       Matched commit SHA when found.
path               string | null       Path in the commit; may differ after rename.
blob_sha           string | null       Matched Git blob SHA.
git_patch_id       string | null       Git patch-id, stable across rebase.
evidence_tier      string | null       Evidence match label such as exact_range_hash, patch_id, formatter_divergent, overlapping_hunk, or orphan.
evidence_firmness  string | null       Firmness label such as firm_observed, provisional, human_asserted, or unknown.

Patch

A trace-produced change. Full patch history resolves through the bucket Trail companion.

field               type              req  description
patch_id            string            req  Content-addressed trace patch id.
file_path           string            req  Path at creation time.
step_index          int | null             Producing step index.
tool_call_id        string | null          Producing tool call id.
capture_method      string[]               Capture methods such as hook_pretooluse, hook_posttooluse, watcher_backstop.
snapshot_before_id  string | null          Before snapshot id.
snapshot_after_id   string | null          After snapshot id.
anchor              GitAnchor | null       Git match when the patch matures into a commit.
superseded_by       string[]               Commit supersede chain after amend/rebase/squash.
limitations         string[]               Capture quality flags.

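
A patch carries a GitAnchor only once it has been searched for in Git, and anchor.found says whether a commit matched. Selecting the patches that matured into commits could look like the sketch below; matured_patches is a hypothetical helper, and treating a null anchor as "not matured" is an assumption.

```python
def matured_patches(trace: dict) -> list[dict]:
    """Return patches whose GitAnchor reports a matching commit."""
    return [
        patch
        for patch in trace.get("patches") or []
        # Assumption: a missing or null anchor means the patch has not matured.
        if (patch.get("anchor") or {}).get("found")
    ]
```
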
Example

{
  "schema_version": "0.7.0",
  "trace_id": "a4f2b8c1-e2d3-4f5a-b6c7-d8e9f0a1b2c3",
  "session_id": "sess_0x8f2a1b3c",
  "content_hash": "e3b0c44298fc1c14...",
  "timestamp_start": "2026-03-27T14:30:00Z",
  "task": {
    "description": "Add input validation to the signup form",
    "repository": "acme/webapp",
    "base_commit": "a1b2c3d4"
  },
  "agent": {
    "name": "claude-code",
    "version": "1.0.32",
    "model": "anthropic/claude-sonnet-4-20250514"
  },
  "environment": {
    "os": "darwin",
    "shell": "zsh",
    "vcs": { "type": "git", "branch": "main" },
    "language_ecosystem": ["typescript"]
  },
  "system_prompts": {
    "abc123": "You are Claude Code..."
  },
  "steps": [
    {
      "step_index": 0,
      "role": "user",
      "content": "Add Zod validation to the signup form"
    },
    {
      "step_index": 1,
      "role": "agent",
      "content": "I'll add Zod validation...",
      "model": "anthropic/claude-sonnet-4-20250514",
      "system_prompt_hash": "abc123",
      "agent_role": "main",
      "call_type": "main",
      "tool_calls": [{
        "tool_call_id": "tc_001",
        "tool_name": "Edit",
        "input": { "file_path": "src/signup.tsx" },
        "duration_ms": 120
      }],
      "observations": [{
        "source_call_id": "tc_001",
        "output_summary": "Added Zod schema to signup form",
        "content": "File edited successfully"
      }],
      "token_usage": {
        "input_tokens": 4200,
        "output_tokens": 1800,
        "cache_read_tokens": 3800,
        "prefix_reuse_tokens": 3800
      }
    }
  ],
  "outcome": {
    "success": true,
    "signal_source": "deterministic",
    "signal_confidence": "derived",
    "committed": true,
    "commit_sha": "f5e6d7c8"
  },
  "metrics": {
    "total_steps": 2,
    "total_input_tokens": 8400,
    "total_output_tokens": 1800,
    "cache_hit_rate": 0.9,
    "estimated_cost_usd": 0.24
  },
  "security": { "tier": 2, "redactions_applied": 1 }
}