Standards Alignment

opentraces sits at the intersection of four public standards. It adopts what works from each, and bridges the gap between trajectory (process) and attribution (output).

ATIF / Harbor (v1.6)

github.com/laude-institute/harbor

A training trajectory serialization format for agent research. Defines the step-based TAO (Thought-Action-Observation) loop, with fields for token IDs, logprobs, and reward signals designed for RL and SFT pipelines.

Relationship: opentraces is a superset of ATIF. We adopt the step-based model, role conventions (system | user | agent), and field patterns. We add attribution blocks, per-step token breakdowns, environment metadata, dependency tracking, and security metadata. The downstream field mappings live in packages/opentraces-schema/FIELD-MAPPINGS.md; serializers are internal building blocks for dataset workflows rather than a supported top-level client workflow. The ATIF serializer itself lives in src/opentraces/publish/atif.py and is invoked by dataset workflows; there is no opentraces export CLI verb.

ADP (Agent Data Protocol)

arxiv.org/abs/2410.10762

An interlingua for normalizing diverse agent trace formats into a common structure for training. Proposes a universal adapter layer so each dataset and each agent only needs one converter, O(D+A), instead of pairwise mappings, O(D*A).

Relationship: opentraces' adapter-based normalization follows the same pattern. Per-agent parsers are ADP-style adapters outputting the enriched schema.

Agent Trace (Cursor/community, v0.1.0 RFC)

github.com/cursor/agent-trace

A code attribution spec (CC BY 4.0) that records which lines of code came from which agent conversation, at file/line granularity. Backed by 10+ sponsors (Cloudflare, Vercel, Google Jules, Cognition).

Relationship: opentraces embeds Agent Trace attribution blocks directly in the trace record. Agent Trace focuses on output (code attribution), opentraces bridges that with process (trajectory).

Agent Trace RFCs adopted (schema 0.3.0)

RFC	Topic	Where it lands
#5	`original` pre-processing snapshot on divergent ranges	`AttributionRange.original`
#9	Provider-native conversation IDs	`AttributionConversation.ids`
#11	`change_type` on ranges	`AttributionRange.change_type`
#16	Baseline `related` resource vocabulary	`AttributionConversation.related`
#22	Canonical `repository_url`	`Task.repository_url`
#25	Lifecycle / revision-pinning	`TraceRecord.lifecycle`, `Attribution.revision`
#26	`unaccounted_files` for non-tool edits	`Attribution.unaccounted_files`
#27	Evidence-graded commit linking	`TraceRecord.git_links[]`, `GitLink.tier`

Adoption is additive — pre-0.3.0 traces validate unchanged.

The Agent Trace v0.1.0 serializer lives in src/opentraces/publish/agent_trace.py and can be called by dataset workflows that need to emit Agent Trace-shaped rows.

OTel GenAI Semantic Conventions

opentelemetry.io/docs/specs/semconv/gen-ai

OpenTelemetry's GenAI semantic conventions define standardized span attributes for LLM calls in observability pipelines, covering model names, token counts, and request metadata.

Relationship: opentraces' per-step token usage and model fields align with OTel GenAI conventions, enabling cross-referencing between observability spans and training trajectories.

The Core Insight

Agent Trace preserves which lines came from AI. ATIF/ADP preserve how the agent reasoned. Neither alone tells the complete story. opentraces connects the full conversation trajectory to the specific code output at line granularity.

Message Taxonomy

opentraces adopts a training-oriented message taxonomy:

Role	Description
`system`	System prompt (deduplicated by hash)
`user`	User message / prompt
`agent`	Agent response, tool calls, or thinking

Agent steps are further classified by call_type (main, subagent, warmup) and agent_role (main, explore, plan).

●HUMAN ○MACHINE