trace
Trace data model for agentic eval runs.
Classes
TurnTrace
dataclass
TurnTrace(turn_index: int, input_tokens: int = 0, output_tokens: int = 0, tool_result_tokens: int = 0, tools_called: List[str] = list(), tool_latencies_s: Dict[str, float] = dict(), wall_clock_s: float = 0.0, error: Optional[str] = None, gpu_energy_joules: Optional[float] = None, cpu_energy_joules: Optional[float] = None, gpu_power_avg_watts: Optional[float] = None, cpu_power_avg_watts: Optional[float] = None, cost_usd: Optional[float] = None, action_energy_breakdown: Optional[List[Dict[str, Any]]] = None)
Per-turn telemetry data.
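The rendered defaults `list()` and `dict()` in the signature above most likely stand for per-instance default factories, since a dataclass cannot share one mutable default across instances. The sketch below illustrates that behavior with `TurnTraceSketch`, a hypothetical stand-in that mirrors only a subset of the documented fields; it is not the library's class.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class TurnTraceSketch:
    """Hypothetical stand-in mirroring a subset of the TurnTrace fields."""

    turn_index: int
    input_tokens: int = 0
    output_tokens: int = 0
    # Assumption: the real class uses default factories for its containers.
    tools_called: List[str] = field(default_factory=list)
    tool_latencies_s: Dict[str, float] = field(default_factory=dict)
    wall_clock_s: float = 0.0
    error: Optional[str] = None
    gpu_energy_joules: Optional[float] = None


# A turn that called one tool; the values are made up for illustration.
turn = TurnTraceSketch(
    turn_index=0,
    input_tokens=512,
    output_tokens=128,
    tools_called=["search"],
    tool_latencies_s={"search": 0.42},
    wall_clock_s=1.9,
)

# A turn built from defaults gets its own empty containers.
other = TurnTraceSketch(turn_index=1)
```

Only `turn_index` is required; the optional energy, power, and cost fields stay `None` unless a measurement is recorded.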
QueryTrace
dataclass
QueryTrace(query_id: str, workload_type: str, query_text: str = '', response_text: str = '', turns: List[TurnTrace] = list(), total_wall_clock_s: float = 0.0, completed: bool = False, timed_out: bool = False, query_gpu_energy_joules: Optional[float] = None, query_cpu_energy_joules: Optional[float] = None, query_gpu_power_avg_watts: Optional[float] = None, query_cpu_power_avg_watts: Optional[float] = None, is_resolved: Optional[bool] = None, query_mbu_avg_pct: Optional[float] = None, query_mbu_max_pct: Optional[float] = None)
Per-query aggregate telemetry.
Attributes
avg_gpu_power_watts
property
Mean GPU power in watts across turns; falls back to the query-level value, query_gpu_power_avg_watts.
avg_cpu_power_watts
property
Mean CPU power in watts across turns; falls back to the query-level value, query_cpu_power_avg_watts.
throughput_tokens_per_sec
property
Output tokens per second; None if zero tokens or zero time.
energy_per_token_joules
property
GPU energy per output token; None if no energy data or zero tokens.
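The derived attributes above could be computed roughly as follows. This is a sketch under stated assumptions, not the library's implementation: `Turn` and `Query` are hypothetical stand-ins, output tokens are assumed to be summed over turns, throughput is assumed to divide by `total_wall_clock_s`, and the energy figure is assumed to prefer per-turn readings before falling back to the query-level one.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Turn:
    """Hypothetical stand-in for a per-turn record."""

    output_tokens: int = 0
    gpu_energy_joules: Optional[float] = None
    gpu_power_avg_watts: Optional[float] = None


@dataclass
class Query:
    """Hypothetical stand-in for a per-query record."""

    turns: List[Turn] = field(default_factory=list)
    total_wall_clock_s: float = 0.0
    query_gpu_energy_joules: Optional[float] = None
    query_gpu_power_avg_watts: Optional[float] = None

    @property
    def total_output_tokens(self) -> int:
        # Assumption: token counts are aggregated across all turns.
        return sum(t.output_tokens for t in self.turns)

    @property
    def avg_gpu_power_watts(self) -> Optional[float]:
        readings = [
            t.gpu_power_avg_watts
            for t in self.turns
            if t.gpu_power_avg_watts is not None
        ]
        if readings:
            return sum(readings) / len(readings)
        # No per-turn readings: fall back to the query-level value.
        return self.query_gpu_power_avg_watts

    @property
    def throughput_tokens_per_sec(self) -> Optional[float]:
        tokens = self.total_output_tokens
        if tokens == 0 or self.total_wall_clock_s <= 0:
            return None
        return tokens / self.total_wall_clock_s

    @property
    def energy_per_token_joules(self) -> Optional[float]:
        tokens = self.total_output_tokens
        per_turn = [
            t.gpu_energy_joules
            for t in self.turns
            if t.gpu_energy_joules is not None
        ]
        # Assumption: per-turn energy is preferred over the query-level total.
        energy = sum(per_turn) if per_turn else self.query_gpu_energy_joules
        if energy is None or tokens == 0:
            return None
        return energy / tokens


# Two measured turns, then a query with only a query-level power reading.
q = Query(
    turns=[Turn(100, 50.0, 200.0), Turn(100, 150.0, 300.0)],
    total_wall_clock_s=4.0,
)
empty = Query(query_gpu_power_avg_watts=180.0)
```

Returning `None` instead of `0.0` for the missing-data cases keeps "not measured" distinguishable from "measured as zero" when traces are aggregated later.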
Functions
save_jsonl
Append this trace as a JSONL line.
load_jsonl
classmethod
load_jsonl(path: Path) -> List[QueryTrace]
Load traces from a JSONL file.
Source code in src/openjarvis/evals/core/trace.py
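An append-then-load round trip in the JSONL style described above might look like the sketch below. `Turn` and `Query` are hypothetical stand-ins with a reduced field set, and the `path` parameter on `save_jsonl` is an assumption, since its signature is not shown here. Serialization via `dataclasses.asdict` and `json` is one plausible approach, not necessarily the one the library uses.

```python
import json
import tempfile
from dataclasses import asdict, dataclass, field
from pathlib import Path
from typing import List


@dataclass
class Turn:
    """Hypothetical stand-in for a per-turn record."""

    turn_index: int
    output_tokens: int = 0


@dataclass
class Query:
    """Hypothetical stand-in for a per-query record."""

    query_id: str
    workload_type: str
    turns: List[Turn] = field(default_factory=list)

    def save_jsonl(self, path: Path) -> None:
        # Append mode, so repeated calls accumulate one JSON object per line.
        with path.open("a", encoding="utf-8") as fh:
            fh.write(json.dumps(asdict(self)) + "\n")

    @classmethod
    def load_jsonl(cls, path: Path) -> List["Query"]:
        traces = []
        with path.open(encoding="utf-8") as fh:
            for line in fh:
                if not line.strip():
                    continue  # tolerate blank lines
                record = json.loads(line)
                # Rebuild nested turn objects from their plain-dict form.
                record["turns"] = [Turn(**t) for t in record.get("turns", [])]
                traces.append(cls(**record))
        return traces


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "traces.jsonl"
    Query("q1", "coding", [Turn(0, 42)]).save_jsonl(path)
    Query("q2", "search").save_jsonl(path)
    loaded = Query.load_jsonl(path)
```

Because each trace occupies its own line, a long eval run can write results incrementally, and a partially written file still yields every trace saved before an interruption.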
to_hf_dataset
staticmethod
to_hf_dataset(traces: List[QueryTrace]) -> Any
Convert a list of QueryTrace objects to a HuggingFace Dataset.
Returns: A datasets.Dataset with one row per trace.
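One way to get one row per trace is to pivot the traces into a column dictionary and hand it to `datasets.Dataset.from_dict`. The sketch below assumes flat, scalar fields only; `Query`, `traces_to_columns`, and `to_hf_dataset_sketch` are hypothetical names, and how the real method flattens nested turns is not specified here.

```python
from dataclasses import asdict, dataclass, fields
from typing import Any, Dict, List


@dataclass
class Query:
    """Hypothetical stand-in with flat fields only."""

    query_id: str
    workload_type: str
    completed: bool = False
    total_wall_clock_s: float = 0.0


def traces_to_columns(traces: List[Query]) -> Dict[str, List[Any]]:
    """Pivot a list of traces into {column name: list of values}."""
    columns: Dict[str, List[Any]] = {f.name: [] for f in fields(Query)}
    for trace in traces:
        for name, value in asdict(trace).items():
            columns[name].append(value)
    return columns


def to_hf_dataset_sketch(traces: List[Query]) -> Any:
    """Build a Dataset with one row per trace.

    The import is deferred so the third-party `datasets` package is only
    required when a Dataset is actually requested.
    """
    from datasets import Dataset

    return Dataset.from_dict(traces_to_columns(traces))


columns = traces_to_columns(
    [Query("q1", "coding", True, 3.5), Query("q2", "search")]
)
```

Deferring the import would also explain a return annotation of `Any`: the module can then be imported without `datasets` installed.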