batch

Batch-level energy accounting: group requests and compute per-token energy.

Classes

BatchMetrics `dataclass`

BatchMetrics(batch_id: str = '', total_requests: int = 0, total_tokens: int = 0, total_energy_joules: float = 0.0, energy_per_token_joules: float = 0.0, energy_per_request_joules: float = 0.0, mean_power_watts: float = 0.0, mean_throughput_tok_per_sec: float = 0.0, prefill_energy_joules: float = 0.0, decode_energy_joules: float = 0.0, per_request_energy: List[float] = list())

Aggregated metrics for a batch of inference requests.
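
The field names suggest that the per-token and per-request values are ratios of the batch totals. The sketch below illustrates that reading with a hypothetical stand-in, `MetricsSketch`, and a hypothetical helper, `fill_ratios`; neither is part of the library, and the real `BatchMetrics` may be populated differently.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MetricsSketch:
    """Hypothetical stand-in carrying a subset of the BatchMetrics fields."""

    total_requests: int = 0
    total_tokens: int = 0
    total_energy_joules: float = 0.0
    energy_per_token_joules: float = 0.0
    energy_per_request_joules: float = 0.0
    per_request_energy: List[float] = field(default_factory=list)


def fill_ratios(m: MetricsSketch) -> MetricsSketch:
    """Assumed derivation: divide total energy by each count, skipping zero counts."""
    if m.total_tokens > 0:
        m.energy_per_token_joules = m.total_energy_joules / m.total_tokens
    if m.total_requests > 0:
        m.energy_per_request_joules = m.total_energy_joules / m.total_requests
    return m


m = fill_ratios(
    MetricsSketch(total_requests=4, total_tokens=800, total_energy_joules=120.0)
)
print(m.energy_per_token_joules)    # 0.15
print(m.energy_per_request_joules)  # 30.0
```

Guarding the zero-count case keeps an empty batch at its default of 0.0 instead of raising `ZeroDivisionError`.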

EnergyBatch

EnergyBatch(energy_monitor: Optional[Any] = None, batch_id: Optional[str] = None)

Group inference requests into a batch and compute per-token energy.

Works with or without an EnergyMonitor. When no monitor is provided, request counts are still tracked but energy values stay at zero.

Source code in src/openjarvis/telemetry/batch.py

def __init__(
    self,
    energy_monitor: Optional[Any] = None,
    batch_id: Optional[str] = None,
) -> None:
    self._monitor = energy_monitor
    self.batch_id = batch_id or str(uuid.uuid4())
    self.metrics: Optional[BatchMetrics] = None

Functions

sample

sample() -> Generator[_BatchContext, None, None]

Wrap an energy monitor sample and provide a context for recording requests.

Yields a _BatchContext whose record_request() method should be called once per inference request inside the block.
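
The calling pattern described above can be sketched with a self-contained stand-in. `SketchBatch` and `_SketchContext` are hypothetical classes that only mimic the documented shape of `EnergyBatch` and `_BatchContext`; the `tokens` parameter of `record_request()` and the tallying attributes are assumptions, since the real signature is not shown here. No monitor is passed, so the energy total stays at zero while requests are still counted.

```python
import uuid
from contextlib import contextmanager
from typing import Any, Generator, List, Optional


class _SketchContext:
    """Hypothetical stand-in for _BatchContext; collects per-request token counts."""

    def __init__(self) -> None:
        self.token_counts: List[int] = []

    def record_request(self, tokens: int) -> None:
        # The `tokens` parameter is an assumption, not the library's signature.
        self.token_counts.append(tokens)


class SketchBatch:
    """Stand-in mimicking the documented shape of EnergyBatch, not its real code."""

    def __init__(
        self,
        energy_monitor: Optional[Any] = None,
        batch_id: Optional[str] = None,
    ) -> None:
        self._monitor = energy_monitor  # stored only; this sketch never reads it
        self.batch_id = batch_id or str(uuid.uuid4())
        self.total_requests = 0
        self.total_tokens = 0
        self.total_energy_joules = 0.0

    @contextmanager
    def sample(self) -> Generator[_SketchContext, None, None]:
        ctx = _SketchContext()
        yield ctx
        # After the block exits, tally what was recorded inside it.
        # Energy is left untouched because this sketch has no monitor to read.
        self.total_requests = len(ctx.token_counts)
        self.total_tokens = sum(ctx.token_counts)


batch = SketchBatch(batch_id="demo")
with batch.sample() as ctx:
    for n_tokens in (128, 256, 64):
        ctx.record_request(tokens=n_tokens)  # once per inference request

print(batch.total_requests, batch.total_tokens, batch.total_energy_joules)
# 3 448 0.0
```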