core
Core module — registries, types, configuration, and event bus.
Classes
AgentRegistry
EngineRegistry
MemoryRegistry
ModelRegistry
ToolRegistry
Conversation
dataclass
Conversation(messages: List[Message] = list(), max_messages: Optional[int] = None)
Message
dataclass
Message(role: Role, content: str = '', name: Optional[str] = None, tool_calls: Optional[List[ToolCall]] = None, tool_call_id: Optional[str] = None, metadata: Dict[str, Any] = dict())
A single chat message (OpenAI-compatible structure).
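As a rough illustration of how a Message maps onto an OpenAI-style payload, the sketch below uses local stand-ins for Role and Message that mirror the signature above, since the package's import path is not shown on this page. The Role member names and values, the loosely typed tool_calls field, and the to_openai_dict helper are assumptions made for the example, not part of the documented API.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Dict, List, Optional


class Role(str, Enum):
    # Assumed members, taken from the OpenAI chat format.
    SYSTEM = "system"
    USER = "user"
    ASSISTANT = "assistant"
    TOOL = "tool"


@dataclass
class Message:
    # Stand-in mirroring the documented signature. tool_calls is typed
    # loosely because ToolCall's fields are not listed on this page.
    role: Role
    content: str = ""
    name: Optional[str] = None
    tool_calls: Optional[List[Any]] = None
    tool_call_id: Optional[str] = None
    metadata: Dict[str, Any] = field(default_factory=dict)


def to_openai_dict(msg: Message) -> Dict[str, Any]:
    """Hypothetical helper: build a chat payload, omitting unset optional fields."""
    payload: Dict[str, Any] = {"role": msg.role.value, "content": msg.content}
    if msg.name is not None:
        payload["name"] = msg.name
    if msg.tool_call_id is not None:
        payload["tool_call_id"] = msg.tool_call_id
    return payload


history = [
    Message(role=Role.SYSTEM, content="You are a helpful assistant."),
    Message(role=Role.USER, content="What is 2 + 2?"),
    Message(role=Role.TOOL, content="4", tool_call_id="call_1"),
]
payloads = [to_openai_dict(m) for m in history]
```

The stand-in uses field(default_factory=dict) for metadata so that each message gets its own dictionary rather than sharing one mutable default.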
ModelSpec
dataclass
ModelSpec(model_id: str, name: str, parameter_count_b: float, context_length: int, active_parameter_count_b: Optional[float] = None, quantization: Quantization = Quantization.NONE, min_vram_gb: float = 0.0, supported_engines: Sequence[str] = (), provider: str = '', requires_api_key: bool = False, metadata: Dict[str, Any] = dict())
Metadata describing a language model.
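One way such metadata might be used is to pick models that fit a given engine and GPU. The sketch below is only illustrative: ModelSpec here is a reduced local stand-in with the fields the example reads, the helpers runnable_on and effective_params_b are hypothetical, the catalog entries are made-up placeholders, and reading active_parameter_count_b as "parameters active per token" is an assumption based on the field name.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class ModelSpec:
    # Reduced stand-in: only the documented fields this sketch reads.
    model_id: str
    name: str
    parameter_count_b: float
    context_length: int
    active_parameter_count_b: Optional[float] = None
    min_vram_gb: float = 0.0
    supported_engines: Sequence[str] = ()


def runnable_on(specs: Sequence[ModelSpec], engine: str, vram_gb: float) -> List[ModelSpec]:
    """Hypothetical helper: keep specs that list the engine and fit in VRAM."""
    return [
        s for s in specs
        if engine in s.supported_engines and s.min_vram_gb <= vram_gb
    ]


def effective_params_b(spec: ModelSpec) -> float:
    """Assumed meaning: fall back to the total count when no active count is set."""
    if spec.active_parameter_count_b is not None:
        return spec.active_parameter_count_b
    return spec.parameter_count_b


# Placeholder entries, not real models or real requirements.
catalog = [
    ModelSpec("example/small", "Small", 7.0, 8192,
              min_vram_gb=8.0, supported_engines=("engine_a", "engine_b")),
    ModelSpec("example/sparse", "Sparse", 40.0, 32768, active_parameter_count_b=10.0,
              min_vram_gb=48.0, supported_engines=("engine_a",)),
    ModelSpec("example/hosted", "Hosted", 0.0, 128000,
              supported_engines=("engine_c",)),
]

fits = runnable_on(catalog, engine="engine_a", vram_gb=24.0)
fit_ids = [s.model_id for s in fits]
```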
Quantization
Bases: str, Enum
Model quantization formats.
Role
Bases: str, Enum
Chat message roles (OpenAI-compatible).
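Because the bases are str and Enum, members behave like plain strings in comparisons and JSON encoding. The sketch below shows that standard-library behaviour on a local stand-in; the member names and values are assumed from the OpenAI chat format and may differ from the real Role.

```python
import json
from enum import Enum


class Role(str, Enum):
    # Assumed members; the actual enum may define a different set.
    SYSTEM = "system"
    USER = "user"
    ASSISTANT = "assistant"
    TOOL = "tool"


# Look a member up from the raw string found in an API payload.
parsed = Role("assistant")

# The str mixin makes members compare equal to their string values.
matches_plain_string = Role.USER == "user"

# json.dumps treats str subclasses as strings, so no custom encoder is needed.
encoded = json.dumps({"role": Role.TOOL, "content": "4"})
```

The same pattern applies to any enum declared with these bases, so values can pass through JSON payloads and configuration files without conversion code.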
TelemetryRecord
dataclass
TelemetryRecord(
    timestamp: float,
    model_id: str,
    prompt_tokens: int = 0,
    completion_tokens: int = 0,
    total_tokens: int = 0,
    latency_seconds: float = 0.0,
    ttft: float = 0.0,
    cost_usd: float = 0.0,
    energy_joules: float = 0.0,
    power_watts: float = 0.0,
    gpu_utilization_pct: float = 0.0,
    gpu_memory_used_gb: float = 0.0,
    gpu_temperature_c: float = 0.0,
    throughput_tok_per_sec: float = 0.0,
    energy_per_output_token_joules: float = 0.0,
    throughput_per_watt: float = 0.0,
    prefill_latency_seconds: float = 0.0,
    decode_latency_seconds: float = 0.0,
    prefill_energy_joules: float = 0.0,
    decode_energy_joules: float = 0.0,
    mean_itl_ms: float = 0.0,
    median_itl_ms: float = 0.0,
    p90_itl_ms: float = 0.0,
    p95_itl_ms: float = 0.0,
    p99_itl_ms: float = 0.0,
    std_itl_ms: float = 0.0,
    is_streaming: bool = False,
    engine: str = '',
    agent: str = '',
    energy_method: str = '',
    energy_vendor: str = '',
    batch_id: str = '',
    is_warmup: bool = False,
    cpu_energy_joules: float = 0.0,
    gpu_energy_joules: float = 0.0,
    dram_energy_joules: float = 0.0,
    tokens_per_joule: float = 0.0,
    metadata: Dict[str, Any] = dict(),
)
Single telemetry observation recorded after an inference call.
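Several fields look like ratios of other fields. The sketch below shows one plausible way to fill them in, using a trimmed local stand-in for TelemetryRecord. The formulas are assumptions inferred from the field names (for instance, that throughput and tokens_per_joule count completion tokens only), and with_derived_metrics is a hypothetical helper; the library may compute these values differently.

```python
import time
from dataclasses import dataclass


@dataclass
class TelemetryRecord:
    # Trimmed stand-in: a subset of the documented fields.
    timestamp: float
    model_id: str
    prompt_tokens: int = 0
    completion_tokens: int = 0
    total_tokens: int = 0
    latency_seconds: float = 0.0
    energy_joules: float = 0.0
    power_watts: float = 0.0
    throughput_tok_per_sec: float = 0.0
    energy_per_output_token_joules: float = 0.0
    throughput_per_watt: float = 0.0
    tokens_per_joule: float = 0.0


def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def with_derived_metrics(rec: TelemetryRecord) -> TelemetryRecord:
    """Hypothetical helper: fill ratio fields from the raw measurements (assumed formulas)."""
    rec.total_tokens = rec.prompt_tokens + rec.completion_tokens
    rec.throughput_tok_per_sec = _ratio(rec.completion_tokens, rec.latency_seconds)
    rec.energy_per_output_token_joules = _ratio(rec.energy_joules, rec.completion_tokens)
    rec.throughput_per_watt = _ratio(rec.throughput_tok_per_sec, rec.power_watts)
    rec.tokens_per_joule = _ratio(rec.completion_tokens, rec.energy_joules)
    return rec


record = with_derived_metrics(
    TelemetryRecord(
        timestamp=time.time(),
        model_id="example/model",  # placeholder identifier
        prompt_tokens=100,
        completion_tokens=200,
        latency_seconds=4.0,
        energy_joules=800.0,
        power_watts=200.0,
    )
)
```

With these illustrative inputs the record ends up with 300 total tokens, 50 tokens per second, 4 joules per output token, and 0.25 tokens per joule. Guarding each division keeps records with missing energy or latency readings at their 0.0 defaults instead of raising.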
ToolCall
dataclass
A single tool invocation request embedded in an assistant message.
ToolResult
dataclass
ToolResult(tool_name: str, content: str, success: bool = True, usage: Dict[str, Any] = dict(), cost_usd: float = 0.0, latency_seconds: float = 0.0, metadata: Dict[str, Any] = dict())
Result returned by a tool invocation.
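A minimal sketch of how a caller might populate a ToolResult, assuming a local stand-in that mirrors the signature above. The run_tool wrapper is hypothetical and not part of the documented API; it times a callable and reports a raised exception as success=False rather than letting it propagate.

```python
import time
from dataclasses import dataclass, field
from typing import Any, Callable, Dict


@dataclass
class ToolResult:
    # Stand-in mirroring the documented signature.
    tool_name: str
    content: str
    success: bool = True
    usage: Dict[str, Any] = field(default_factory=dict)
    cost_usd: float = 0.0
    latency_seconds: float = 0.0
    metadata: Dict[str, Any] = field(default_factory=dict)


def run_tool(tool_name: str, fn: Callable[..., Any], **kwargs: Any) -> ToolResult:
    """Hypothetical wrapper: call fn, record wall-clock latency, capture failures."""
    start = time.perf_counter()
    try:
        content, success = str(fn(**kwargs)), True
    except Exception as exc:
        # Report the failure in the result instead of raising.
        content, success = f"{type(exc).__name__}: {exc}", False
    return ToolResult(
        tool_name=tool_name,
        content=content,
        success=success,
        latency_seconds=time.perf_counter() - start,
    )


ok = run_tool("divide", lambda a, b: a / b, a=6, b=3)
failed = run_tool("divide", lambda a, b: a / b, a=1, b=0)
```

Returning failures as data lets an agent loop pass the error text back to the model as an ordinary tool message.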