vllm_metrics
vLLM Prometheus metrics scraper: fetches and parses the /metrics endpoint.
Classes
VLLMMetrics (dataclass)
VLLMMetrics(ttft_p50: float = 0.0, ttft_p95: float = 0.0, ttft_p99: float = 0.0, gpu_cache_usage_pct: float = 0.0, e2e_latency_p50: float = 0.0, e2e_latency_p95: float = 0.0, queue_depth: float = 0.0)
Parsed vLLM performance metrics.
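As a rough illustration of how the defaults behave, the sketch below rebuilds the dataclass locally from the signature shown above rather than importing the real class, so it runs on its own. The `is_zeroed` helper is hypothetical and not part of the module; it is one possible way to spot an all-zero result.

```python
from dataclasses import asdict, dataclass


@dataclass
class VLLMMetrics:
    # Local stand-in that mirrors the documented signature; the real class
    # lives in src/openjarvis/telemetry/vllm_metrics.py.
    ttft_p50: float = 0.0
    ttft_p95: float = 0.0
    ttft_p99: float = 0.0
    gpu_cache_usage_pct: float = 0.0
    e2e_latency_p50: float = 0.0
    e2e_latency_p95: float = 0.0
    queue_depth: float = 0.0


def is_zeroed(metrics: VLLMMetrics) -> bool:
    """Hypothetical helper: True when every field still holds its 0.0 default."""
    return all(value == 0.0 for value in asdict(metrics).values())


empty = VLLMMetrics()                    # all fields default to 0.0
partial = VLLMMetrics(queue_depth=2.0)   # override a single field by keyword
```

Because every field defaults to 0.0, an all-zero instance may mean either that no data was available or that the server was genuinely idle; callers that care about the difference would need a separate signal.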
VLLMMetricsScraper
Scrapes vLLM's Prometheus /metrics endpoint.
Source code in src/openjarvis/telemetry/vllm_metrics.py
Functions
scrape
scrape() -> VLLMMetrics
Fetch and parse vLLM metrics. Returns zeroed metrics on error.
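The fetch, parse, and fall-back-to-zero behaviour described above might look roughly like the following. This is a minimal sketch using only the standard library, not the module's actual implementation: `GaugeSnapshot`, `parse_gauges`, and `scrape_gauges` are hypothetical names, only the two gauge-style fields are covered, and the metric names `vllm:gpu_cache_usage_perc` and `vllm:num_requests_waiting` are assumptions that can differ between vLLM versions.

```python
import urllib.request
from dataclasses import dataclass


@dataclass
class GaugeSnapshot:
    # Hypothetical, reduced stand-in for VLLMMetrics (gauge fields only).
    gpu_cache_usage_pct: float = 0.0
    queue_depth: float = 0.0


def parse_gauges(text: str) -> dict[str, float]:
    """Map metric name -> value from Prometheus text exposition output.

    Assumes samples carry no trailing timestamp, so the value is the last
    space-separated token. If a name appears more than once (for example
    with different labels), the last sample wins.
    """
    values: dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blank and HELP/TYPE lines
            continue
        name_and_labels, _, raw_value = line.rpartition(" ")
        name = name_and_labels.split("{", 1)[0]  # drop the {label="..."} part
        try:
            values[name] = float(raw_value)
        except ValueError:
            continue  # ignore lines that do not end in a number
    return values


def scrape_gauges(url: str, timeout: float = 2.0) -> GaugeSnapshot:
    """Fetch and parse the endpoint; return zeroed values on any error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            text = resp.read().decode("utf-8")
    except Exception:
        # Deliberately broad: any fetch or decode failure yields zeros.
        return GaugeSnapshot()
    gauges = parse_gauges(text)
    return GaugeSnapshot(
        # Assumption: the gauge is a 0..1 fraction, scaled here to percent.
        gpu_cache_usage_pct=gauges.get("vllm:gpu_cache_usage_perc", 0.0) * 100.0,
        queue_depth=gauges.get("vllm:num_requests_waiting", 0.0),
    )
```

The percentile fields (ttft_p50 and the like) are left out on purpose: deriving them would require interpolating over histogram buckets, which this sketch does not attempt.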