vllm_metrics ¶

vLLM Prometheus metrics scraper: fetches and parses the /metrics endpoint of a vLLM server.

Classes¶

VLLMMetrics `dataclass` ¶

VLLMMetrics(ttft_p50: float = 0.0, ttft_p95: float = 0.0, ttft_p99: float = 0.0, gpu_cache_usage_pct: float = 0.0, e2e_latency_p50: float = 0.0, e2e_latency_p95: float = 0.0, queue_depth: float = 0.0)

Parsed vLLM performance metrics.
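To illustrate the shape of this dataclass, here is a minimal stand-in that mirrors the signature above. It is a sketch for experimentation, not the library's source; the field names and defaults are copied from the signature, while the sample values are invented and carry no particular unit.

```python
from dataclasses import asdict, dataclass


@dataclass
class VLLMMetrics:
    """Stand-in mirroring the documented signature (not the real class)."""

    ttft_p50: float = 0.0
    ttft_p95: float = 0.0
    ttft_p99: float = 0.0
    gpu_cache_usage_pct: float = 0.0
    e2e_latency_p50: float = 0.0
    e2e_latency_p95: float = 0.0
    queue_depth: float = 0.0


# With no arguments, every field takes its 0.0 default.
zeroed = VLLMMetrics()

# Fields can be set selectively; the rest stay at 0.0.
# These numbers are made up for illustration.
partial = VLLMMetrics(ttft_p50=0.12, queue_depth=3.0)

# asdict() gives a plain dict, convenient for logging or JSON export.
as_dict = asdict(partial)
```

Because every field has a default, an all-zero instance can double as a sentinel for "no data available".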

VLLMMetricsScraper ¶

VLLMMetricsScraper(host: str = 'http://localhost:8000')

Scrapes vLLM's Prometheus /metrics endpoint.

Source code in src/openjarvis/telemetry/vllm_metrics.py

def __init__(self, host: str = "http://localhost:8000") -> None:
    self._host = host.rstrip("/")
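The constructor strips any trailing slash from `host`, presumably so that a path can be appended without producing a double slash. The sketch below reproduces that normalization in a stand-in class; `ScraperSketch` and its `metrics_url` property are hypothetical names introduced here for illustration and are not part of the documented API.

```python
class ScraperSketch:
    """Stand-in that copies only the documented __init__ behavior."""

    def __init__(self, host: str = "http://localhost:8000") -> None:
        # Same normalization as the source shown above.
        self._host = host.rstrip("/")

    @property
    def metrics_url(self) -> str:
        # Hypothetical helper: one plausible way to build the endpoint URL.
        return f"{self._host}/metrics"


# Both spellings of the host resolve to the same URL.
with_slash = ScraperSketch("http://gpu-box:8000/").metrics_url
without_slash = ScraperSketch("http://gpu-box:8000").metrics_url
```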

Functions¶

scrape ¶

scrape() -> VLLMMetrics

Fetch and parse vLLM metrics. Returns zeroed metrics on error.
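The "zeroed metrics on error" contract can be sketched as follows. This is an assumption-laden illustration, not the actual implementation: `MetricsSketch`, `parse_gauges`, `scrape_sketch`, and the injectable `fetch` parameter are hypothetical, only two gauge fields are handled, and the Prometheus metric names used (`vllm:gpu_cache_usage_perc`, `vllm:num_requests_waiting`) are assumed and may differ between vLLM versions.

```python
from dataclasses import dataclass
from typing import Callable
from urllib.request import urlopen


@dataclass
class MetricsSketch:
    """Reduced stand-in with two of the documented fields."""

    gpu_cache_usage_pct: float = 0.0
    queue_depth: float = 0.0


def _http_get(url: str) -> str:
    # Plain stdlib HTTP GET with a short timeout.
    with urlopen(url, timeout=5) as resp:
        return resp.read().decode("utf-8")


def parse_gauges(text: str) -> dict[str, float]:
    """Map metric name -> last sample value in Prometheus text format.

    Assumes samples carry no trailing timestamp and ignores labels.
    """
    values: dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip HELP/TYPE comments
            continue
        name_and_labels, _, raw_value = line.rpartition(" ")
        name = name_and_labels.split("{", 1)[0]
        if not name:
            continue
        try:
            values[name] = float(raw_value)
        except ValueError:
            continue  # malformed sample: skip rather than fail
    return values


def scrape_sketch(
    host: str = "http://localhost:8000",
    fetch: Callable[[str], str] = _http_get,
) -> MetricsSketch:
    url = host.rstrip("/") + "/metrics"
    try:
        gauges = parse_gauges(fetch(url))
    except Exception:
        # Any fetch or parse failure yields an all-zero result.
        return MetricsSketch()
    return MetricsSketch(
        # Stored as reported; whether the real scraper rescales is unknown.
        gpu_cache_usage_pct=gauges.get("vllm:gpu_cache_usage_perc", 0.0),
        queue_depth=gauges.get("vllm:num_requests_waiting", 0.0),
    )
```

A caller that needs to distinguish "server idle" from "server unreachable" cannot do so from a zeroed result alone, so it may be worth checking connectivity separately before trusting all-zero values.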