ipw_mixed

IPW mixed scorer -- LLM-as-judge for mixed-source evaluation datasets.

Because IPW records can originate from different source datasets, this scorer does not rely on dataset-specific matching rules. Instead, it uses a general semantic comparison performed by an LLM judge (similar to the FRAMES scorer).
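To make the idea of a general semantic comparison more concrete, here is a minimal sketch of how a judge prompt might be built and how its verdict might be parsed. The function names `build_judge_prompt` and `parse_verdict`, the prompt wording, and the one-word CORRECT/INCORRECT verdict format are all assumptions made for illustration; none of them are taken from the OpenJarvis source.

```python
def build_judge_prompt(question: str, reference: str, candidate: str) -> str:
    """Hypothetical prompt asking a judge model for a one-word verdict."""
    return (
        "You are grading an answer for semantic equivalence.\n"
        f"Question: {question}\n"
        f"Reference answer: {reference}\n"
        f"Candidate answer: {candidate}\n"
        "Reply with exactly one word: CORRECT or INCORRECT."
    )


def parse_verdict(judge_reply: str) -> bool:
    """Return True only if the first word of the reply is CORRECT (assumed format)."""
    words = judge_reply.strip().upper().split()
    # Strip trailing punctuation so replies like "Correct." are still accepted.
    return bool(words) and words[0].strip(".,:;!") == "CORRECT"
```

Because the comparison is delegated to the judge, the same prompt shape can be reused regardless of which source dataset a record came from. Anything other than a clear CORRECT is treated as a failure in this sketch, which is a conservative choice rather than a documented behavior.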

Classes

IPWMixedScorer

IPWMixedScorer(judge_backend: InferenceBackend, judge_model: str)

Bases: LLMJudgeScorer

LLM-as-judge evaluation for mixed-source IPW datasets.

Source code in src/openjarvis/evals/core/scorer.py
def __init__(self, judge_backend: InferenceBackend, judge_model: str) -> None:
    self._judge_backend = judge_backend
    self._judge_model = judge_model
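The constructor above takes the judge backend and the judge model name as injected dependencies. The following sketch shows one way that pattern could be exercised with a stub backend. `InferenceBackendLike`, `EchoBackend`, `SketchJudgeScorer`, the `generate` method, the `ask_judge` helper, and the model name are all hypothetical stand-ins; the real `InferenceBackend` and `LLMJudgeScorer` interfaces may differ.

```python
from typing import Protocol


class InferenceBackendLike(Protocol):
    """Hypothetical stand-in for InferenceBackend; the real interface may differ."""

    def generate(self, prompt: str, model: str) -> str: ...


class EchoBackend:
    """Stub backend that always returns a fixed reply (for illustration only)."""

    def __init__(self, reply: str) -> None:
        self.reply = reply
        self.calls = []  # records (prompt, model) pairs for inspection

    def generate(self, prompt: str, model: str) -> str:
        self.calls.append((prompt, model))
        return self.reply


class SketchJudgeScorer:
    """Mirrors the constructor shown above; `ask_judge` is an assumed helper."""

    def __init__(self, judge_backend: InferenceBackendLike, judge_model: str) -> None:
        self._judge_backend = judge_backend
        self._judge_model = judge_model

    def ask_judge(self, prompt: str) -> str:
        # Forward the prompt to the injected backend using the configured model.
        return self._judge_backend.generate(prompt, self._judge_model)


backend = EchoBackend("CORRECT")
scorer = SketchJudgeScorer(judge_backend=backend, judge_model="hypothetical-judge-model")
reply = scorer.ask_judge("Is 'Paris' equivalent to 'paris, France'?")
```

Passing the backend in rather than constructing it inside the scorer makes it straightforward to swap in a stub like this during tests, without calling a real model.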