supergpqa_mcq

SuperGPQA multiple-choice (MCQ) scorer: an LLM extracts the answer letter from the response, which is then compared to the reference letter by exact match.

Adapted from IPW's mcq.py and gpqa.py evaluation handlers.

Classes

SuperGPQAScorer

SuperGPQAScorer(judge_backend: InferenceBackend, judge_model: str)

Bases: LLMJudgeScorer

Scores SuperGPQA responses by using an LLM judge to extract the answer letter.
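The extract-then-match flow described above can be sketched roughly as follows. This is an illustrative sketch, not the scorer's actual implementation: `extract_letter`, `score`, `fake_judge`, the prompt wording, and the A-J option range are all assumptions made for the example, and the judge is modeled as a plain callable rather than a real `InferenceBackend`.

```python
import re
from typing import Callable, Optional

# Assumption: options are labeled A through J (hypothetical range for this sketch).
LETTER_PATTERN = re.compile(r"\(?([A-J])\)?\.?")


def extract_letter(judge: Callable[[str], str], response: str) -> Optional[str]:
    """Ask a judge callable for the chosen option letter.

    Returns None when the judge's reply is not a single option letter.
    """
    # Hypothetical prompt; the real scorer's prompt is not shown in this page.
    prompt = (
        "Extract the final answer letter (A-J) from the response below. "
        "Reply with the letter only.\n\n" + response
    )
    reply = judge(prompt).strip().upper()
    # Accept only a bare letter, optionally wrapped as "(C)" or followed by ".".
    match = LETTER_PATTERN.fullmatch(reply)
    return match.group(1) if match else None


def score(judge: Callable[[str], str], response: str, reference: str) -> bool:
    """Exact match between the extracted letter and the reference letter."""
    letter = extract_letter(judge, response)
    return letter is not None and letter == reference.strip().upper()


def fake_judge(prompt: str) -> str:
    """Stand-in for an LLM call; always answers with option C."""
    return "c"
```

Requiring the judge's reply to be a single letter, rather than searching for any letter inside it, avoids misreading words such as "I" as an answer; a reply that cannot be parsed counts as incorrect.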

Source code in src/openjarvis/evals/core/scorer.py
def __init__(self, judge_backend: InferenceBackend, judge_model: str) -> None:
    self._judge_backend = judge_backend
    self._judge_model = judge_model