simpleqa_judge
SimpleQA scorer -- normalized exact match with LLM fallback.
Evaluates short factual answers using exact string matching (with normalization) and falls back to an LLM judge for semantic comparison.
Classes
SimpleQAScorer
SimpleQAScorer(judge_backend: InferenceBackend, judge_model: str)
Bases: LLMJudgeScorer
SimpleQA evaluation: exact match with normalization + LLM fallback.
Source code in src/openjarvis/evals/core/scorer.py
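The two-stage flow described above (cheap string comparison first, LLM judge only on a miss) can be sketched as follows. This is an illustration under stated assumptions, not the scorer's actual implementation: `score_with_fallback`, `fake_judge`, and the `judge` callable are hypothetical names, and the stand-in judge replaces whatever call `SimpleQAScorer` makes through its `judge_backend`.

```python
from typing import Callable, List, Tuple


def score_with_fallback(
    predicted: str,
    reference: str,
    judge: Callable[[str, str], bool],
) -> bool:
    """Hypothetical flow: string comparison first, judge only when it misses."""
    # Stand-in normalization (assumption): the real scorer's rules are not
    # shown on this page.
    if predicted.strip().casefold() == reference.strip().casefold():
        return True
    # Fallback: defer to the judge for semantic equivalence.
    return judge(predicted, reference)


# Records every call so it is visible when the fallback path is taken.
calls: List[Tuple[str, str]] = []


def fake_judge(predicted: str, reference: str) -> bool:
    """Stub standing in for an LLM judge (assumption, no model is called)."""
    calls.append((predicted, reference))
    return reference.casefold() in predicted.casefold()
```

With this stub, `score_with_fallback(" Paris ", "paris", fake_judge)` succeeds on the string comparison without invoking the judge, while `"The capital is Paris"` against `"Paris"` reaches the judge. Ordering the checks this way keeps judge calls, which are comparatively slow and costly, to the cases that string matching cannot settle.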
Functions
exact_match
Exact-match scorer with normalization for numbers and strings.
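A minimal sketch of what normalized exact matching for numbers and strings might look like. The specific rules here (lowercasing, punctuation removal, whitespace collapsing, thousands-separator handling) are assumptions chosen for illustration, and `normalize_answer` and `exact_match_sketch` are hypothetical names; the signature and behavior of the real `exact_match` are not documented on this page.

```python
import re
import string


def normalize_answer(text: str) -> str:
    """Hypothetical normalization for short factual answers."""
    text = text.strip().lower()
    # Drop thousands separators so "1,000" compares like "1000".
    text = re.sub(r"(?<=\d),(?=\d{3}\b)", "", text)
    # Plain decimal numbers get a canonical form so "42" equals "42.0".
    if re.fullmatch(r"[-+]?\d+(?:\.\d+)?", text):
        return repr(float(text))
    # Otherwise treat it as a string: remove punctuation, collapse whitespace.
    text = text.translate(str.maketrans("", "", string.punctuation))
    return " ".join(text.split())


def exact_match_sketch(predicted: str, reference: str) -> bool:
    """Compare two answers after applying the same normalization to both."""
    return normalize_answer(predicted) == normalize_answer(reference)
```

Under these assumed rules, `exact_match_sketch("1,000", "1000.0")` and `exact_match_sketch("Paris.", "paris")` both match, while `exact_match_sketch("Lyon", "Paris")` does not.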