liveresearchbench

LiveResearchBench (Salesforce) scorer — checklist-based evaluation.

Evaluates research reports using per-question checklists for coverage, plus LLM-as-judge for presentation quality and citation adequacy.

Reference: https://github.com/SalesforceAIResearch/LiveResearchBench
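As a rough illustration of the checklist-based coverage idea described above, the sketch below asks a judge one yes/no question per checklist item and reports the fraction satisfied. It is a minimal sketch under stated assumptions, not this scorer's actual implementation: the names `Judge`, `ChecklistResult`, and `score_checklist`, the prompt wording, and the YES/NO reply convention are all hypothetical, and the presentation and citation judgments are left out.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

# Hypothetical judge interface: takes a prompt, returns the judge's raw text reply.
# A real setup would route this through an inference backend and judge model.
Judge = Callable[[str], str]


@dataclass
class ChecklistResult:
    """Counts of checklist items the judge marked as satisfied."""

    satisfied: int
    total: int

    @property
    def coverage(self) -> float:
        # Fraction of items satisfied; an empty checklist scores 0.0 by convention here.
        return self.satisfied / self.total if self.total else 0.0


def score_checklist(report: str, checklist: Sequence[str], judge: Judge) -> ChecklistResult:
    """Ask the judge about each checklist item and tally the YES answers."""
    satisfied = 0
    for item in checklist:
        # Assumed prompt format; the item sits on its own line for easy inspection.
        prompt = (
            "Does the report satisfy this checklist item? Answer YES or NO.\n"
            f"Item: {item}\n\nReport:\n{report}"
        )
        reply = judge(prompt).strip().upper()
        if reply.startswith("YES"):
            satisfied += 1
    return ChecklistResult(satisfied=satisfied, total=len(checklist))
```

Passing the judge in as a plain callable keeps the sketch testable with a stub function; any real judge call would replace that callable.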

Classes

LiveResearchBenchSFScorer

LiveResearchBenchSFScorer(judge_backend: InferenceBackend, judge_model: str)

Bases: LLMJudgeScorer

Checklist + quality scorer for Salesforce LiveResearchBench.

Source code in src/openjarvis/evals/core/scorer.py
def __init__(self, judge_backend: InferenceBackend, judge_model: str) -> None:
    self._judge_backend = judge_backend
    self._judge_model = judge_model