liveresearchbench
LiveResearchBench (Salesforce) dataset provider.
80 expert-curated deep research tasks with per-question evaluation checklists across three domains: daily life, enterprise, and academia. 543 checklist items total (grouped by question).
Reference: https://github.com/SalesforceAIResearch/LiveResearchBench
HuggingFace: Salesforce/LiveResearchBench (gated; accept the terms first)
Classes
LiveResearchBenchSFDataset
Bases: DatasetProvider
Salesforce LiveResearchBench — 80 expert-curated research tasks.
The HuggingFace dataset has 543 rows (multiple checklist items per
question). We group by qid to produce one EvalRecord per unique
question, with all checklist items aggregated in metadata.
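The grouping step described above can be sketched roughly as follows. This is a minimal illustration, not the provider's actual implementation: only `qid` and `EvalRecord` are named in the source. The `EvalRecord` fields shown here, the row keys `question` and `checklist_item`, and the helper `group_rows_by_qid` are all hypothetical stand-ins.

```python
from dataclasses import dataclass, field


@dataclass
class EvalRecord:
    """Hypothetical stand-in for the real EvalRecord; field names are assumptions."""
    record_id: str
    problem: str
    metadata: dict = field(default_factory=dict)


def group_rows_by_qid(rows):
    """Collapse checklist-level rows into one record per unique qid.

    Each row is assumed to be a dict with keys "qid", "question" and
    "checklist_item" (the last two names are assumptions).
    """
    grouped = {}  # dicts preserve insertion order, so first-seen qid order is kept
    for row in rows:
        qid = row["qid"]
        if qid not in grouped:
            grouped[qid] = EvalRecord(
                record_id=qid,
                problem=row["question"],
                metadata={"checklist": []},
            )
        # Aggregate every checklist item for this question in metadata.
        grouped[qid].metadata["checklist"].append(row["checklist_item"])
    return list(grouped.values())


# Toy rows standing in for the dataset's checklist-level rows.
sample_rows = [
    {"qid": "q1", "question": "Compare plans A and B.", "checklist_item": "Covers pricing"},
    {"qid": "q1", "question": "Compare plans A and B.", "checklist_item": "Cites sources"},
    {"qid": "q2", "question": "Summarize recent work on X.", "checklist_item": "Lists key papers"},
]
records = group_rows_by_qid(sample_rows)
```

Applied to the full dataset, the same pattern would reduce the 543 rows to 80 records, each carrying its own checklist.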