toolcall15
ToolCall-15 dataset provider: a lightweight tool-calling benchmark.
Provides 15 scenarios across 5 categories (3 per category) that test whether a model can call the right tool with the right arguments.
Reference: https://github.com/stevibe/ToolCall-15
Classes
ToolCall15Dataset
Bases: DatasetProvider
ToolCall-15 tool-calling benchmark.
Provides 15 scenarios across 5 categories that test whether a model can call the right tool with the right arguments. All tool outputs are pre-defined (mocked) per the benchmark specification.
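To make "the right tool with the right arguments" and "mocked outputs" concrete, here is a minimal sketch of how one scenario could be represented and checked. It is illustrative only: `Scenario`, `ToolCall`, `is_correct`, and `run_mocked_tool` are hypothetical names, not part of `ToolCall15Dataset` or of the ToolCall-15 specification, and the weather scenario is invented for the example. Exact-match comparison of arguments is also an assumption; the benchmark may score calls differently.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass(frozen=True)
class Scenario:
    """Hypothetical scenario record; not the benchmark's actual schema."""

    category: str
    prompt: str
    expected_tool: str
    expected_args: dict[str, Any]
    mocked_output: str  # pre-defined result returned instead of running a real tool


@dataclass(frozen=True)
class ToolCall:
    """A tool call emitted by the model under test (assumed shape)."""

    name: str
    args: dict[str, Any] = field(default_factory=dict)


def is_correct(scenario: Scenario, call: ToolCall) -> bool:
    """Pass only if both the tool name and the arguments match exactly."""
    return call.name == scenario.expected_tool and call.args == scenario.expected_args


def run_mocked_tool(scenario: Scenario, call: ToolCall) -> str:
    """Return the scenario's canned output; no real tool is ever executed."""
    return scenario.mocked_output


# Invented example scenario, for illustration only.
scenario = Scenario(
    category="example",
    prompt="What is the weather in Berlin?",
    expected_tool="get_weather",
    expected_args={"city": "Berlin"},
    mocked_output='{"temp_c": 18}',
)

good_call = ToolCall("get_weather", {"city": "Berlin"})
wrong_args = ToolCall("get_weather", {"city": "Paris"})
wrong_tool = ToolCall("search_web", {"city": "Berlin"})
```

Because the tool output is canned, a run is deterministic: only the model's choice of tool and arguments varies between runs, which is the property the mocked design is meant to isolate.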