coding_assistant

coding_assistant dataset — 30 buggy code projects for agent-based debugging.

Each task presents a bug report, buggy source code, and a test suite. The agent must identify and fix the bug(s) so that all tests pass.

Difficulty tiers:

- easy (10): single-line bugs (off-by-one, wrong operator, missing return)
- medium (10): multi-line logic bugs (incorrect algorithm, bad state management)
- hard (10): subtle bugs (race conditions, edge cases, incorrect rounding)

Classes

CodingAssistantDataset

CodingAssistantDataset()

Bases: DatasetProvider

30 buggy code projects for agent-based debugging evaluation.

Source code in src/openjarvis/evals/datasets/coding_assistant.py
def __init__(self) -> None:
    # The record list starts empty; nothing is loaded at construction time.
    self._records: List[EvalRecord] = []