baseline_local
BaselineLocalAgent — local-only reference for the hybrid ablation.
Mirror of :class:BaselineCloudAgent (baseline_cloud.py) but the entire
trajectory runs on the local vLLM model. No cloud teacher / router / advisor
is involved — this is the "what does the local model do by itself?" floor
in the n=100 ablation matrix.
On GAIA the agent makes one local call with the formatted prompt (which
already carries the FINAL ANSWER: reminder from
_prompts.format_gaia) and returns the text. On SWE-bench-Verified
the agent delegates to :func:run_swe_agent_loop with backbone="local"
so the model gets to run bash and read the repo — same wiring as the
mini-swe-agent cells but driven by the local model.
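The two branches described above can be sketched as a small dispatcher. This is only an illustration of the routing, not the actual implementation: `run_baseline_local`, `fake_local_call`, and `fake_swe_loop` are hypothetical stand-ins, and the only detail taken from the text is that the SWE branch passes `backbone="local"` to the agent loop.

```python
from typing import Callable


def run_baseline_local(
    benchmark: str,
    prompt: str,
    local_call: Callable[[str], str],
    swe_loop: Callable[..., str],
) -> str:
    """Route one task the way the description above suggests (illustrative)."""
    if benchmark == "gaia":
        # GAIA: a single local call; the formatted prompt is assumed to
        # already carry the FINAL ANSWER: reminder.
        return local_call(prompt)
    if benchmark == "swe":
        # SWE-bench-Verified: delegate to the agent loop with the local backbone.
        return swe_loop(prompt, backbone="local")
    raise ValueError(f"unsupported benchmark: {benchmark!r}")


# Hypothetical stand-ins for the local vLLM call and run_swe_agent_loop.
def fake_local_call(prompt: str) -> str:
    return "FINAL ANSWER: 42"


def fake_swe_loop(prompt: str, *, backbone: str) -> str:
    return f"patch via {backbone}"


gaia_out = run_baseline_local("gaia", "question", fake_local_call, fake_swe_loop)
swe_out = run_baseline_local("swe", "fix the bug", fake_local_call, fake_swe_loop)
```

Neither branch takes a cloud-side argument, which matches the description of a local-only floor.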
Construction args mirror :class:LocalCloudAgent. The local block in
the cell registry determines the local model + endpoint; cloud is
accepted for schema compatibility but unused (and cost_usd is always
0 — local inference is free).
Classes
BaselineLocalAgent
BaselineLocalAgent(engine: InferenceEngine, model: str, *, local_model: Optional[str] = None, local_endpoint: Optional[str] = None, cloud_endpoint: str = 'anthropic', cfg: Optional[Dict[str, Any]] = None, bus: Optional[Any] = None, temperature: Optional[float] = None, max_tokens: Optional[int] = None)
Bases: LocalCloudAgent
Local-only baseline. cloud_* fields are ignored.
Configurable knobs via cfg:
local_max_tokens (int, default 4096): max_tokens per GAIA call and per turn of the SWE agent loop.
local_temperature (float, default 0.0): sampling temperature for the local model.
swe_use_agent_loop (bool, default True for SWE): if False the SWE branch falls back to a one-shot blind patch (not recommended; kept for parity with other agents).
swe_max_turns (int, default 30): SWE-bench loop turn cap.
swe_bash_timeout_s (int, default 120): bash timeout per turn.
swe_turn_max_tokens (int, default 4096): max_tokens per agent turn inside the SWE loop. Falls back to local_max_tokens.
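A minimal sketch of how these knobs might be resolved from cfg, using the defaults listed above. The helper name `resolve_knobs` and the `DEFAULTS` table are hypothetical, and the fallback rule is an assumption: when swe_turn_max_tokens is absent, the value of local_max_tokens is used, which gives 4096 if neither is set.

```python
from typing import Any, Dict, Optional

# Defaults copied from the knob list above. swe_use_agent_loop is documented
# as "True for SWE"; this sketch simply assumes the SWE case.
DEFAULTS: Dict[str, Any] = {
    "local_max_tokens": 4096,
    "local_temperature": 0.0,
    "swe_use_agent_loop": True,
    "swe_max_turns": 30,
    "swe_bash_timeout_s": 120,
}


def resolve_knobs(cfg: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Merge a user cfg over the documented defaults (hypothetical helper)."""
    cfg = cfg or {}
    knobs = {name: cfg.get(name, default) for name, default in DEFAULTS.items()}
    # Assumed fallback: an unset swe_turn_max_tokens inherits local_max_tokens.
    knobs["swe_turn_max_tokens"] = cfg.get(
        "swe_turn_max_tokens", knobs["local_max_tokens"]
    )
    return knobs


default_knobs = resolve_knobs()
small_budget = resolve_knobs({"local_max_tokens": 2048})
explicit_turn = resolve_knobs({"local_max_tokens": 2048, "swe_turn_max_tokens": 1024})
```

Under that assumption, lowering local_max_tokens alone also lowers the per-turn budget inside the SWE loop, while an explicit swe_turn_max_tokens overrides it.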