archon

ArchonAgent — port of ScalingIntelligence/Archon.

Inference-time architecture search: layered (generator → ranker → fuser) sampling where a generator proposes K candidates, a ranker scores them, and a fuser synthesizes a final answer. Paper: arXiv:2409.15254.
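The layered flow described above can be sketched with plain callables. This is a minimal illustration, not Archon's implementation: `rank_and_fuse`, its three callable parameters, and the `top_n` cutoff are hypothetical names introduced here, and the toy generator, scorer, and fuser stand in for real model calls.

```python
import itertools
from typing import Callable, List


def rank_and_fuse(
    prompt: str,
    generate: Callable[[str], str],
    score: Callable[[str, str], float],
    fuse: Callable[[str, List[str]], str],
    k: int = 5,
    top_n: int = 3,
) -> str:
    """Generator -> ranker -> fuser over k sampled candidates (illustrative only)."""
    # Generator layer: draw k independent candidates for the same prompt.
    candidates = [generate(prompt) for _ in range(k)]
    # Ranker layer: order candidates from best to worst score.
    ranked = sorted(candidates, key=lambda c: score(prompt, c), reverse=True)
    # Fuser layer: synthesize one answer from the top_n candidates (assumed cutoff).
    return fuse(prompt, ranked[:top_n])


# Toy stand-ins so the sketch runs without any model backend.
_counter = itertools.count(1)


def toy_generate(prompt: str) -> str:
    return f"{prompt}-{next(_counter)}"


def toy_score(prompt: str, candidate: str) -> float:
    # Score by the numeric suffix so the ordering is deterministic.
    return float(candidate.rsplit("-", 1)[1])


def toy_fuse(prompt: str, best: List[str]) -> str:
    return " | ".join(best)


result = rank_and_fuse("q", toy_generate, toy_score, toy_fuse, k=4, top_n=2)
print(result)
```

In the real agent each callable would be backed by a model: the local vLLM endpoint for `generate`, and the cloud model for `score` and `fuse`.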

How the hybrid harness wires it (and what we mirror here):

  • Local proposers (generator layer): K samples from vLLM via an OpenAI-compatible client at local_endpoint. Injected as a custom vllm_local model_type into Archon's GENERATE_MAP — that's the only way Ranker/Fuser can pick up custom backends (they re-instantiate Generator without custom_generators).
  • Cloud ranker + fuser: Archon's built-in OpenAI_API / Anthropic_API. Patched at import time to strip temperature for Opus 4.7+ and to tally token usage (Archon ignores usage by default).
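The cloud-side patch described above (dropping `temperature` for some models and tallying token usage) can be illustrated with a generic wrapper. This is a sketch under stated assumptions: `UsageTally`, `patch_call`, the `strip_temperature` predicate, and the dict-shaped response with a `usage` key are all hypothetical, and none of them is Archon's or the harness's actual API.

```python
from typing import Any, Callable, Dict


class UsageTally:
    """Accumulates token counts across calls (hypothetical helper)."""

    def __init__(self) -> None:
        self.prompt_tokens = 0
        self.completion_tokens = 0

    def add(self, usage: Dict[str, int]) -> None:
        self.prompt_tokens += usage.get("prompt_tokens", 0)
        self.completion_tokens += usage.get("completion_tokens", 0)


def patch_call(
    call: Callable[..., Dict[str, Any]],
    tally: UsageTally,
    strip_temperature: Callable[[str], bool],
) -> Callable[..., Dict[str, Any]]:
    """Wrap a model call so it drops `temperature` when asked and records usage."""

    def wrapped(model: str, **kwargs: Any) -> Dict[str, Any]:
        if strip_temperature(model):
            # Some models reject the parameter, so remove it before calling.
            kwargs.pop("temperature", None)
        response = call(model, **kwargs)
        # Assumed response shape: a dict with an optional "usage" mapping.
        tally.add(response.get("usage", {}))
        return response

    return wrapped


def fake_call(model: str, **kwargs: Any) -> Dict[str, Any]:
    # Stand-in backend that echoes the kwargs it received.
    return {
        "kwargs": kwargs,
        "usage": {"prompt_tokens": 10, "completion_tokens": 4},
    }


tally = UsageTally()
patched = patch_call(fake_call, tally, strip_temperature=lambda m: m.startswith("no-temp"))
first = patched("no-temp-model", temperature=0.7, max_tokens=16)
second = patched("other-model", temperature=0.7, max_tokens=16)
```

Passing the predicate in as a parameter keeps the model-specific rule out of the wrapper itself.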

cfg knobs:

  • n_samples (int, default 5): K, the number of proposers
  • architecture (str, default "ensemble_rank_fuse"):
      • "ensemble_rank_fuse" → [K local generators, 1 cloud ranker, 1 cloud fuser]
      • "single_local" → [1 local generator] (debug)
  • ranker_model / fuser_model (default: cloud_model for both)
  • max_tokens (default 2048), temperature (default 0.7)
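The defaults listed above can be made concrete with a small resolver. This is a sketch only: `resolve_cfg` and `layer_plan` are hypothetical helpers that are not part of the agent, the layer labels are illustrative strings, and the default values are taken from the list above.

```python
from typing import Any, Dict, List, Optional

# Defaults as documented in the cfg knobs list.
DEFAULTS: Dict[str, Any] = {
    "n_samples": 5,
    "architecture": "ensemble_rank_fuse",
    "max_tokens": 2048,
    "temperature": 0.7,
}

ARCHITECTURES = ("ensemble_rank_fuse", "single_local")


def resolve_cfg(cfg: Optional[Dict[str, Any]], cloud_model: str) -> Dict[str, Any]:
    """Merge user cfg over the defaults; ranker/fuser fall back to the cloud model."""
    resolved = {**DEFAULTS, **(cfg or {})}
    resolved.setdefault("ranker_model", cloud_model)
    resolved.setdefault("fuser_model", cloud_model)
    if resolved["architecture"] not in ARCHITECTURES:
        raise ValueError(f"unknown architecture: {resolved['architecture']!r}")
    return resolved


def layer_plan(resolved: Dict[str, Any]) -> List[str]:
    """Describe the layers implied by the chosen architecture."""
    if resolved["architecture"] == "single_local":
        return ["local_generator"]
    return ["local_generator"] * resolved["n_samples"] + ["cloud_ranker", "cloud_fuser"]


cfg = resolve_cfg({"n_samples": 3}, cloud_model="example-cloud-model")
plan = layer_plan(cfg)
print(plan)
```

The model name `example-cloud-model` is a placeholder; any string accepted by the configured cloud endpoint would take its place.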

Requires the Archon library (cloned at hybrid-local-cloud-compute/external/Archon). Either add its src directory to PYTHONPATH or install it in editable mode with pip. The import is lazy.
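A lazy import of an optional dependency typically looks like the sketch below. It is an illustration only: `lazy_import` and its `hint` parameter are hypothetical, and the module name is passed in as an argument because the exact import path used by the agent is not stated here. The demonstration uses the standard-library `json` module as a stand-in.

```python
import importlib
from types import ModuleType
from typing import Dict

# Cache so the import cost is paid at most once per module name.
_cached: Dict[str, ModuleType] = {}


def lazy_import(module_name: str, hint: str) -> ModuleType:
    """Import `module_name` on first use and raise a helpful error if it is missing."""
    if module_name not in _cached:
        try:
            _cached[module_name] = importlib.import_module(module_name)
        except ImportError as exc:
            raise ImportError(f"{module_name!r} is not importable. {hint}") from exc
    return _cached[module_name]


# Stand-in demonstration with a module that is always available.
json_module = lazy_import("json", hint="It ships with the standard library.")
print(json_module.dumps({"ok": True}))
```

Deferring the import this way means that merely importing the agent module does not fail when the optional library is absent; the error surfaces only when the dependency is first needed.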

Ported from hybrid-local-cloud-compute/adapters/archon_adapter.py.

Classes

ArchonAgent

ArchonAgent(engine: InferenceEngine, model: str, *, local_model: Optional[str] = None, local_endpoint: Optional[str] = None, cloud_endpoint: str = 'anthropic', cfg: Optional[Dict[str, Any]] = None, bus: Optional[Any] = None, temperature: Optional[float] = None, max_tokens: Optional[int] = None)

Bases: LocalCloudAgent

Layered (generator → ranker → fuser) inference-time search.

Source code in src/openjarvis/agents/hybrid/_base.py
def __init__(
    self,
    engine: InferenceEngine,
    model: str,
    *,
    local_model: Optional[str] = None,
    local_endpoint: Optional[str] = None,
    cloud_endpoint: str = "anthropic",
    cfg: Optional[Dict[str, Any]] = None,
    bus: Optional[Any] = None,
    temperature: Optional[float] = None,
    max_tokens: Optional[int] = None,
) -> None:
    super().__init__(
        engine,
        model,
        bus=bus,
        temperature=temperature,
        max_tokens=max_tokens,
    )
    self._cloud_model = model
    self._cloud_endpoint = (cloud_endpoint or "anthropic").lower()
    self._local_model = local_model
    self._local_endpoint = local_endpoint
    self._cfg: Dict[str, Any] = dict(cfg or {})

Functions