skillorchestra

SkillOrchestraAgent — inference-time router (Wang et al., 2026).

Paper: arXiv:2602.19672. The published pipeline has four phases: explore (run every pool model, collect traces), learn (induce a skill handbook with per-agent Beta competences and per-skill cost stats), select (choose a Pareto-optimal handbook subset on a live val set), and test. At deployment, the orchestrator reads the user query, infers skill demands, then picks the agent that maximizes weighted competence minus λ·cost.
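As a rough illustration of what a "per-agent Beta competence" could look like, the sketch below keeps a Beta(alpha, beta) belief over one agent's success rate on one skill and updates it from pass/fail traces. The class name `BetaCompetence`, the helper `competence_from_traces`, and the uniform Beta(1, 1) prior are all assumptions made for this example; they are not taken from the paper or from the OpenJarvis code.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class BetaCompetence:
    """Hypothetical Beta(alpha, beta) belief over success rate on one skill."""

    alpha: float = 1.0  # prior pseudo-successes (uniform prior is an assumption)
    beta: float = 1.0   # prior pseudo-failures

    def update(self, success: bool) -> None:
        # Conjugate update: one more success or one more failure.
        if success:
            self.alpha += 1.0
        else:
            self.beta += 1.0

    @property
    def mean(self) -> float:
        # Posterior mean of the Beta distribution.
        return self.alpha / (self.alpha + self.beta)


def competence_from_traces(outcomes: Iterable[bool]) -> BetaCompetence:
    """Fold a sequence of pass/fail trace outcomes into one belief."""
    belief = BetaCompetence()
    for ok in outcomes:
        belief.update(ok)
    return belief
```

With three successes and one failure, the belief becomes Beta(4, 2), so `competence_from_traces([True, True, True, False]).mean` is 4/6. With no traces at all, the mean stays at the prior value of 0.5.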

What we reproduce here: the deployment-time step only. The full explore/learn/select pipeline requires multi-model serving, the FRAMES wiki retriever, and a multi-hour LLM-driven learning loop, all of which are out of scope for the OpenJarvis port (and were out of scope in the hybrid harness).

So this agent uses the orchestrator's inference logic with a small handbook that is synthesized per task on the fly: the cloud model (Opus) reads the question, identifies which skills it needs (from a fixed catalog), assigns a weight to each skill, scores each of our two agents (local Qwen-27B vs cloud Opus) under a cost-discounted weighted-competence rule, then routes. The chosen agent answers the question.
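The routing rule described above can be sketched as follows. This is a minimal illustration, not the adapter's actual code: the function name `route`, the skill names, the competence and cost numbers, and the choice to normalize weights by their sum are all assumptions made for the example.

```python
from typing import Dict, Tuple


def route(
    skill_weights: Dict[str, float],
    competence: Dict[str, Dict[str, float]],
    cost: Dict[str, float],
    lam: float = 1.0,
) -> Tuple[str, Dict[str, float]]:
    """Pick the agent maximizing weighted competence minus lam * cost.

    skill_weights: skill -> weight assigned for this question.
    competence:    agent -> (skill -> estimated success rate in [0, 1]).
    cost:          agent -> per-task cost, in the same units lam expects.
    """
    total = sum(skill_weights.values())
    if total <= 0:
        raise ValueError("skill weights must sum to a positive value")

    scores: Dict[str, float] = {}
    for agent, per_skill in competence.items():
        # Weighted average competence; unknown skills count as 0.0.
        weighted = sum(
            w * per_skill.get(skill, 0.0) for skill, w in skill_weights.items()
        ) / total
        scores[agent] = weighted - lam * cost[agent]

    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical per-task handbook for a two-agent pool.
weights = {"web_search": 0.7, "arithmetic": 0.3}
competence = {
    "local": {"web_search": 0.5, "arithmetic": 0.8},
    "cloud": {"web_search": 0.7, "arithmetic": 0.9},
}
cost = {"local": 0.0, "cloud": 0.66}

chosen, scores = route(weights, competence, cost, lam=0.5)
```

With these made-up numbers, the cloud agent has the higher weighted competence, but at `lam=0.5` its cost penalty pushes its score below the local agent's, so the question is routed locally. Setting `lam=0.0` removes the cost term and the same inputs route to the cloud agent.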

Hybrid harness result (n=30 GAIA): skillorchestra-gaia-qwen27b-opus-30 = 0.500 acc, $0.02/task. That is roughly 30× cheaper than baseline-cloud (0.567 / $0.66) at ~7pp lower accuracy, making it the most cost-efficient GAIA paradigm by a wide margin.

Ported from hybrid-local-cloud-compute/adapters/skillorchestra_adapter.py.

Classes

SkillOrchestraAgent

SkillOrchestraAgent(engine: InferenceEngine, model: str, *, local_model: Optional[str] = None, local_endpoint: Optional[str] = None, cloud_endpoint: str = 'anthropic', cfg: Optional[Dict[str, Any]] = None, bus: Optional[Any] = None, temperature: Optional[float] = None, max_tokens: Optional[int] = None)

Bases: LocalCloudAgent

Inference-time skill-aware router. See module docstring.

Source code in src/openjarvis/agents/hybrid/_base.py
def __init__(
    self,
    engine: InferenceEngine,
    model: str,
    *,
    local_model: Optional[str] = None,
    local_endpoint: Optional[str] = None,
    cloud_endpoint: str = "anthropic",
    cfg: Optional[Dict[str, Any]] = None,
    bus: Optional[Any] = None,
    temperature: Optional[float] = None,
    max_tokens: Optional[int] = None,
) -> None:
    super().__init__(
        engine,
        model,
        bus=bus,
        temperature=temperature,
        max_tokens=max_tokens,
    )
    self._cloud_model = model
    self._cloud_endpoint = (cloud_endpoint or "anthropic").lower()
    self._local_model = local_model
    self._local_endpoint = local_endpoint
    self._cfg: Dict[str, Any] = dict(cfg or {})
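
The constructor above applies two small defaults: a missing or empty `cloud_endpoint` falls back to `"anthropic"` and is lowercased, and `cfg` is copied into a fresh dict. The standalone sketch below reproduces just those two expressions so their behavior can be checked in isolation; the helper name `normalize_router_args` is hypothetical and does not exist in OpenJarvis.

```python
from typing import Any, Dict, Optional, Tuple


def normalize_router_args(
    cloud_endpoint: Optional[str],
    cfg: Optional[Dict[str, Any]],
) -> Tuple[str, Dict[str, Any]]:
    """Mirror the endpoint and cfg handling from the __init__ shown above."""
    # None or "" falls back to "anthropic"; the result is always lowercase.
    endpoint = (cloud_endpoint or "anthropic").lower()
    # Shallow copy: later top-level changes to the caller's dict are not seen.
    cfg_copy = dict(cfg or {})
    return endpoint, cfg_copy


caller_cfg = {"lambda": 0.5}  # hypothetical config key, for illustration only
endpoint, stored_cfg = normalize_router_args("ANTHROPIC", caller_cfg)
caller_cfg["lambda"] = 2.0    # does not affect stored_cfg
```

Because the copy is shallow, nested objects inside `cfg` would still be shared between the caller and the agent.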

Functions