environment
RL environment for orchestrator training.
Adapted from IPW's environment.py. Uses OpenJarvis's
`openjarvis.tools._stubs.ToolExecutor` for real tool dispatch
(as opposed to IPW's cached-telemetry approach), making it suitable for
both training and evaluation.
Classes
OrchestratorEnvironment
OrchestratorEnvironment(tools: List[BaseTool], max_turns: int = 10)
RL environment that executes tools via OpenJarvis ToolExecutor.
| PARAMETER | DESCRIPTION |
|---|---|
| tools | List of `BaseTool` objects. TYPE: `List[BaseTool]` |
| max_turns | Maximum number of turns per episode. TYPE: `int` DEFAULT: `10` |
Source code in src/openjarvis/learning/orchestrator/environment.py
Functions
reset
reset(task: str) -> EpisodeState
Reset the environment for a new episode.
Args: task: The initial task/question.
Returns: A fresh `EpisodeState`.
step
step(state: EpisodeState, action: OrchestratorAction) -> Tuple[EpisodeState, OrchestratorObservation]
Execute one step: dispatch the tool and observe the result.
Raises: ValueError: If the tool is not available or the maximum number of turns is exceeded.
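The reset/step contract documented here can be illustrated with a small self-contained sketch. It does not import OpenJarvis. `SketchEnvironment`, `run_episode`, and the fields of the stand-in `EpisodeState`, `OrchestratorAction`, and `OrchestratorObservation` classes are all hypothetical, since this page does not show the real definitions. The sketch also assumes that tools are plain callables keyed by name (the real constructor takes a `List[BaseTool]`) and that `step` returns a new state instead of mutating its argument.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class EpisodeState:
    # Hypothetical fields; the real EpisodeState is not shown on this page.
    task: str
    turn: int = 0
    history: List[Tuple[str, str]] = field(default_factory=list)


@dataclass
class OrchestratorAction:
    # Hypothetical fields: which tool to call and with what input.
    tool_name: str
    tool_input: str


@dataclass
class OrchestratorObservation:
    # Hypothetical field: the raw tool output.
    content: str


class SketchEnvironment:
    """Toy environment mirroring the documented reset/step signatures."""

    def __init__(self, tools: Dict[str, Callable[[str], str]], max_turns: int = 10):
        # Assumption: tools are plain callables keyed by name.
        self.tools = tools
        self.max_turns = max_turns

    def reset(self, task: str) -> EpisodeState:
        # Start a fresh episode for the given task.
        return EpisodeState(task=task)

    def step(
        self, state: EpisodeState, action: OrchestratorAction
    ) -> Tuple[EpisodeState, OrchestratorObservation]:
        # Both failure modes raise ValueError, as the documentation states.
        if state.turn >= self.max_turns:
            raise ValueError("max turns exceeded")
        if action.tool_name not in self.tools:
            raise ValueError(f"tool not available: {action.tool_name}")
        result = self.tools[action.tool_name](action.tool_input)
        new_state = EpisodeState(
            task=state.task,
            turn=state.turn + 1,
            history=state.history + [(action.tool_name, result)],
        )
        return new_state, OrchestratorObservation(content=result)


def run_episode(
    env: SketchEnvironment, task: str, actions: List[OrchestratorAction]
) -> Tuple[EpisodeState, List[str]]:
    # Drive one episode: reset once, then step through the given actions.
    state = env.reset(task)
    outputs: List[str] = []
    for action in actions:
        state, observation = env.step(state, action)
        outputs.append(observation.content)
    return state, outputs
```

In a training loop the actions would come from the orchestrator policy, not a fixed list. The fixed list here only keeps the sketch deterministic.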