Configuration¶
OpenJarvis uses a TOML configuration file to control engine selection, model identity, memory backends, agent behavior, and more. This page is the complete reference for every configuration option, organized by primitive.
Config File Location¶
The configuration file lives at:
OpenJarvis creates the ~/.openjarvis/ directory and populates it with a default config when you run jarvis init.
Generating Configuration¶
First-Time Setup¶
This command:
- Runs hardware auto-detection (GPU vendor/model/VRAM, CPU brand/cores, RAM)
- Selects the recommended engine based on your hardware
- Writes
~/.openjarvis/config.tomlwith sensible defaults
Regenerating Configuration¶
To overwrite an existing config:
Warning
--force overwrites your existing config file. Back up your config first if you have custom settings.
Configuration Sections¶
The config file is organized into TOML sections corresponding to the five primitives. Every field has a default value, so you only need to specify values you want to change.
[engine] — Inference Engine¶
Controls which inference engine is used and how each engine is reached. Engine settings are now nested under per-engine sub-sections instead of flat fields.
[engine]
default = "ollama"
[engine.ollama]
host = "http://localhost:11434"
[engine.vllm]
host = "http://localhost:8000"
[engine.sglang]
host = "http://localhost:30000"
# [engine.llamacpp]
# host = "http://localhost:8080"
# binary_path = ""
[engine] top-level:
| Field | Type | Default | Description |
|---|---|---|---|
default |
string | Auto-detected | Default engine backend. One of: ollama, vllm, llamacpp, sglang, cloud. Set automatically by jarvis init based on hardware detection. |
[engine.ollama]:
| Field | Type | Default | Description |
|---|---|---|---|
host |
string | http://localhost:11434 |
Base URL for the Ollama API server. |
[engine.vllm]:
| Field | Type | Default | Description |
|---|---|---|---|
host |
string | http://localhost:8000 |
Base URL for the vLLM OpenAI-compatible server. |
[engine.sglang]:
| Field | Type | Default | Description |
|---|---|---|---|
host |
string | http://localhost:30000 |
Base URL for the SGLang server. |
[engine.llamacpp]:
| Field | Type | Default | Description |
|---|---|---|---|
host |
string | http://localhost:8080 |
Base URL for the llama.cpp HTTP server (llama-server). |
binary_path |
string | "" |
Path to the llama.cpp binary, if not on $PATH. |
Engine fallback
If the configured default engine is unreachable, OpenJarvis automatically probes all registered engines and falls back to any healthy one.
Backward compatibility
The old flat field names (ollama_host, vllm_host, llamacpp_host, llamacpp_path, sglang_host) are still accepted as backward-compatible properties. New configurations should use the nested sub-section format.
[intelligence] — Model Identity and Generation Defaults¶
Controls which model is used, its weight paths, quantization, and the default sampling parameters for generation. Generation parameters such as temperature and max_tokens now live here rather than under [agent].
[intelligence]
default_model = ""
fallback_model = ""
# model_path = ""
# checkpoint_path = ""
# quantization = "none"
# preferred_engine = ""
# provider = ""
temperature = 0.7
max_tokens = 1024
# top_p = 0.9
# top_k = 40
# repetition_penalty = 1.0
# stop_sequences = ""
Model identity fields:
| Field | Type | Default | Description |
|---|---|---|---|
default_model |
string | "" |
Preferred model identifier (e.g., qwen3:8b). When empty, the router policy selects the model dynamically. |
fallback_model |
string | "" |
Model to use if the default is unavailable. |
model_path |
string | "" |
Path or HuggingFace repo ID for local weights (e.g., "./models/qwen3-8b.gguf" or "Qwen/Qwen3-8B"). |
checkpoint_path |
string | "" |
Path to a fine-tuned checkpoint or LoRA adapter directory. |
quantization |
string | "none" |
Quantization format. Accepted values: none, fp8, int8, int4, gguf_q4, gguf_q8. |
preferred_engine |
string | "" |
Override engine for this model (e.g., "vllm"). Takes priority over engine.default. |
provider |
string | "" |
Model provider hint: local, openai, anthropic, google. Used by the Cloud engine to route API calls. |
Generation default fields (overridable per-call):
| Field | Type | Default | Description |
|---|---|---|---|
temperature |
float | 0.7 |
Sampling temperature. Lower values produce more deterministic output. |
max_tokens |
int | 1024 |
Maximum number of tokens to generate per call. |
top_p |
float | 0.9 |
Nucleus sampling probability mass. |
top_k |
int | 40 |
Top-k sampling: only consider the top-k tokens at each step. |
repetition_penalty |
float | 1.0 |
Penalize repeated tokens. Values > 1 reduce repetition. |
stop_sequences |
string | "" |
Comma-separated stop strings. Generation halts when any stop string is produced. |
When both default_model and fallback_model are empty, OpenJarvis uses the configured router policy (see [learning]) to select a model from those available on the active engine.
Engine Selection Priority¶
When resolving which engine to use for a model, SystemBuilder, sdk.py, and cli/ask.py check fields in this order:
1. Explicit --engine CLI flag or engine_key= SDK parameter
2. config.intelligence.preferred_engine
3. config.engine.default
4. First healthy engine discovered at runtime
This lets you pin a specific model to a specific engine without changing the global engine default:
[engine]
default = "ollama"
[intelligence]
default_model = "llama3.2:3b"
model_path = "./models/llama-3.2-3b.Q4_K_M.gguf"
quantization = "gguf_q4"
preferred_engine = "llamacpp"
[agent] — Agent Behavior¶
Controls the default agent, turn limits, tool selection, system prompt, and memory context injection.
[agent]
default_agent = "simple"
max_turns = 10
# tools = ""
# objective = ""
# system_prompt = ""
# system_prompt_path = ""
context_from_memory = true
| Field | Type | Default | Description |
|---|---|---|---|
default_agent |
string | "simple" |
Default agent to use. Available: simple, orchestrator, react, operative, monitor_operative. |
max_turns |
int | 10 |
Maximum number of tool-calling turns for the orchestrator agent before it must produce a final answer. |
tools |
string | "" |
Comma-separated list of tools to enable by default (e.g., "calculator,think"). |
objective |
string | "" |
Concise purpose string for routing, learning, and documentation. |
system_prompt |
string | "" |
Inline system prompt. Takes precedence over system_prompt_path when set. |
system_prompt_path |
string | "" |
Path to a system prompt file (.txt or .md). |
context_from_memory |
bool | true |
Whether to automatically inject relevant memory context into queries. |
Generation parameters moved
temperature and max_tokens have moved from [agent] to [intelligence]. Old configs with these fields under [agent] are automatically migrated to [intelligence] at load time.
Backward compatibility
The old field name default_tools is still accepted as a backward-compatible property for tools. New configurations should use tools.
Context injection
When context_from_memory = true and documents have been indexed, every query automatically searches memory for relevant chunks and prepends them as system context. This gives the model access to your indexed knowledge base without any extra steps. Disable with --no-context on the CLI or context=False in the SDK.
[learning] — Learning Policies¶
Controls whether the learning system is enabled and configures per-primitive policies through nested sub-sections.
[learning]
enabled = false
update_interval = 100
# auto_update = false
[learning.routing]
policy = "heuristic"
# min_samples = 5
# [learning.intelligence]
# policy = "none"
# [learning.agent]
# policy = "none"
# [learning.metrics]
# accuracy_weight = 0.6
# latency_weight = 0.2
# cost_weight = 0.1
# efficiency_weight = 0.1
[learning] top-level:
| Field | Type | Default | Description |
|---|---|---|---|
enabled |
bool | false |
Whether the learning system is active. |
update_interval |
int | 100 |
Number of traces between automatic policy updates. |
auto_update |
bool | false |
Whether to trigger policy updates automatically when the interval is reached. |
[learning.routing] — Router policy:
| Field | Type | Default | Description |
|---|---|---|---|
policy |
string | "heuristic" |
Router policy for model selection. Available: heuristic, learned (trace-driven), sft (supervised fine-tuning), grpo (RL stub). |
min_samples |
int | 5 |
Minimum number of traces required before trusting a learned routing decision. |
[learning.intelligence] — Intelligence learning policy:
| Field | Type | Default | Description |
|---|---|---|---|
policy |
string | "none" |
Intelligence learning policy. Available: none, sft. Use sft to learn model routing from accumulated traces. |
[learning.agent] — Agent learning policy:
| Field | Type | Default | Description |
|---|---|---|---|
policy |
string | "none" |
Agent learning policy. Available: none, agent_advisor, icl_updater. |
max_icl_examples |
int | 20 |
Maximum number of in-context examples to maintain in the ICL example library. |
advisor_confidence_threshold |
float | 0.7 |
Minimum confidence score for the advisor to recommend a strategy change. |
[learning.metrics] — Reward / optimization metric weights:
| Field | Type | Default | Description |
|---|---|---|---|
accuracy_weight |
float | 0.6 |
Weight for outcome accuracy in the composite reward score. |
latency_weight |
float | 0.2 |
Weight for inference latency in the composite reward score. |
cost_weight |
float | 0.1 |
Weight for per-call cost in the composite reward score. |
efficiency_weight |
float | 0.1 |
Weight for token efficiency in the composite reward score. |
Router policies:
| Policy | Description |
|---|---|
heuristic |
Rule-based selection using 6 priority rules. Considers model availability, parameter count, context length, and query characteristics. Default. |
learned |
Trace-driven policy that learns from past interaction outcomes stored in the trace system. |
sft |
Supervised fine-tuning policy that learns routing from labeled trace data. |
grpo |
Group Relative Policy Optimization stub for future RL-based routing. |
Agent policies:
| Policy | Description |
|---|---|
agent_advisor |
Advises on agent strategy (tool sets, turn limits) based on trace patterns. |
icl_updater |
In-context learning updater — discovers reusable ICL examples and multi-tool skill sequences from traces. |
You can also override the router policy per-query via the CLI:
Backward compatibility
The old flat field names default_policy, intelligence_policy, agent_policy, and the comma-separated reward_weights string are still accepted as backward-compatible properties. New configurations should use the nested sub-section format. The tools_policy field has been removed; use learning.agent.policy = "icl_updater" instead.
[tools.storage] — Storage Backend¶
Controls the storage backend used for document memory and context injection. The context_injection field has moved to agent.context_from_memory.
[tools.storage]
default_backend = "sqlite"
db_path = "~/.openjarvis/memory.db"
context_top_k = 5
context_min_score = 0.1
context_max_tokens = 2048
chunk_size = 512
chunk_overlap = 64
| Field | Type | Default | Description |
|---|---|---|---|
default_backend |
string | "sqlite" |
Storage backend. Available: sqlite (FTS5), faiss, colbert, bm25, hybrid. |
db_path |
string | ~/.openjarvis/memory.db |
Path to the SQLite memory database. Used by the sqlite backend. |
context_top_k |
int | 5 |
Number of top memory results to inject as context. |
context_min_score |
float | 0.1 |
Minimum relevance score for a memory result to be included in context. |
context_max_tokens |
int | 2048 |
Maximum number of tokens to use for injected context. |
chunk_size |
int | 512 |
Size of document chunks (in tokens) when indexing documents. |
chunk_overlap |
int | 64 |
Overlap between adjacent chunks (in tokens) when indexing. |
Memory backends:
| Backend | Extra Required | Description |
|---|---|---|
sqlite |
None | SQLite with FTS5 full-text search. Zero dependencies. Default. |
faiss |
memory-faiss |
Facebook AI Similarity Search with sentence-transformer embeddings. |
colbert |
memory-colbert |
ColBERTv2 late-interaction retrieval. Requires PyTorch. |
bm25 |
memory-bm25 |
BM25 sparse retrieval via rank-bm25. |
hybrid |
Depends on sub-backends | Reciprocal Rank Fusion combining multiple backends. |
Backward compatibility
The [memory] TOML section is still supported and maps to [tools.storage]. New configurations should use [tools.storage]. The context_injection field under [memory] or [tools.storage] is automatically migrated to agent.context_from_memory at load time.
[tools.mcp] — MCP (Model Context Protocol)¶
Controls the MCP server and external MCP tool provider integration. The MCP adapter supports protocol version 2025-11-25.
| Field | Type | Default | Description |
|---|---|---|---|
enabled |
bool | true |
Whether to enable the MCP adapter for exposing and consuming tools via MCP. |
servers |
string | "" |
JSON-encoded list of external MCP server configuration objects. |
[server] — API Server¶
Controls the OpenAI-compatible API server started by jarvis serve.
| Field | Type | Default | Description |
|---|---|---|---|
host |
string | "0.0.0.0" |
Bind address for the server. Use "127.0.0.1" to restrict to localhost. |
port |
int | 8000 |
Port number for the server. |
agent |
string | "orchestrator" |
Agent to use for chat completion requests. |
model |
string | "" |
Default model for the server. When empty, uses intelligence.default_model or the first available model. |
workers |
int | 1 |
Number of uvicorn worker processes. |
CLI options override config values:
[telemetry] — Telemetry Persistence¶
Controls whether inference telemetry is recorded and where it is stored.
| Field | Type | Default | Description |
|---|---|---|---|
enabled |
bool | true |
Whether to record telemetry for each inference call. Records timing, token counts, model, engine, and cost. |
db_path |
string | ~/.openjarvis/telemetry.db |
Path to the SQLite telemetry database. |
Telemetry is local-only
All telemetry data is stored locally in a SQLite database. No data is ever sent to external services.
[traces] — Trace Recording¶
Controls the trace system that records full interaction sequences for the learning system.
| Field | Type | Default | Description |
|---|---|---|---|
enabled |
bool | false |
Whether to record traces for each agent interaction. |
db_path |
string | ~/.openjarvis/traces.db |
Path to the SQLite trace database. |
[channel] — Channel Messaging¶
Controls the channel messaging bridge for multi-platform communication. Each supported platform has its own nested sub-section.
[channel]
enabled = false
default_channel = ""
default_agent = "simple"
# [channel.telegram]
# bot_token = ""
# [channel.discord]
# bot_token = ""
# [channel.slack]
# bot_token = ""
# app_token = ""
# [channel.webhook]
# url = ""
# secret = ""
# method = "POST"
| Field | Type | Default | Description |
|---|---|---|---|
enabled |
bool | false |
Whether to enable channel messaging support. |
default_channel |
string | "" |
Default channel to use when not specified. |
default_agent |
string | "simple" |
Default agent for handling channel messages. |
[security] — Security Guardrails¶
Controls the security scanning pipeline for input/output content.
[security]
enabled = true
mode = "warn"
scan_input = true
scan_output = true
secret_scanner = true
pii_scanner = true
enforce_tool_confirmation = true
| Field | Type | Default | Description |
|---|---|---|---|
enabled |
bool | true |
Whether to enable security guardrails. |
mode |
string | "warn" |
Action on findings: "warn" (log only), "redact" (replace sensitive content), or "block" (raise error). |
scan_input |
bool | true |
Whether to scan user input messages. |
scan_output |
bool | true |
Whether to scan model output. |
secret_scanner |
bool | true |
Enable secret detection (API keys, tokens, passwords). |
pii_scanner |
bool | true |
Enable PII detection (emails, SSNs, credit cards). |
enforce_tool_confirmation |
bool | true |
Require confirmation before executing tools. |
Choosing a security mode
Use "warn" during development to see what would be flagged without disrupting output.
Use "redact" in production to automatically sanitize sensitive content.
Use "block" for strict environments where any sensitive data should halt generation.
Hardware Auto-Detection¶
When you run jarvis init, OpenJarvis probes your system to detect available hardware. The detection runs in this order:
GPU Detection¶
-
NVIDIA GPU — Checks for
nvidia-smion$PATH. If found, queries GPU name, VRAM (in MB), and GPU count via: -
AMD GPU — Checks for
rocm-smion$PATH. If found, queries the product name via: -
Apple Silicon — On macOS only. Runs
system_profiler SPDisplaysDataTypeand looks for "Apple" in the chipset model line.
If none of these detect a GPU, the system is treated as CPU-only.
CPU and RAM Detection¶
- CPU brand: Reads from
sysctl -n machdep.cpu.brand_stringon macOS, or parsesmodel namefrom/proc/cpuinfoon Linux. - CPU count: Uses Python's
os.cpu_count(). - RAM: Reads from
sysctl -n hw.memsizeon macOS, or parsesMemTotalfrom/proc/meminfoon Linux.
Detected Hardware Dataclass¶
The detection result is stored as a HardwareInfo dataclass:
@dataclass
class HardwareInfo:
platform: str # "linux", "darwin", "windows"
cpu_brand: str # e.g., "AMD EPYC 7763"
cpu_count: int # e.g., 128
ram_gb: float # e.g., 512.0
gpu: GpuInfo | None
@dataclass
class GpuInfo:
vendor: str # "nvidia", "amd", "apple"
name: str # e.g., "NVIDIA A100-SXM4-80GB"
vram_gb: float # e.g., 80.0
compute_capability: str # (NVIDIA only)
count: int # e.g., 8
Engine Recommendation Logic¶
Based on the detected hardware, recommend_engine() selects the optimal default engine:
graph TD
A[detect_hardware] --> B{GPU detected?}
B -->|No| C[llamacpp]
B -->|Yes| D{GPU vendor?}
D -->|Apple| E[ollama]
D -->|NVIDIA| F{Datacenter GPU?}
D -->|AMD| G[vllm]
F -->|Yes: A100, H100, H200, L40, A10, A30| H[vllm]
F -->|No: consumer GPU| I[ollama]
| Hardware | Recommended Engine | Reason |
|---|---|---|
| No GPU | llamacpp |
Efficient CPU inference with GGUF quantized models |
| Apple Silicon | ollama |
Native Metal acceleration, easy model management |
| NVIDIA consumer GPU (RTX 3090, 4090, etc.) | ollama |
Simple setup, good performance for single-user |
| NVIDIA datacenter GPU (A100, H100, H200, L40, A10, A30) | vllm |
High-throughput batched serving, continuous batching |
| AMD GPU | vllm |
ROCm support via vLLM |
Example Configurations¶
Apple Silicon Mac¶
# ~/.openjarvis/config.toml
# Apple Silicon MacBook Pro (M3 Max, 128 GB unified memory)
[engine]
default = "ollama"
[engine.ollama]
host = "http://localhost:11434"
[intelligence]
default_model = "qwen3:8b"
fallback_model = "llama3.2:3b"
temperature = 0.7
max_tokens = 1024
[agent]
default_agent = "simple"
max_turns = 10
context_from_memory = true
[tools.storage]
default_backend = "sqlite"
[server]
host = "127.0.0.1"
port = 8000
agent = "orchestrator"
[learning]
enabled = false
[learning.routing]
policy = "heuristic"
[telemetry]
enabled = true
NVIDIA Datacenter (Multi-GPU)¶
# ~/.openjarvis/config.toml
# 8x NVIDIA A100 80GB server
[engine]
default = "vllm"
[engine.vllm]
host = "http://localhost:8000"
[engine.ollama]
host = "http://localhost:11434"
[intelligence]
default_model = "Qwen/Qwen2.5-72B-Instruct"
fallback_model = "Qwen/Qwen2.5-7B-Instruct"
temperature = 0.5
max_tokens = 4096
[agent]
default_agent = "orchestrator"
max_turns = 15
tools = "calculator,think,retrieval"
context_from_memory = true
[tools.storage]
default_backend = "faiss"
context_top_k = 10
context_min_score = 0.05
context_max_tokens = 4096
chunk_size = 1024
chunk_overlap = 128
[server]
host = "0.0.0.0"
port = 8000
agent = "orchestrator"
model = "Qwen/Qwen2.5-72B-Instruct"
workers = 1
[learning]
enabled = false
[learning.routing]
policy = "heuristic"
[telemetry]
enabled = true
CPU-Only (No GPU)¶
# ~/.openjarvis/config.toml
# CPU-only machine
[engine]
default = "llamacpp"
[engine.llamacpp]
host = "http://localhost:8080"
[intelligence]
default_model = ""
fallback_model = ""
temperature = 0.7
max_tokens = 512
[agent]
default_agent = "simple"
max_turns = 5
context_from_memory = true
[tools.storage]
default_backend = "sqlite"
context_top_k = 3
context_max_tokens = 1024
chunk_size = 256
chunk_overlap = 32
[server]
host = "127.0.0.1"
port = 8000
[learning]
enabled = false
[learning.routing]
policy = "heuristic"
[telemetry]
enabled = true
Trace-Driven Learning Enabled¶
# ~/.openjarvis/config.toml
# Research setup with trace-driven learning active
[engine]
default = "ollama"
[engine.ollama]
host = "http://localhost:11434"
[intelligence]
default_model = "qwen3:8b"
temperature = 0.7
max_tokens = 1024
[agent]
default_agent = "orchestrator"
max_turns = 10
context_from_memory = true
[tools.storage]
default_backend = "sqlite"
[learning]
enabled = true
update_interval = 50
auto_update = true
[learning.routing]
policy = "learned"
min_samples = 10
[learning.intelligence]
policy = "sft"
[learning.agent]
policy = "agent_advisor"
advisor_confidence_threshold = 0.8
[learning.metrics]
accuracy_weight = 0.6
latency_weight = 0.2
cost_weight = 0.1
efficiency_weight = 0.1
[traces]
enabled = true
[telemetry]
enabled = true
Migration Guide¶
If you have an existing ~/.openjarvis/config.toml from a previous version, here is what changed and how to update it.
Engine: Nested Sub-Sections¶
Note
The old flat names still work as backward-compatible properties. You only need to update your config if you want to use the new fields (e.g., binary_path).
Intelligence: Generation Parameters¶
Note
Old configs with temperature or max_tokens under [agent] are automatically migrated to [intelligence] at load time. No manual update is required, but updating is recommended for clarity.
Agent: Renamed and Added Fields¶
The default_tools name still works via a backward-compatible property.
Memory: Context Injection Moved¶
Note
context_injection under [memory] or [tools.storage] is automatically migrated to agent.context_from_memory at load time.
Learning: Nested Sub-Sections¶
Note
The flat field names default_policy, intelligence_policy, agent_policy, and reward_weights are still accepted as backward-compatible properties. The tools_policy field has been removed; use learning.agent.policy = "icl_updater" instead.
Programmatic Configuration¶
You can configure OpenJarvis entirely from Python without a TOML file:
from openjarvis import Jarvis
from openjarvis.core.config import (
AgentConfig,
EngineConfig,
IntelligenceConfig,
JarvisConfig,
LearningConfig,
OllamaEngineConfig,
StorageConfig,
ToolsConfig,
)
config = JarvisConfig(
engine=EngineConfig(
default="ollama",
ollama=OllamaEngineConfig(host="http://my-server:11434"),
),
intelligence=IntelligenceConfig(
default_model="qwen3:8b",
temperature=0.7,
max_tokens=2048,
),
agent=AgentConfig(
default_agent="orchestrator",
max_turns=15,
context_from_memory=True,
),
tools=ToolsConfig(
storage=StorageConfig(
default_backend="sqlite",
context_top_k=10,
),
),
)
j = Jarvis(config=config)
response = j.ask("Hello")
j.close()
Or load from a custom path:
Environment Variables¶
OpenJarvis respects the following environment variables:
| Variable | Description |
|---|---|
OPENAI_API_KEY |
API key for OpenAI cloud inference. Required for the cloud engine with OpenAI models. |
ANTHROPIC_API_KEY |
API key for Anthropic cloud inference. Required for the cloud engine with Claude models. |
GOOGLE_API_KEY |
API key for Google Gemini inference. Required for the google engine. |
TAVILY_API_KEY |
API key for the Tavily web search tool. Required for the web_search tool. |
Next Steps¶
- Quick Start — Run your first query
- CLI Reference — Full reference for all CLI commands
- Architecture Overview — Understand how the pieces fit together
- Intelligence Primitive — Model identity and generation defaults
- Learning & Traces — Router policies and the trace-driven feedback loop