CLI Reference¶

OpenJarvis provides a command-line interface through the jarvis command. Built on Click, it offers subcommands for querying models, managing memory, running benchmarks, and serving an OpenAI-compatible API.

Global Options¶

jarvis --version   # Print the OpenJarvis version
jarvis --help      # Show top-level help with all subcommands

`jarvis init`¶

Detect local hardware (CPU, GPU, RAM) and generate a configuration file at ~/.openjarvis/config.toml.

jarvis init           # Interactive — refuses to overwrite existing config
jarvis init --force   # Overwrite existing config without prompting

Option	Description
`--force`	Overwrite existing configuration without prompting

The init command auto-detects:

Platform (Linux, macOS, Windows)
CPU brand and core count
RAM in GB
GPU vendor, model, VRAM, and count (via nvidia-smi, rocm-smi, or system_profiler)

Based on the detected hardware, it recommends an appropriate inference engine and writes a pre-configured TOML file.

Example output:

Detecting hardware...
  Platform : linux
  CPU      : AMD Ryzen 9 7950X (32 cores)
  RAM      : 64 GB
  GPU      : NVIDIA RTX 4090 (24.0 GB VRAM, x1)

Config written successfully.

`jarvis ask`¶

Send a query to the inference engine (directly or through an agent) and print the response.

jarvis ask "What is the capital of France?"

Options¶

Option	Type	Default	Description
`-m`, `--model MODEL`	string	auto	Model to use for inference
`-e`, `--engine ENGINE`	string	auto	Engine backend (ollama, vllm, llamacpp, etc.)
`-t`, `--temperature TEMP`	float	`0.7`	Sampling temperature
`--max-tokens N`	int	`1024`	Maximum tokens to generate
`--json`	flag	off	Output raw JSON result instead of plain text
`--no-stream`	flag	off	Disable streaming (synchronous mode)
`--no-context`	flag	off	Disable memory context injection
`-a`, `--agent AGENT`	string	none	Agent to use (`simple`, `orchestrator`)
`--tools TOOLS`	string	none	Comma-separated tool names to enable

Direct Mode vs Agent Mode¶

Direct mode (default) sends the query straight to the inference engine:

jarvis ask "Explain quantum computing"

Agent mode routes the query through an agent that can use tools and manage multi-turn interactions:

jarvis ask --agent orchestrator "What is 2+2?"
jarvis ask --agent orchestrator --tools calculator,think "Calculate sqrt(144) + 3^2"
jarvis ask --agent simple "Hello"

Usage Examples¶

# Basic query
jarvis ask "What is machine learning?"

# Specify a model
jarvis ask -m qwen3:8b "Summarize this concept"

# Use the orchestrator agent with tools
jarvis ask --agent orchestrator --tools calculator "What is 15% of 340?"

# Get JSON output
jarvis ask --json "Hello"

# Disable memory context injection
jarvis ask --no-context "Tell me about Python"

# Set maximum token generation
jarvis ask --max-tokens 2048 "Write a detailed essay about AI"

JSON Output Format¶

When using --json in direct mode, the output includes:

{
  "content": "The response text...",
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 85,
    "total_tokens": 97
  }
}

When using --json in agent mode, the output includes:

{
  "content": "The response text...",
  "turns": 3,
  "tool_results": [
    {
      "tool_name": "calculator",
      "content": "51.0",
      "success": true
    }
  ]
}

`jarvis model`¶

Manage and inspect language models available on running engines.

`jarvis model list`¶

List all models available from running inference engines, displayed as a Rich table with model parameters, context length, and VRAM requirements.

jarvis model list

Example output:

           Available Models
┌─────────┬────────────────┬────────┬─────────┬──────┐
│ Engine  │ Model          │ Params │ Context │ VRAM │
├─────────┼────────────────┼────────┼─────────┼──────┤
│ ollama  │ qwen3:8b       │ 8B     │ 32,768  │ 6GB  │
│ ollama  │ llama3.2:3b    │ 3B     │ 8,192   │ 3GB  │
└─────────┴────────────────┴────────┴─────────┴──────┘

`jarvis model info <model>`¶

Show detailed information about a specific model.

jarvis model info qwen3:8b

Example output:

┌─ Qwen 3 8B ──────────────────────────────┐
│ Model ID:     qwen3:8b                    │
│ Name:         Qwen 3 8B                   │
│ Parameters:   8B                          │
│ Context:      32,768                      │
│ Quantization: none                        │
│ Min VRAM:     6GB                         │
│ Engines:      ollama, vllm                │
│ Provider:     Alibaba                     │
│ API Key:      not required                │
└───────────────────────────────────────────┘

`jarvis model pull <model>`¶

Download a model via Ollama. Shows a progress bar during download.

jarvis model pull qwen3:8b

Note

The pull command requires a running Ollama instance. It connects to the Ollama API at the host configured in your config.toml.

`jarvis memory`¶

Manage the document memory store for retrieval-augmented generation.

`jarvis memory index <path>`¶

Index documents from a file or directory into the memory store.

jarvis memory index ./docs/
jarvis memory index ./notes.md
jarvis memory index ./data/ --chunk-size 256 --chunk-overlap 32
jarvis memory index ./docs/ --backend sqlite

Option	Type	Default	Description
`--backend`, `-b`	string	config	Override the default memory backend
`--chunk-size`	int	`512`	Chunk size in tokens
`--chunk-overlap`	int	`64`	Overlap between chunks in tokens

The ingestion pipeline supports text, markdown, code files, and PDF (with pdfplumber installed). Binary files and hidden directories are automatically skipped.

`jarvis memory search <query>`¶

Search the memory store for relevant document chunks.

jarvis memory search "machine learning basics"
jarvis memory search -k 10 "neural networks"
jarvis memory search --backend faiss "embeddings"

Option	Type	Default	Description
`--top-k`, `-k`	int	`5`	Number of results to return
`--backend`, `-b`	string	config	Override the default memory backend

Results are displayed in a table with rank, score, source file, and a content preview.

`jarvis memory stats`¶

Show memory store statistics including document count and database size.

jarvis memory stats
jarvis memory stats --backend sqlite

Option	Type	Default	Description
`--backend`, `-b`	string	config	Override the default memory backend

`jarvis telemetry`¶

Query and manage inference telemetry data stored in SQLite.

`jarvis telemetry stats`¶

Show aggregated telemetry statistics including total calls, tokens, cost, and latency, broken down by model and engine.

jarvis telemetry stats
jarvis telemetry stats -n 5    # Show top 5 models

Option	Type	Default	Description
`-n`, `--top`	int	`10`	Number of top models to show

`jarvis telemetry export`¶

Export raw telemetry records in JSON or CSV format.

jarvis telemetry export                          # JSON to stdout
jarvis telemetry export --format csv             # CSV to stdout
jarvis telemetry export --format json -o data.json  # JSON to file
jarvis telemetry export -f csv -o metrics.csv    # CSV to file

Option	Type	Default	Description
`-f`, `--format`	choice	`json`	Output format: `json` or `csv`
`-o`, `--output`	path	stdout	Output file path

`jarvis telemetry clear`¶

Delete all telemetry records from the database.

jarvis telemetry clear         # Interactive confirmation
jarvis telemetry clear --yes   # Skip confirmation

Option	Type	Default	Description
`-y`, `--yes`	flag	off	Skip confirmation prompt

Warning

This permanently deletes all stored telemetry data. Use --yes to skip the confirmation prompt in automated scripts.

`jarvis bench`¶

Run inference benchmarks against a running engine.

`jarvis bench run`¶

Execute benchmarks and report results.

jarvis bench run                               # Run all benchmarks, 10 samples
jarvis bench run -n 20                         # 20 samples per benchmark
jarvis bench run -b latency                    # Only the latency benchmark
jarvis bench run -b throughput -n 50 --json    # Throughput, 50 samples, JSON output
jarvis bench run -o results.jsonl              # Write JSONL results to file
jarvis bench run -m qwen3:8b -e ollama         # Specific model and engine

Option	Type	Default	Description
`-m`, `--model MODEL`	string	auto	Model to benchmark
`-e`, `--engine ENGINE`	string	auto	Engine backend
`-n`, `--samples N`	int	`10`	Number of samples per benchmark
`-b`, `--benchmark NAME`	string	all	Specific benchmark to run
`-o`, `--output PATH`	path	none	Write JSONL results to file
`--json`	flag	off	Output JSON summary to stdout

Available benchmarks:

latency -- Measures per-call inference latency (mean, p50, p95, min, max)
throughput -- Measures tokens-per-second throughput

`jarvis channel`¶

Manage messaging channels for multi-platform communication. Channels connect directly to platform APIs (Telegram, Discord, Slack, etc.) -- no gateway required.

`jarvis channel list`¶

List registered channel backends and their connection status.

jarvis channel list

`jarvis channel send`¶

Send a message to a specific channel.

jarvis channel send slack "Hello from Jarvis!"
jarvis channel send discord "Build complete"

Argument	Type	Description
`TARGET`	string	Channel name to send to
`MESSAGE`	string	Message content

`jarvis channel status`¶

Show connection status for configured channels.

jarvis channel status

Channel Dependencies

Each channel requires its platform-specific credentials (bot tokens, API keys) configured in the [channel.<platform>] section of your config. See Configuration for details.

`jarvis serve`¶

Start an OpenAI-compatible API server.

jarvis serve                                 # Default host/port from config
jarvis serve --port 8000                     # Custom port
jarvis serve --host 0.0.0.0 --port 9000      # Bind to all interfaces
jarvis serve --model qwen3:8b                # Specify default model
jarvis serve --agent orchestrator            # Route requests through an agent

Option	Type	Default	Description
`--host HOST`	string	config	Bind address
`--port PORT`	int	config	Port number
`-e`, `--engine ENGINE`	string	auto	Engine backend
`-m`, `--model MODEL`	string	config	Default model for inference
`-a`, `--agent AGENT`	string	none	Agent for non-streaming requests

Server Dependencies

The serve command requires the server extra:

uv sync --extra server

This installs FastAPI, uvicorn, and related dependencies.

API Endpoints¶

The server exposes the following OpenAI-compatible endpoints:

Method	Path	Description
POST	`/v1/chat/completions`	Chat completions (streaming & non-streaming)
GET	`/v1/models`	List available models
GET	`/health`	Health check
GET	`/v1/channels`	List available messaging channels
POST	`/v1/channels/send`	Send a message to a channel
GET	`/v1/channels/status`	Channel bridge connection status

Example with curl:

curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3:8b",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

When an agent is configured (e.g., --agent orchestrator), non-streaming requests are routed through the agent with access to all registered tools. For tool-capable agents (orchestrator, react, openhands), all registered tools are automatically loaded and made available.

CLI Reference¶

Global Options¶

jarvis init¶

jarvis ask¶

Options¶

Direct Mode vs Agent Mode¶

Usage Examples¶

JSON Output Format¶

jarvis model¶

jarvis model list¶

jarvis model info <model>¶

jarvis model pull <model>¶

jarvis memory¶

jarvis memory index <path>¶

jarvis memory search <query>¶

jarvis memory stats¶

jarvis telemetry¶

jarvis telemetry stats¶

jarvis telemetry export¶

jarvis telemetry clear¶

jarvis bench¶

jarvis bench run¶

jarvis channel¶

jarvis channel list¶

jarvis channel send¶

jarvis channel status¶

jarvis serve¶

API Endpoints¶

`jarvis init`¶

`jarvis ask`¶

`jarvis model`¶

`jarvis model list`¶

`jarvis model info <model>`¶

`jarvis model pull <model>`¶

`jarvis memory`¶

`jarvis memory index <path>`¶

`jarvis memory search <query>`¶

`jarvis memory stats`¶

`jarvis telemetry`¶

`jarvis telemetry stats`¶

`jarvis telemetry export`¶

`jarvis telemetry clear`¶

`jarvis bench`¶

`jarvis bench run`¶

`jarvis channel`¶

`jarvis channel list`¶

`jarvis channel send`¶

`jarvis channel status`¶

`jarvis serve`¶