Skip to content

Personal AI, On Personal Devices

OpenJarvis is a research framework for composable, on-device AI systems. Build personal AI that runs on your hardware. Cloud APIs are optional.


Why OpenJarvis?

Personal AI agents are exploding in popularity, but nearly all of them still route intelligence through cloud APIs. Your "personal" AI continues to depend on someone else's server. At the same time, our Intelligence Per Watt research showed that local language models already handle 88.7% of single-turn chat and reasoning queries, with intelligence efficiency improving 5.3× from 2023 to 2025. The models and hardware are increasingly ready. What has been missing is the software stack to make local-first personal AI practical.

OpenJarvis is that stack. It is an opinionated framework for local-first personal AI, built around three core ideas: shared primitives for building on-device agents; evaluations that treat energy, FLOPs, latency, and dollar cost as first-class constraints alongside accuracy; and a learning loop that improves models using local trace data. The goal is simple: make it possible to build personal AI agents that run locally by default, calling the cloud only when truly necessary. OpenJarvis aims to be both a research platform and a production foundation for local AI, in the spirit of PyTorch.


Get Started

Run the full chat UI locally with one script:

git clone https://github.com/open-jarvis/OpenJarvis.git
cd OpenJarvis
./scripts/quickstart.sh

This installs dependencies, starts Ollama + a local model, launches the backend and frontend, and opens http://localhost:5173 in your browser.

The desktop app is a native window for the OpenJarvis UI. The backend (Ollama + inference) runs on your machine — start it first, then open the app.

Step 1. Start the backend:

git clone https://github.com/open-jarvis/OpenJarvis.git
cd OpenJarvis
./scripts/quickstart.sh

Step 2. Download and open the desktop app:

Download for macOS

Also available for Windows, Linux (DEB), and Linux (RPM). See the Downloads page for details.

The app connects to http://localhost:8000 automatically.

macOS: run xattr -cr /Applications/OpenJarvis.app if the app shows as \"damaged\".

from openjarvis import Jarvis

j = Jarvis()                              # auto-detect engine
response = j.ask("Explain quicksort.")
print(response)

For more control, use ask_full() to get usage stats, model info, and tool results:

result = j.ask_full(
    "What is 2 + 2?",
    agent="orchestrator",
    tools=["calculator"],
)
print(result["content"])       # "4"
print(result["tool_results"])  # [{tool_name: "calculator", ...}]
jarvis ask "What is the capital of France?"

jarvis ask --agent orchestrator --tools calculator "What is 137 * 42?"

jarvis serve --port 8000

jarvis memory index ./docs/
jarvis memory search "configuration options"

Five Primitives

  1. Intelligence — The LM: model catalog, generation defaults, quantization, preferred engine.
  2. Agents — The agentic harness: system prompt, tools, context, retry and exit logic. Seven agent types.
  3. ToolsMCP interface: web search, calculator, file I/O, code interpreter, retrieval, and any external MCP server.
  4. Engine — The inference runtime: Ollama, vLLM, SGLang, llama.cpp, cloud APIs. Same InferenceEngine ABC.
  5. Learning — Improvement loop: SFT weight updates, agent advisor, ICL updater. Trace-driven feedback.

Key Features

  • Five Composable Primitives


    Intelligence, Agents, Tools, Engine, and Learning — each with a clear ABC interface and decorator-based registry.

  • 5 Engine Backends


    Ollama, vLLM, SGLang, llama.cpp, and cloud (OpenAI/Anthropic/Google). Same InferenceEngine ABC.

  • Hardware-Aware


    Auto-detects GPU vendor, model, and VRAM. Recommends the optimal engine for your hardware.

  • Offline-First


    All core functionality works without a network connection. Cloud APIs are optional extras.

  • OpenAI-Compatible API


    jarvis serve starts a FastAPI server with SSE streaming. Drop-in replacement for OpenAI clients.

  • Trace-Driven Learning


    Every interaction is traced. The learning system improves models (SFT) and agents (prompt, tools, logic).


Documentation

  • Getting Started


    Install OpenJarvis, configure your first engine, and run your first query.

  • User Guide


    CLI, Python SDK, agents, memory, tools, telemetry, and benchmarks.

  • Architecture


    Five-primitive design, registry pattern, query flow, and cross-cutting learning.

  • API Reference


    Auto-generated reference for every module.

  • Deployment


    Docker, systemd, launchd. GPU-accelerated container images.

  • Development


    Contributing guide, extension patterns, roadmap, and changelog.

Sponsors

Laude InstituteStanford MarloweGoogle Cloud PlatformLambda LabsOllamaIBM ResearchStanford HAI