Personal AI, On Personal Devices¶
OpenJarvis is a research framework for composable, on-device AI systems. Build personal AI that runs on your hardware. Cloud APIs are optional.
Why OpenJarvis?¶
Personal AI agents are exploding in popularity, but nearly all of them still route intelligence through cloud APIs. Your "personal" AI continues to depend on someone else's server. At the same time, our Intelligence Per Watt research showed that local language models already handle 88.7% of single-turn chat and reasoning queries, with intelligence efficiency improving 5.3× from 2023 to 2025. The models and hardware are increasingly ready. What has been missing is the software stack to make local-first personal AI practical.
OpenJarvis is that stack. It is an opinionated framework for local-first personal AI, built around three core ideas: shared primitives for building on-device agents; evaluations that treat energy, FLOPs, latency, and dollar cost as first-class constraints alongside accuracy; and a learning loop that improves models using local trace data. The goal is simple: make it possible to build personal AI agents that run locally by default, calling the cloud only when truly necessary. OpenJarvis aims to be both a research platform and a production foundation for local AI, in the spirit of PyTorch.
Get Started¶
Run the full chat UI locally with one script:
This installs dependencies, starts Ollama + a local model, launches the backend
and frontend, and opens http://localhost:5173 in your browser.
The desktop app is a native window for the OpenJarvis UI. The backend (Ollama + inference) runs on your machine — start it first, then open the app.
Step 1. Start the backend:
Step 2. Download and open the desktop app:
Also available for Windows, Linux (DEB), and Linux (RPM). See the Downloads page for details.
The app connects to http://localhost:8000 automatically.
macOS: run xattr -cr /Applications/OpenJarvis.app if the app shows as \"damaged\".
Five Primitives for Personal AI¶
OpenJarvis is built around five composable layers. Each has a clean interface and can be swapped independently.
- Intelligence — Pick a model, or let OpenJarvis pick one for your hardware. Manages the full catalog of local models across providers.
- Agents — Multi-step reasoning with tool use. Seven built-in agent types from simple chat to orchestrated workflows.
- Tools — Web search, calculator, file I/O, code interpreter, retrieval, and any external MCP server.
- Engine — The inference runtime: Ollama, vLLM, SGLang, llama.cpp, cloud APIs, and more. Auto-detects your hardware and recommends the best fit.
- Learning — Your AI gets better over time. Every interaction generates traces that drive automatic improvements to model weights, prompts, and agent behavior.
Key Features¶
-
10+ Engine Backends
Ollama, vLLM, SGLang, llama.cpp, MLX, Exo, LiteLLM, cloud (OpenAI/Anthropic/Google), and more. Same
InferenceEngineinterface, swap freely. -
Automated Workflows
Cron-based agents that monitor, summarize, and act. Code review, email triage, research digests — running 24/7 on your hardware.
-
Hardware-Aware
Auto-detects GPU vendor, model, and VRAM. Recommends the optimal engine for your hardware.
-
Offline-First
All core functionality works without a network connection. Cloud APIs are optional extras.
-
OpenAI-Compatible API
jarvis servestarts a FastAPI server with SSE streaming. Drop-in replacement for OpenAI clients. -
Energy & Cost Tracking
Built-in telemetry for GPU power draw, token costs, and latency. See exactly what each query costs in watts and dollars.
Documentation¶
-
Install OpenJarvis, configure your first engine, and run your first query.
-
CLI, Python SDK, agents, memory, tools, telemetry, and benchmarks.
-
Five-primitive design, registry pattern, query flow, and cross-cutting learning.
-
Auto-generated reference for every module.
-
Docker, systemd, launchd. GPU-accelerated container images.
-
Contributing guide, extension patterns, roadmap, and changelog.
Research¶
OpenJarvis is part of Intelligence Per Watt, a research initiative studying the efficiency of on-device AI systems. Developed at Hazy Research and the Scaling Intelligence Lab at Stanford SAIL.
Read the blog post for the full research motivation, architecture details, and experimental results.
Citation¶
@misc{saadfalcon2026openjarvis,
title={OpenJarvis: Personal AI, On Personal Devices},
author={Jon Saad-Falcon and Avanika Narayan and Herumb Shandilya and Hakki Orhun Akengin and Robby Manihani and Gabriel Bo and John Hennessy and Christopher R\'{e} and Azalia Mirhoseini},
year={2026},
howpublished={\url{https://scalingintelligence.stanford.edu/blogs/openjarvis/}},
}
Sponsors¶
Laude Institute • Stanford Marlowe • Google Cloud Platform • Lambda Labs • Ollama • IBM Research • Stanford HAI