Roadmap

Current Focus Areas

These are the areas where active development is happening and contributions are most impactful:

Post-training data — building datasets and training pipelines from execution traces to improve agent routing and tool selection
Multi-model orchestration pipelines — coordinating multiple models within a single query (e.g., small model for classification, large model for generation)
Energy-aware routing — using power consumption data from telemetry to optimize for energy efficiency alongside latency and quality
Plugin ecosystem — community-contributed engines, tools, and agents distributed as Python packages
Federated memory — memory backends that synchronize across devices
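
As a rough illustration of the energy-aware routing idea above, the sketch below scores candidate models by a weighted mix of quality, latency, and energy. The `Candidate` fields, the weights, and all numbers are hypothetical placeholders, not OpenJarvis APIs or measurements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical per-model estimates, e.g. averaged from past telemetry."""
    name: str
    quality: float     # 0..1, higher is better
    latency_s: float   # seconds per query, lower is better
    energy_j: float    # joules per query, lower is better


def pick_model(candidates, w_quality=1.0, w_latency=0.1, w_energy=0.01):
    """Return the candidate with the best weighted score.

    Quality is rewarded; latency and energy are penalised. The default
    weights are illustrative and would need tuning (or learning) in practice.
    """
    def score(c):
        return w_quality * c.quality - w_latency * c.latency_s - w_energy * c.energy_j

    return max(candidates, key=score)


# Illustrative numbers only, not measurements.
small = Candidate("small-local", quality=0.70, latency_s=0.5, energy_j=5.0)
large = Candidate("large-local", quality=0.90, latency_s=3.0, energy_j=60.0)

print(pick_model([small, large]).name)
print(pick_model([small, large], w_latency=0.0, w_energy=0.0).name)
```

With the default weights the cheaper model wins; setting the latency and energy weights to zero reduces the router to picking on quality alone.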

How to Get Involved

Browse the workstreams below for an item that interests you
Check if a GitHub issue already exists for it — if not, open one
Comment "take" on the issue to get auto-assigned
Read the Contributing Guide for development setup and PR process

Workstreams

OpenJarvis development is organized into five independent workstreams. Contributors can pick any track that matches their skills and interests — workstreams are designed to be worked on in parallel without blocking each other.

Every item carries a maturity tag:

Tag	Meaning	Contributor guidance
Ready	Well-scoped, implementation path is clear	Pick it up — check issues for a spec or write one
Design Needed	Concept is clear but needs a spec before code	Start a design discussion or draft an RFC
Research-Stage	Exploratory, needs investigation before designing	Read the relevant papers, prototype, share findings

Workstream 1: Continuous Operators & Agents

Operators are OpenJarvis's key differentiator — persistent, scheduled, stateful agents that run autonomously on personal devices. The current tick-based architecture (OperatorManager → TaskScheduler → AgentExecutor → OperativeAgent) is solid but needs hardening for truly long-horizon autonomy.

Where you can help

Item	Maturity	Details
Operator health checks & heartbeat monitoring	Ready	Add liveness probes to OperatorManager; surface in `jarvis operators status`. Detect stalled operators beyond the existing reconciliation loop.
Metrics collection for operator manifests	Ready	The `metrics` field exists in `OperatorManifest` but is not collected. Wire it to telemetry. Good first issue.
Capability policy enforcement	Ready	`required_capabilities` field exists in manifests but is not enforced. Connect to the existing RBAC `CapabilityPolicy` system. Good first issue.
Rate limiting per operator	Ready	Prevent runaway operators from hammering inference. Add configurable rate limits to OperatorManager.
Operator composition / chaining	Design Needed	Express dependencies between operators (operator A feeds results to operator B). Requires design for data passing and scheduling semantics.
Event-driven operators	Design Needed	Operators that trigger on EventBus events (e.g., new file indexed, channel message received) rather than only cron/interval schedules.
Operator versioning & rollback	Design Needed	Run v2 of an operator alongside v1. Roll back automatically on repeated failures.
Self-improving operators via Learning	Research-Stage	Operators that use trace feedback to tune their own prompts, tool selection, and routing policies through the Learning primitive.
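
One possible shape for the per-operator rate limiting item is a token bucket per operator id. The sketch below is a generic token bucket, not the OpenJarvis implementation; `OperatorRateLimiter` and its parameters are hypothetical names, and how it would hook into OperatorManager is left open.

```python
import time


class TokenBucket:
    """Allow `rate` calls per second on average, with bursts up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self._clock = clock
        self._tokens = float(capacity)
        self._last = clock()

    def allow(self):
        """Refill according to elapsed time, then try to spend one token."""
        now = self._clock()
        self._tokens = min(self.capacity, self._tokens + (now - self._last) * self.rate)
        self._last = now
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return True
        return False


class OperatorRateLimiter:
    """Keeps one bucket per operator id (hypothetical helper class)."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self._rate = rate
        self._capacity = capacity
        self._clock = clock
        self._buckets = {}

    def allow(self, operator_id):
        """Return True if this operator may issue another inference call now."""
        if operator_id not in self._buckets:
            self._buckets[operator_id] = TokenBucket(self._rate, self._capacity, self._clock)
        return self._buckets[operator_id].allow()


limiter = OperatorRateLimiter(rate=2.0, capacity=5)
if limiter.allow("research-operator"):
    print("inference call permitted")
```

The injectable `clock` keeps the limiter testable without sleeping. A runaway operator exhausts only its own bucket, so other operators are unaffected.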

Workstream 2: Mobile & Messaging Clients

Personal AI must be accessible from the devices people actually carry. OpenJarvis runs on laptops, workstations, and servers, while users interact with it from their phones.

Currently supported:

iMessage + SMS via SendBlue — bidirectional, auto-detects iMessage vs SMS, thread replies, progress updates
Slack via Socket Mode (slack-bolt) — bidirectional DMs, thread replies, Slack formatting, progress updates
Desktop/Browser — Interact tab with real-time streaming, tool progress, telemetry footer

Where you can help

Item	Maturity	Details
WhatsApp via Meta Cloud API	Design Needed	WhatsApp blocks the unofficial Baileys protocol (405 errors), so this integration needs to use the official Meta WhatsApp Business API. Requires Meta Business account registration.
WhatsApp via Baileys (workaround)	Blocked	WhatsApp is actively blocking unofficial Baileys connections (405 Method Not Allowed). Monitor the Baileys repo for protocol updates.
Slack rich messages (Block Kit)	Ready	Current Slack responses use mrkdwn formatting. Add Block Kit support for structured responses with buttons, sections, and attachments. Good first issue.
Unified notification system	Design Needed	Push notifications when operators complete tasks or need user attention. Requires per-channel notification adapters.
Signal bidirectional	Design Needed	Currently send-only via signal-cli REST API. Add incoming message listener with background polling.
Voice interface	Research-Stage	Speech-to-text (Whisper) → agent → text-to-speech loop over phone channels. Existing `speech/` module provides a foundation.
Auto-restore channels on restart	Ready	The Slack daemon and SendBlue channel already auto-restore from saved bindings on server restart. This needs to be made more robust for edge cases.
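
For the Block Kit item, a structured Slack response is a list of block dictionaries. The sketch below builds a header, a mrkdwn section, and a row of buttons. The helper name `build_result_blocks` and the action ids are hypothetical, and wiring the result into the existing Slack channel code is not shown.

```python
import json


def build_result_blocks(title, body_mrkdwn, actions):
    """Build a Block Kit `blocks` list: header, body section, optional buttons.

    `actions` is a list of (label, action_id) pairs. The action ids are
    placeholders; each one would need a registered handler in a real app.
    """
    blocks = [
        {"type": "header", "text": {"type": "plain_text", "text": title}},
        {"type": "section", "text": {"type": "mrkdwn", "text": body_mrkdwn}},
    ]
    if actions:
        blocks.append({
            "type": "actions",
            "elements": [
                {
                    "type": "button",
                    "text": {"type": "plain_text", "text": label},
                    "action_id": action_id,
                }
                for label, action_id in actions
            ],
        })
    return blocks


blocks = build_result_blocks(
    "Daily research digest",
    "*3 new papers* matched your filters.",
    [("Show details", "digest_details"), ("Dismiss", "digest_dismiss")],
)
print(json.dumps(blocks, indent=2))
```

A list like this can be passed as the `blocks` argument of Slack's `chat.postMessage`, alongside a plain `text` fallback for notifications.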

Workstream 3: Secure Cloud Collaboration

Personal AI's core tension: local models preserve privacy but lack capability; cloud models are powerful but require trusting a provider with your data. This workstream aims to resolve that tension through Minions-style collaborative inference (the local model handles context, the cloud model handles reasoning) and TEE-based confidential computing (the cloud provider cannot see your data even during inference).
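
The "local handles context, cloud handles reasoning" split can be sketched as a two-stage function. This is a simplified illustration under stated assumptions, not the Minions protocol itself: `local_extract` and `cloud_reason` are hypothetical callables standing in for a local and a cloud engine, and the stub models exist only so the example runs.

```python
def chunk_text(text, max_chars):
    """Split text into consecutive pieces of at most `max_chars` characters."""
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]


def sequential_minion(question, context, local_extract, cloud_reason, max_chars=2000):
    """Condense the context locally; send only the condensed notes to the cloud.

    local_extract(question, chunk) -> str   (hypothetical local model call)
    cloud_reason(question, notes) -> str    (hypothetical cloud model call)
    """
    notes = [local_extract(question, chunk) for chunk in chunk_text(context, max_chars)]
    notes = [n for n in notes if n.strip()]  # drop chunks with nothing relevant
    return cloud_reason(question, "\n".join(notes))


# Stub "models" so the sketch runs without any engine installed.
def fake_local(question, chunk):
    return chunk.strip() if "invoice" in chunk else ""


def fake_cloud(question, notes):
    return f"Answer based on {len(notes.splitlines())} note(s)"


doc = "lunch plans.".ljust(20) + "invoice #12 due.".ljust(20) + "weather is nice.".ljust(20)
print(sequential_minion("What is due?", doc, fake_local, fake_cloud, max_chars=20))
```

Only the extracted notes cross the local/cloud boundary, which is what makes the approach attractive for long private documents.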

Where you can help

Item	Maturity	Details
Query complexity analyzer	Ready	Classify incoming queries by difficulty to decide local vs. cloud routing. Extends the existing `MultiEngine` routing logic.
Cost tracking per-query	Ready	`CloudEngine` already has pricing data. Surface per-query cost in traces and telemetry dashboards. Good first issue.
Redaction-before-cloud pipeline	Ready	Wire the existing `GuardrailsEngine` in REDACT mode as a mandatory pre-step before any cloud transmission.
Minion protocol (sequential)	Design Needed	Local model extracts and summarizes long context → cloud model reasons over the compressed result. Native reimplementation of the core Minions idea.
Minion protocol (parallel)	Design Needed	Local and cloud models work simultaneously on different aspects of a query; results are merged. Requires a new `HybridInferenceEngine` abstraction.
TEE attestation verification	Design Needed	Verify that cloud inference ran inside a trusted execution environment via cryptographic attestation.
Taint tracking across local/cloud boundary	Design Needed	The `TaintSet` already tracks PII/Secret labels. Add routing enforcement so tainted data only routes to attested TEE endpoints.
Speculative decoding (local draft + cloud verify)	Research-Stage	Local model generates candidate tokens; cloud model validates in parallel for latency reduction.
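
The taint-tracking item amounts to a routing rule: untainted data may go anywhere, while tainted data stays local or goes only to attested TEE endpoints. The sketch below illustrates that rule. The `Taint` enum and `Endpoint` dataclass are hypothetical stand-ins and do not reflect the real `TaintSet` API.

```python
from dataclasses import dataclass
from enum import Enum


class Taint(Enum):
    """Hypothetical taint labels mirroring the PII/Secret categories."""
    PII = "pii"
    SECRET = "secret"


@dataclass(frozen=True)
class Endpoint:
    """Hypothetical endpoint descriptor; `attested` means TEE attestation passed."""
    name: str
    is_local: bool
    attested: bool = False


def allowed_endpoints(taints, endpoints):
    """Filter endpoints according to the taint labels attached to the data."""
    if not taints:
        return list(endpoints)
    return [e for e in endpoints if e.is_local or e.attested]


endpoints = [
    Endpoint("laptop", is_local=True),
    Endpoint("cloud-plain", is_local=False),
    Endpoint("cloud-tee", is_local=False, attested=True),
]
print([e.name for e in allowed_endpoints({Taint.PII}, endpoints)])  # ['laptop', 'cloud-tee']
```

Keeping the rule as a pure filter makes it easy to apply before any engine selection logic runs.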

Workstream 4: Tutorials & Documentation

OpenJarvis has reference docs and four tutorials, but critical gaps remain in continuous agents, LM evaluation, learning approaches, and custom tools. Video tutorials are scoped as a contributor opportunity — written tutorials come first, with video scripts included so anyone can record.

Where you can help

Item	Maturity	Details
"Building Continuous Agents" tutorial	Ready	Writing an operator TOML manifest, activating it, session persistence across ticks, daemon mode. Example: a research operator that monitors arxiv daily.
"Adding Custom Tools" tutorial	Ready	Implementing `BaseTool`, registering via `ToolRegistry`, wiring into agents. Example: a weather API tool. Good first issue.
"Testing & Comparing LMs" tutorial	Ready	Running benchmarks, comparing local vs. cloud models, interpreting telemetry (latency, cost, energy per token). Uses the existing `bench/` framework.
Per-platform installation guides	Ready	Expand `installation.md` with platform-specific walkthroughs: macOS + Ollama, Ubuntu + NVIDIA + vLLM, Windows + Ollama, Raspberry Pi. Good first issue.
"Learning & Model Selection" tutorial	Design Needed	Router policies (heuristic, learned, GRPO), proposed approaches like Thompson Sampling, trace-based reward signals.
Video tutorial infrastructure	Design Needed	Establish recording workflow, hosting (YouTube), MkDocs embedding. Write video scripts alongside written tutorials.
Interactive Jupyter notebook tutorials	Design Needed	Notebook versions of key tutorials for exploratory, cell-by-cell learning.
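
Since Thompson Sampling is mentioned as a proposed approach for model selection, a tutorial could start from a minimal Beta-Bernoulli version like the sketch below. `ThompsonRouter` is a hypothetical class name, and the binary success signal stands in for whatever reward a trace-based pipeline would produce.

```python
import random


class ThompsonRouter:
    """Beta-Bernoulli Thompson Sampling over a fixed set of model names."""

    def __init__(self, models, seed=None):
        self._rng = random.Random(seed)
        # Beta(1, 1) prior per model, stored as [alpha, beta].
        self._params = {m: [1.0, 1.0] for m in models}

    def choose(self):
        """Sample a plausible success rate per model and pick the highest."""
        samples = {m: self._rng.betavariate(a, b) for m, (a, b) in self._params.items()}
        return max(samples, key=samples.get)

    def update(self, model, success):
        """Record a binary reward: alpha counts successes, beta counts failures."""
        self._params[model][0 if success else 1] += 1.0

    def mean(self, model):
        """Posterior mean success rate for a model."""
        a, b = self._params[model]
        return a / (a + b)


router = ThompsonRouter(["small-local", "large-cloud"], seed=0)
for _ in range(50):
    # Illustrative feedback only: one model always succeeds, the other never does.
    router.update("large-cloud", success=True)
    router.update("small-local", success=False)
print(router.choose(), round(router.mean("large-cloud"), 3))
```

Early on the sampler explores both models because the posteriors overlap; as feedback accumulates it concentrates on the model with the higher observed success rate.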

Workstream 5: Hardware Breadth

Personal AI means running on the hardware people actually own. Each new hardware target expands who can use OpenJarvis and generates data for the research agenda (energy, cost, latency tradeoffs across silicon).

Adding a new hardware target involves up to four components: hardware detection in `core/config.py`, an inference engine adapter in `engine/`, an energy monitor in `telemetry/`, and an entry in the GPU specs database in `telemetry/gpu_monitor.py`.

Where you can help

Item	Maturity	Details
AMD Ryzen AI iGPU path	Ready	Strix Point RDNA 3.5 iGPU handles 7-8B models via Vulkan. The llama.cpp Vulkan backend works today. Needs hardware detection and an energy monitor. Good first issue.
GPU specs database expansion	Ready	Add Intel Arc, Jetson Orin, Snapdragon specs to `GPU_SPECS` in `telemetry/gpu_monitor.py` (TFLOPS, bandwidth, TDP). Good first issue.
Intel Arc GPU (B580/B570)	Design Needed	12GB VRAM on the B580 (10GB on the B570), ~$250 consumer GPU. Viable for 7-8B models. Engine path: IPEX-LLM or llama.cpp SYCL backend.
NVIDIA Jetson Orin	Design Needed	Best-in-class edge device. Orin NX 16GB handles 7-8B models at 15-25 tok/s. Needs hardware detection, energy monitor (tegrastats), deployment guide.
Qualcomm Snapdragon X Elite NPU	Design Needed	45 TOPS, Windows Arm laptops. ONNX Runtime + QNN Execution Provider is the viable path.
Intel Lunar Lake NPU via OpenVINO	Design Needed	48 TOPS — most mature NPU software stack for x86 laptops. New engine wrapping OpenVINO GenAI.
Raspberry Pi 5	Design Needed	CPU-only via llama.cpp ARM NEON for 1-3B models. $100 entry point for hobbyists.
Unified hardware benchmark suite	Design Needed	Standardized benchmark that runs the same workloads across all supported hardware, producing comparable energy/latency/throughput/cost numbers.
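
To make numbers comparable across hardware, a benchmark suite needs to normalise raw measurements into per-token metrics. The sketch below shows the arithmetic for throughput and energy per token. The `RunResult` fields are hypothetical and the figures are made-up examples, not benchmark results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunResult:
    """Raw measurements from one benchmark run (field names are hypothetical)."""
    device: str
    tokens_generated: int
    wall_time_s: float
    avg_power_w: float


def tokens_per_second(r):
    """Throughput: tokens produced divided by wall-clock time."""
    return r.tokens_generated / r.wall_time_s


def joules_per_token(r):
    """Energy (J) = average power (W) * time (s), divided by tokens produced."""
    return r.avg_power_w * r.wall_time_s / r.tokens_generated


# Made-up figures for illustration only.
runs = [
    RunResult("device-a", tokens_generated=1000, wall_time_s=50.0, avg_power_w=20.0),
    RunResult("device-b", tokens_generated=1000, wall_time_s=10.0, avg_power_w=200.0),
]
for r in sorted(runs, key=joules_per_token):
    print(f"{r.device}: {tokens_per_second(r):.1f} tok/s, {joules_per_token(r):.2f} J/token")
```

In this example the slower device is the more energy-efficient one (1.00 J/token versus 2.00 J/token), which is the kind of tradeoff a unified suite is meant to surface.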