Security¶

OpenJarvis includes a security layer that scans prompts and model outputs for secrets, personally identifiable information (PII), and sensitive file paths. The system is designed to be composable: scanners run as a pipeline, and the GuardrailsEngine wrapper drops in front of any inference backend without changing how the rest of your code works.

Overview¶

The security module has four independently usable components:

GuardrailsEngine

Wraps any InferenceEngine with pre- and post-call scanning. Supports WARN, REDACT, and BLOCK modes.

Jump to GuardrailsEngine
SecretScanner

Detects API keys, tokens, passwords, and connection strings in text.

Jump to SecretScanner
PIIScanner

Detects email addresses, SSNs, credit card numbers, phone numbers, and public IPs.

Jump to PIIScanner
File Policy

Blocks access to .env, *.pem, id_rsa, and other credential files.

Jump to File Policy

GuardrailsEngine¶

GuardrailsEngine wraps any InferenceEngine and scans both the input messages and the output content. It is not registered in EngineRegistry — you create it directly by wrapping an existing engine instance.

Modes¶

Mode	Constant	Behavior
Warn	`RedactionMode.WARN`	Publish a `SECURITY_ALERT` event but pass the text through unchanged. Default.
Redact	`RedactionMode.REDACT`	Replace matches with `[REDACTED:pattern_name]` before passing to/from the engine.
Block	`RedactionMode.BLOCK`	Raise `SecurityBlockError` immediately when findings are detected.

Basic Usage¶

Warn mode (default)Redact modeBlock mode

warn_mode.py

from openjarvis.engine.ollama import OllamaEngine
from openjarvis.security.guardrails import GuardrailsEngine
from openjarvis.security.types import RedactionMode
from openjarvis.core.types import Message, Role

engine = OllamaEngine()
guarded = GuardrailsEngine(engine)  # (1)!

messages = [Message(role=Role.USER, content="My API key is sk-abc123xyz")]
response = guarded.generate(messages, model="qwen3:8b")
# The key is logged as a warning but the text is passed unchanged
print(response["content"])

Defaults to mode=RedactionMode.WARN, scan_input=True, scan_output=True.

redact_mode.py

from openjarvis.engine.ollama import OllamaEngine
from openjarvis.security.guardrails import GuardrailsEngine
from openjarvis.security.types import RedactionMode
from openjarvis.core.types import Message, Role

engine = OllamaEngine()
guarded = GuardrailsEngine(engine, mode=RedactionMode.REDACT)  # (1)!

messages = [Message(role=Role.USER, content="My key is sk-abc123xyz, help me debug")]
response = guarded.generate(messages, model="qwen3:8b")
# Input sent to engine: "My key is [REDACTED:openai_key], help me debug"

Sensitive patterns in input messages are replaced before reaching the model.

block_mode.py

from openjarvis.engine.ollama import OllamaEngine
from openjarvis.security.guardrails import GuardrailsEngine, SecurityBlockError
from openjarvis.security.types import RedactionMode
from openjarvis.core.types import Message, Role

engine = OllamaEngine()
guarded = GuardrailsEngine(engine, mode=RedactionMode.BLOCK)

try:
    messages = [Message(role=Role.USER, content="AKIA1234567890ABCDEF")]
    guarded.generate(messages, model="qwen3:8b")
except SecurityBlockError as exc:
    print(f"Blocked: {exc}")
    # Blocked: Security scan blocked input: 1 finding(s) detected

Constructor Parameters¶

Parameter	Type	Default	Description
`engine`	`InferenceEngine`	—	The wrapped inference engine
`scanners`	`list[BaseScanner]`	`[SecretScanner(), PIIScanner()]`	Scanners to run
`mode`	`RedactionMode`	`WARN`	Action on findings
`scan_input`	`bool`	`True`	Scan input messages
`scan_output`	`bool`	`True`	Scan output content
`bus`	`EventBus`	`None`	Event bus for security events

Event Bus Integration¶

When a bus is provided, GuardrailsEngine publishes events on every scan result:

Event	When
`SECURITY_ALERT`	Findings detected in WARN or REDACT mode
`SECURITY_BLOCK`	Findings detected in BLOCK mode

You can subscribe to these events with an AuditLogger to build a persistent security event log. See Audit Logger below.

Custom Scanners¶

You can pass any set of BaseScanner subclasses to restrict or extend scanning:

custom_scanners.py

from openjarvis.security.guardrails import GuardrailsEngine
from openjarvis.security.scanner import SecretScanner
from openjarvis.security.types import RedactionMode

# Only scan for secrets, skip PII
guarded = GuardrailsEngine(
    engine,
    scanners=[SecretScanner()],
    mode=RedactionMode.REDACT,
)

Streaming¶

For streaming calls, GuardrailsEngine.stream() yields tokens in real time and then performs a post-hoc scan on the accumulated output for logging. Because tokens are already delivered to the caller before scanning completes, BLOCK mode only applies to the input side during streaming.

Streaming and BLOCK mode

SecurityBlockError can only be raised before the stream starts (for input scanning). Output blocking during streaming is not possible — use REDACT mode if you need to sanitize model outputs in streaming scenarios.

SecretScanner¶

SecretScanner detects API keys, tokens, passwords, and other credentials using regex patterns. Each pattern has an associated ThreatLevel.

Pattern Reference¶

Pattern Name	Threat Level	Matches
`openai_key`	CRITICAL	`sk-` followed by 20+ alphanumeric chars
`anthropic_key`	CRITICAL	`sk-ant-` followed by 20+ chars
`aws_access_key`	CRITICAL	`AKIA` followed by 16 uppercase alphanumeric chars
`github_token`	CRITICAL	`ghp_`, `gho_`, `ghs_`, `ghr_`, `github_pat_` followed by 36+ chars
`stripe_key`	CRITICAL	`sk_live_`, `sk_test_`, `pk_live_`, `pk_test_` followed by 20+ chars
`private_key`	CRITICAL	PEM private key header `-----BEGIN PRIVATE KEY-----`
`password_assignment`	HIGH	`password = "..."`, `passwd: "..."`, etc.
`db_connection_string`	HIGH	`postgres://`, `mysql://`, `mongodb://`, `redis://` URLs
`slack_token`	HIGH	`xoxb-`, `xoxp-`, `xoxo-`, `xoxr-`, `xoxs-` followed by token
`generic_api_key`	HIGH	`api_key = "..."`, `secret_key = "..."`, `auth_token = "..."`

Direct Usage¶

secret_scanner.py

from openjarvis.security.scanner import SecretScanner

scanner = SecretScanner()

# Scan text
result = scanner.scan("My key is sk-abc123xyz789 and it is secret")
print(result.clean)           # False
print(result.highest_threat)  # ThreatLevel.CRITICAL
for finding in result.findings:
    print(f"  {finding.pattern_name}: {finding.description} at [{finding.start}:{finding.end}]")

# Redact text
clean = scanner.redact("Token: sk-abc123xyz789")
print(clean)  # Token: [REDACTED:openai_key]

PIIScanner¶

PIIScanner detects personally identifiable information using regex patterns calibrated for common US formats.

Pattern Reference¶

Pattern Name	Threat Level	Matches
`us_ssn`	CRITICAL	`XXX-XX-XXXX` format Social Security Numbers
`credit_card_visa`	CRITICAL	Visa card numbers (16 digits starting with 4)
`credit_card_mastercard`	CRITICAL	Mastercard numbers (16 digits starting with 51–55)
`credit_card_amex`	CRITICAL	Amex numbers (15 digits starting with 34 or 37)
`email`	MEDIUM	Standard email addresses
`us_phone`	MEDIUM	US phone numbers in common formats
`ipv4_public`	LOW	Public IPv4 addresses (excludes RFC1918 ranges)

Private IP addresses

The ipv4_public pattern intentionally excludes private ranges (10.x.x.x, 172.16–31.x.x, 192.168.x.x, 127.x.x.x). Internal IP addresses are not considered sensitive by default.

Direct Usage¶

pii_scanner.py

from openjarvis.security.scanner import PIIScanner

scanner = PIIScanner()

text = "Contact john@example.com or call 555-867-5309"
result = scanner.scan(text)

for finding in result.findings:
    print(f"{finding.pattern_name}: threat={finding.threat_level.value}")
# email: threat=medium
# us_phone: threat=medium

clean = scanner.redact(text)
print(clean)
# Contact [REDACTED:email] or call [REDACTED:us_phone]

File Policy¶

The file policy module prevents access to credential and key files. It is used internally by FileReadTool and the memory ingest path, but you can use it directly.

Sensitive File Patterns¶

The DEFAULT_SENSITIVE_PATTERNS frozenset contains the following glob patterns:

Pattern	Description
`.env`, `.env.`, `.env`	Environment variable files
`.secret`, `*.secrets`	Generic secret files
`credentials.*`	Credential files
`.pem`, `.key`	TLS/SSL certificates and private keys
`.p12`, `.pfx`, `*.jks`	PKCS and Java keystore files
`id_rsa`, `id_ed25519`	SSH private key files
`.htpasswd`	Apache password files
`.pgpass`	PostgreSQL password files
`.netrc`	FTP/SSH credential files

Usage¶

file_policy.py

from pathlib import Path
from openjarvis.security.file_policy import is_sensitive_file, filter_sensitive_paths

# Check a single file
print(is_sensitive_file(".env"))           # True
print(is_sensitive_file("server.key"))     # True
print(is_sensitive_file("README.md"))      # False

# Filter a list of paths
paths = [
    Path("README.md"),
    Path(".env"),
    Path("src/main.py"),
    Path("server.pem"),
]
safe = filter_sensitive_paths(paths)
print(safe)  # [PosixPath('README.md'), PosixPath('src/main.py')]

Integration with FileReadTool¶

The built-in FileReadTool automatically calls is_sensitive_file() before reading any path. Attempts to read sensitive files raise an error rather than returning the file content. This behavior cannot be disabled at the tool level — configure the agent not to have FileReadTool if you need unrestricted file access.

Audit Logger¶

The AuditLogger persists security events to an append-only SQLite database. It can subscribe to the event bus to capture events automatically, or you can call log() manually.

Event Bus Integration (Automatic)¶

audit_bus.py

from openjarvis.core.events import EventBus
from openjarvis.security.audit import AuditLogger
from openjarvis.security.guardrails import GuardrailsEngine
from openjarvis.security.types import RedactionMode
from openjarvis.engine.ollama import OllamaEngine

bus = EventBus()

# AuditLogger subscribes to SECURITY_SCAN, SECURITY_ALERT, SECURITY_BLOCK
audit = AuditLogger(db_path="~/.openjarvis/audit.db", bus=bus)

engine = OllamaEngine()
guarded = GuardrailsEngine(
    engine,
    mode=RedactionMode.WARN,
    bus=bus,
)

# Security events are now persisted automatically

Manual Logging¶

audit_manual.py

import time
from openjarvis.security.audit import AuditLogger
from openjarvis.security.types import SecurityEvent, SecurityEventType

audit = AuditLogger(db_path="./audit.db")

event = SecurityEvent(
    event_type=SecurityEventType.SECRET_DETECTED,
    timestamp=time.time(),
    findings=[],
    content_preview="sk-...",
    action_taken="redacted",
)
audit.log(event)

Querying the Audit Log¶

audit_query.py

from openjarvis.security.audit import AuditLogger

audit = AuditLogger(db_path="~/.openjarvis/audit.db")

# Recent events
events = audit.query(limit=20)

# Filter by event type
secret_events = audit.query(event_type="secret_detected")

# Filter by time range
import time
recent = audit.query(since=time.time() - 3600)  # last hour

# Count total events
print(f"Total events: {audit.count()}")

audit.close()

Configuration¶

Security settings live in the [security] section of ~/.openjarvis/config.toml.

~/.openjarvis/config.toml

[security]
enabled = true
scan_input = true
scan_output = true
mode = "warn"               # "warn" | "redact" | "block"
secret_scanner = true
pii_scanner = true
audit_log_path = "~/.openjarvis/audit.db"
enforce_tool_confirmation = true

Configuration Reference¶

Key	Type	Default	Description
`enabled`	`bool`	`true`	Enable the security subsystem
`scan_input`	`bool`	`true`	Scan user input messages
`scan_output`	`bool`	`true`	Scan model output content
`mode`	`str`	`"warn"`	Action on findings: `warn`, `redact`, or `block`
`secret_scanner`	`bool`	`true`	Run `SecretScanner` on all text
`pii_scanner`	`bool`	`true`	Run `PIIScanner` on all text
`audit_log_path`	`str`	`~/.openjarvis/audit.db`	Path to the SQLite audit log
`enforce_tool_confirmation`	`bool`	`true`	Require explicit confirmation before tool execution

Start with warn, tighten later

mode = "warn" is a good starting point. It lets you observe what patterns are being triggered without disrupting normal usage. Switch to "redact" once you are satisfied that the scanner isn't producing too many false positives for your workload.

Writing a Custom Scanner¶

Implement BaseScanner and pass an instance to GuardrailsEngine:

custom_scanner.py

import re
from openjarvis.security._stubs import BaseScanner
from openjarvis.security.types import ScanFinding, ScanResult, ThreatLevel


class InternalUrlScanner(BaseScanner):
    """Detect internal service URLs that should not be shared externally."""

    scanner_id = "internal_urls"

    PATTERN = re.compile(r"https?://internal\.[a-z0-9.-]+\.[a-z]{2,}")

    def scan(self, text: str) -> ScanResult:
        findings = []
        for match in self.PATTERN.finditer(text):
            findings.append(ScanFinding(
                pattern_name="internal_url",
                matched_text=match.group(0),
                threat_level=ThreatLevel.MEDIUM,
                start=match.start(),
                end=match.end(),
                description="Internal service URL",
            ))
        return ScanResult(findings=findings)

    def redact(self, text: str) -> str:
        return self.PATTERN.sub("[REDACTED:internal_url]", text)


# Use with GuardrailsEngine
from openjarvis.security.guardrails import GuardrailsEngine
from openjarvis.security.types import RedactionMode

guarded = GuardrailsEngine(
    engine,
    scanners=[InternalUrlScanner()],
    mode=RedactionMode.REDACT,
)

Security¶

Overview¶

GuardrailsEngine¶

Modes¶

Basic Usage¶

Constructor Parameters¶

Event Bus Integration¶

Custom Scanners¶

Streaming¶

SecretScanner¶

Pattern Reference¶

Direct Usage¶

PIIScanner¶

Pattern Reference¶

Direct Usage¶

File Policy¶

Sensitive File Patterns¶

Usage¶

Integration with FileReadTool¶

Audit Logger¶

Event Bus Integration (Automatic)¶

Manual Logging¶

Querying the Audit Log¶

Configuration¶

Configuration Reference¶

Writing a Custom Scanner¶

See Also¶