redaction
PII redaction for analytics property values.
This is the value-level half of the guardrail
(openjarvis.analytics.events is the structural half).
For each property value we ship, we:
1. Drop strings longer than MAX_STR_LEN (no chunks of chat content).
2. Drop strings that match any known PII pattern (emails, IPs, MACs,
$HOME paths, API keys, JWTs, bearer tokens, etc.).
3. Otherwise pass through unchanged.
Fail-closed: when in doubt, drop. Combined with the event-spec allowlist, this gives two independent layers of protection.
Functions
looks_like_pii
Return True if any PII pattern matches anywhere in s.
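As a rough illustration of "matches anywhere in s", the sketch below uses `re.search` (not `re.match`) over a small list of compiled patterns. The pattern list `_PII_PATTERNS` and the individual regexes are assumptions made for this example; the module's actual patterns are not shown on this page and likely cover more cases (API keys, JWTs, $HOME paths).

```python
import re

# Hypothetical pattern set, for illustration only. The real module's
# patterns are not reproduced here.
_PII_PATTERNS = [
    re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),  # email address
    re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),                      # IPv4-shaped string
    re.compile(r"\b(?:[0-9A-Fa-f]{2}[:-]){5}[0-9A-Fa-f]{2}\b"),      # MAC address
    re.compile(r"\bBearer\s+[A-Za-z0-9._~+/=-]+", re.IGNORECASE),    # bearer token
]


def looks_like_pii(s: str) -> bool:
    """Return True if any pattern matches anywhere in s."""
    # search() scans the whole string, so PII embedded in a longer
    # value is still caught.
    return any(p.search(s) for p in _PII_PATTERNS)
```

Because the check is fail-closed, over-matching is acceptable in a sketch like this: a dotted-quad version string would be treated as an IP address and dropped.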
redact
Return a copy of properties with PII-bearing string values dropped.
Non-string values pass through unchanged. Strings exceeding
MAX_STR_LEN are dropped. Strings matching any PII pattern are
dropped. The event-spec validator (in the events module) runs after
this and provides a second layer of structural enforcement.
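A minimal sketch of the behaviour described above, under stated assumptions: `MAX_STR_LEN = 64` is a placeholder (the real limit is not given here), `_looks_like_pii` is a stand-in that checks only for email-shaped strings, and "dropped" is taken to mean the key is omitted from the returned dict.

```python
import re
from typing import Any, Dict

MAX_STR_LEN = 64  # assumed value; the real limit is not stated on this page

# Stand-in PII check (emails only), hypothetical and deliberately narrow.
_EMAIL = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def _looks_like_pii(s: str) -> bool:
    return _EMAIL.search(s) is not None


def redact(properties: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of properties with risky string values dropped."""
    clean: Dict[str, Any] = {}
    for key, value in properties.items():
        if isinstance(value, str):
            # Fail-closed: over-long or PII-looking strings are omitted.
            if len(value) > MAX_STR_LEN or _looks_like_pii(value):
                continue
        # Non-string values and safe strings pass through unchanged.
        clean[key] = value
    return clean


props = {"engine": "local", "email": "a@b.co", "count": 3, "blob": "x" * 100}
safe = redact(props)  # the input dict is left untouched
```

In this sketch only `engine` and `count` survive: `email` matches the stand-in pattern and `blob` exceeds the assumed length limit.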
Source code in src/openjarvis/analytics/redaction.py
hash_id
Return a 16-character prefix of the sha256 digest of s.
Used for model, tool, and connector names that aren't on the public allowlist: we still want to see "uses-a-custom-model-X" cohorting without ever learning which model.
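One plausible reading of this description, sketched with the standard library. The UTF-8 encoding, the use of the hex digest, and the absence of a salt are all assumptions; the page only states that the result is a 16-character sha256 prefix.

```python
import hashlib


def hash_id(s: str) -> str:
    """Return the first 16 hex characters of sha256(s)."""
    # Assumption: the input is encoded as UTF-8 and the hex digest is used.
    return hashlib.sha256(s.encode("utf-8")).hexdigest()[:16]


# The same name always maps to the same prefix, which is what makes
# cohorting possible without shipping the name itself.
a = hash_id("my-custom-model")
b = hash_id("my-custom-model")
c = hash_id("another-model")
```

Here `a` and `b` are identical and `c` differs, so events from the same custom model group together while the name never leaves the machine.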