ARCA.VISION
// FEATURE — THE PERSONA SWITCHBOARD · PHASE 6

Intelligence is hot-swappable.

Why settle for one-size-fits-all security? The same proprietary eBPF engine that runs the Exfiltration Gate now hot-swaps its on-host SLM at the ioctl boundary — HIPAA, LATAM Health, DoD, Sovereign AI, Robotics, PII, Code Sentinel, or your own LoRA. Local SLM Governance with sub-2 ms context-aware latency. Zero downtime. Air-gap delivered.

// SECTION 01 · RACK · 8 BAYS · HOT-SWAP

The rack-mount.

Eight cartridges. One active at a time. Tap a bay to hot-swap the persona that scores every suspect ioctl on the host.

Most AI security tools ship a single, frozen detection model. Arca.Vision treats the SLM as a cartridge: the kernel boundary stays put, the on-host inference gate stays put, and the persona — the thing that knows what your sector counts as a leak — slides in or out. PHI signals are not PCI signals are not weapons-systems CUI. The persona changes; the engine does not.

$ arca persona swap --bay 03 --to dod-sentinel

Bays: 8 · one active at a time
Swap path: pointer flip · no probe detach
Downtime: 0 ms · ring buffers stay armed
Delivery: signed bundle · air-gap safe

// FIGURE · PERSONA LATENCY · p99 (ms) · all 8 cartridges plotted against the 2.0 ms context-aware budget · HIPAA at 1.2 ms, 40% headroom under 2.0 ms
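
The headroom figure on the gauge is plain arithmetic against the 2.0 ms budget. The sketch below reproduces it for the seven cartridges whose p99 is quoted in the catalog. The function name `headroom_pct` and the dictionary layout are illustrative assumptions, not part of the product; latencies are held in microseconds so the arithmetic stays in exact integers.

```python
# Illustrative only: names and structure are assumptions, the p99 values
# are the ones quoted in the cartridge catalog (converted to microseconds).
BUDGET_US = 2000  # 2.0 ms context-aware budget

P99_US = {
    "HIPAA Guardian": 1200,
    "LATAM Health Sentinel": 1400,
    "DoD Sentinel": 1600,
    "Sovereign AI Warden": 1300,
    "Robotics Safety Officer": 900,
    "PII Redactor": 1100,
    "Code Sentinel": 1500,
}


def headroom_pct(p99_us: int, budget_us: int = BUDGET_US) -> int:
    """Share of the budget left unused at p99, in whole percent (rounded down)."""
    return (budget_us - p99_us) * 100 // budget_us


if __name__ == "__main__":
    for name, p99 in P99_US.items():
        print(f"{name:<24} p99 {p99 / 1000:.1f} ms  headroom {headroom_pct(p99)}%")
```

On these figures the tightest cartridge is DoD Sentinel and the loosest is Robotics Safety Officer.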

// SECTION 02 · SUB-2 ms · p99

Context-aware latency.

Even with complex domain personas, the gate decision stays under 2 ms p99.

Every cartridge ships as a Q4_K_M GGUF (or a LoRA over a shared base model) sized to fit a single sys_enter_ioctl budget. Stage 1 — the kernel-side heuristic — runs in sub-microsecond time and only forwards qualifying ioctls to the persona. Stage 2 — the SLM verdict — is greedy-decoded on host, with the result back in the gate’s decision loop before the next ring-buffer slot lands.
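
The two-stage flow described above can be sketched in a few lines. This is a userspace illustration under stated assumptions, not the kernel-side implementation: `IoctlEvent`, `WATCHED_REQUESTS`, the request codes, and `stub_persona` are all hypothetical, and the stub stands in for the greedy-decoded SLM verdict.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class IoctlEvent:
    pid: int
    request: int    # ioctl request code
    payload: bytes  # bytes captured for scoring (hypothetical field)


# Stage 1 (assumed rule): forward only watched request codes that carry data.
WATCHED_REQUESTS = {0xC0DE0001, 0xC0DE0002}  # made-up codes for illustration


def stage1_qualifies(ev: IoctlEvent) -> bool:
    """Cheap filter that decides whether the persona is consulted at all."""
    return ev.request in WATCHED_REQUESTS and len(ev.payload) > 0


def stub_persona(ev: IoctlEvent) -> float:
    """Stand-in for the SLM: returns a leak score between 0.0 and 1.0."""
    return 1.0 if b"SSN" in ev.payload else 0.0


def gate(ev: IoctlEvent,
         persona: Callable[[IoctlEvent], float],
         threshold: float = 0.5) -> str:
    """Return 'allow' or 'block'. Stage 2 runs only if Stage 1 qualifies."""
    if not stage1_qualifies(ev):
        return "allow"
    return "block" if persona(ev) >= threshold else "allow"
```

The point of the split is that the expensive call sits behind the cheap one: an event that fails Stage 1 never reaches the persona.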

// SECTION 03 · UNDER THE HOOD

Three things that make hot-swap actually safe.

The Persona Switchboard isn't model-routing in userspace. It's a kernel-adjacent inference gate orchestrated by our Rust control plane, with strict memory isolation between observer and observed.

PROOF 01

Aya-orchestrated load · zero downtime

SLM weights are orchestrated by our Rust-based control plane and served to the local inference gate over a loopback channel. Hot-swap is a pointer flip, not a process restart — the eBPF probes never detach. We've validated swap latency below the 2 ms context-aware budget across all eight cartridges.
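
As a rough mental model of "a pointer flip, not a process restart", the sketch below keeps every cartridge resident and changes only which one the gate reads. It is a simplified illustration, not the Rust control plane: the `Switchboard` and `Persona` names are assumptions, and `hipaa-guardian` is a hypothetical slug modelled on the `dod-sentinel` identifier from the CLI example.

```python
import threading
from dataclasses import dataclass


@dataclass(frozen=True)
class Persona:
    name: str
    fmt: str  # "GGUF", "ExLlamaV2", or "LoRA"


class Switchboard:
    """Holds all bays in memory; a swap only rebinds the active reference."""

    def __init__(self, bays: dict, active_bay: int):
        self._bays = dict(bays)
        self._lock = threading.Lock()  # serialises writers only
        self._active = self._bays[active_bay]

    def active(self) -> Persona:
        # Readers take no lock: they see either the old or the new persona,
        # never a half-loaded one, because the swap is a single rebind.
        return self._active

    def swap(self, bay: int) -> Persona:
        """Make `bay` active and return the previously active persona."""
        with self._lock:
            previous = self._active
            self._active = self._bays[bay]  # the "pointer flip"
            return previous


if __name__ == "__main__":
    sb = Switchboard({1: Persona("hipaa-guardian", "GGUF"),
                      3: Persona("dod-sentinel", "LoRA")}, active_bay=1)
    sb.swap(3)
    print(sb.active().name)
```

Because the lookup happens before the rebind, a swap to an unknown bay raises `KeyError` and leaves the current persona in place.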

PROOF 02

Strict memory isolation · observer ≠ observed

Each persona runs in a strictly isolated memory space, separated from the workload the Sentry is governing. The observer cannot be compromised by the observed: the SLM lives behind the same kernel boundary as the rest of the gate, and a compromised CUDA process has no read/write path to the persona weights.

PROOF 03

Model agnostic · GGUF, ExLlamaV2, LoRA

Cartridges ship as signed bundles. Out of the box we support llama-cpp-2 (GGUF, the default), ExLlamaV2 for higher-throughput personas, and LoRA adapters that reuse a base model already loaded on the host. All inference is local: there is no cloud round-trip, and air-gap deployments are first-class.
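
A load path for signed bundles typically checks two things before any weights are touched: the format is one the host supports, and the signature matches the bytes on disk. The sketch below shows that shape only. The page does not say which signature scheme is used, so HMAC-SHA256 with a shared key is a stand-in chosen because it is in the Python standard library; `sign_bundle`, `verify_bundle`, and the manifest layout are hypothetical.

```python
import hashlib
import hmac

SUPPORTED_FORMATS = {"GGUF", "ExLlamaV2", "LoRA"}


def sign_bundle(key: bytes, name: str, fmt: str, weights: bytes) -> str:
    """Sign the persona name, format, and a SHA-256 digest of the weights."""
    digest = hashlib.sha256(weights).hexdigest()
    message = f"{name}\n{fmt}\n{digest}".encode()
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify_bundle(key: bytes, name: str, fmt: str,
                  weights: bytes, signature: str) -> bool:
    """Reject unsupported formats, then compare signatures in constant time."""
    if fmt not in SUPPORTED_FORMATS:
        return False
    expected = sign_bundle(key, name, fmt, weights)
    return hmac.compare_digest(expected, signature)


if __name__ == "__main__":
    key = b"demo-key"  # illustration only
    sig = sign_bundle(key, "pii-redactor", "GGUF", b"weights")
    print(verify_bundle(key, "pii-redactor", "GGUF", b"weights", sig))
```

Binding the format and name into the signed message means a bundle cannot be relabelled as a different persona without invalidating the signature.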

// SECTION 04 · CARTRIDGE CATALOG

Eight cartridges.
One eBPF kernel.

One cartridge per regulated sector, plus an empty slot you can fill with your own LoRA.

STANDBY · HIPAA
HIPAA Guardian
US Healthcare · PHI
Model: Phi-3 mini · Q4_K_M
p99 latency: 1.2 ms
HIPAA §164.312 | HITECH §13402 | 42 CFR Part 2 | FDA · SaMD

Tuned for PHI patterns, ICD-10 codes, NPI numbers, and EHR exfil signatures. Catches PHI smuggling at the cudaMemcpyDeviceToHost boundary before the bytes leave VRAM.

FORMAT · GGUF

STANDBY · LATAM HEALTH
LATAM Health Sentinel
LATAM Healthcare · es-MX / pt-BR
Model: Phi-3 mini · bilingual
p99 latency: 1.4 ms
NOM-024-SSA3 | ANPD · LGPD | GDPR · Art. 9 | Ley 26.529

Bilingual PHI scoring with Mexican CURP, Brazilian CPF, and LATAM clinical record templates baked into the prompt. Air-gap-deployed across MX, BR, AR, and CO health networks.

FORMAT · GGUF

STANDBY · DoD
DoD Sentinel
Defense · IL5 / IL6
Model: Custom LoRA · Phi-3 base
p99 latency: 1.6 ms
FedRAMP High | DoD IL5 / IL6 | NIST SP 800-171 | CMMC L3

Air-gapped LoRA fine-tuned on classified-handling guidance. Recognizes CUI markings, weapons-system telemetry, and cleared-contractor egress patterns. Signed by Arca engineering.

FORMAT · LoRA

STANDBY · SOVEREIGN
Sovereign AI Warden
Government · Civilian
Model: Phi-3 mini · Q4_K_M
p99 latency: 1.3 ms
FedRAMP Mod | FISMA | eIDAS | ISO 27001 | +1 more

Sovereign-cloud aware. Watches citizen PII patterns across SSA, IRS, DMV, and EU eIDAS attestation chains. Designed for federal civilian and StateRAMP deployments.

FORMAT · GGUF

STANDBY · ROBOTICS
Robotics Safety Officer
Autonomous · Safety-Critical
Model: TinyLlama 1.1B · Q4
p99 latency: 0.9 ms
ISO 26262 · ASIL-D | IEC 61508 | DO-178C | UN R155 | +1 more

Functional-safety persona. Flags actuator-bound ioctls, watchdog skips, and CAN-bus exfil patterns. Sub-millisecond decision budget — the safety loop cannot wait.

FORMAT · GGUF

STANDBY · PII
PII Redactor
Finance · General PII
Model: Phi-3 mini · Q4_K_M
p99 latency: 1.1 ms
PCI-DSS | GDPR | CCPA | NYDFS · Part 500 | +1 more

Card numbers, SSN, IBAN, BIK, account routing, and global financial PII patterns. The default cartridge for trading desks, fintech, and any cluster touching customer data.

FORMAT · GGUF

STANDBY · CODE
Code Sentinel
IP · Trade Secrets
Model: DeepSeek-Coder · LoRA
p99 latency: 1.5 ms
Trade Secrets Act | DTSA | EU TS Directive | Patent IP

Embedding-aware persona. Detects model-weight dumps, proprietary algorithm leakage, and source-code exfil — the failure mode that loses you the company, not the lawsuit.

FORMAT · LoRA

EMPTY · CUSTOM
Custom LoRA
Bring your own
Model: Open · GGUF / ExLlamaV2 / LoRA
p99 latency: YOUR · POLICY

Empty cartridge slot. Co-design with our engineers using your audit logs, your regulator language, your domain. Signed bundle, air-gap delivered.

FORMAT · GGUF / ExLlamaV2 / LoRA

// SECTION 05 · DYNAMIC COMPLIANCE & SHIELDING

Two personas.
One multi-tenant cluster.

A scenario you can run today: a single GPU fleet serving regulated tenants under different rule sets, with a different persona enforced per namespace.

// SCENARIO

Dynamic Compliance & Shielding

A multi-tenant H100 cluster serves Healthcare workloads in Namespace-A and Finance workloads in Namespace-B from the same physical fleet.

The Sentry loads HIPAA Guardian on Namespace-A and PII Redactor on Namespace-B from the same kernel attach. Each tenant gets policy enforcement that understands intent — not just regex. Hallucinated data leaks die at the cudaMemcpyDeviceToHost boundary, before bytes ever leave VRAM.

// REGULATIONS COVERED
HIPAA §164.312 | HITECH §13402 | PCI-DSS | GDPR | CCPA | NYDFS Part 500

STEP 01

Namespace-A · HIPAA Guardian

PHI exfil scoring · HIPAA §164.312 mapping

STEP 02

Namespace-B · PII Redactor

PCI-DSS / GDPR / CCPA pattern scoring

STEP 03

Shared kernel · isolated personas

Two SLMs · one Sentry · zero cross-tenant leakage

STEP 04

Hot-swap on policy change

No agent restart · no Helm rollover · no downtime
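
The four steps above amount to a lookup from namespace to persona that can be rebound at runtime. The sketch below is one minimal way to model that, assuming a plain in-memory table; `NamespacePolicy` and its methods are hypothetical names, and returning `None` for an unmapped namespace is an assumption about fallback behaviour that the page does not specify.

```python
from typing import Optional


class NamespacePolicy:
    """Maps each tenant namespace to the persona that scores its traffic."""

    def __init__(self) -> None:
        self._by_namespace = {}

    def assign(self, namespace: str, persona: str) -> Optional[str]:
        """Bind a persona to a namespace; return the one it replaced, if any."""
        previous = self._by_namespace.get(namespace)
        self._by_namespace[namespace] = persona
        return previous

    def persona_for(self, namespace: str) -> Optional[str]:
        """Persona enforcing this namespace, or None if nothing is assigned."""
        return self._by_namespace.get(namespace)


if __name__ == "__main__":
    policy = NamespacePolicy()
    policy.assign("Namespace-A", "HIPAA Guardian")  # Step 01
    policy.assign("Namespace-B", "PII Redactor")    # Step 02
    # Step 04: a policy change rebinds one tenant without touching the other.
    policy.assign("Namespace-B", "Code Sentinel")
    print(policy.persona_for("Namespace-A"), policy.persona_for("Namespace-B"))
```

Each tenant's binding is independent, so changing Namespace-B leaves Namespace-A's persona exactly as it was.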

// BUILD YOUR PERSONA

Pick a cartridge.
Or co-design a custom LoRA with us.

Every persona is signed and air-gap delivered. Custom cartridges are built with our engineering team using your audit logs and your regulator language — usually 4–6 weeks from kickoff to a signed bundle on the host.