System Architecture
Overview
| Aspect | Details |
|---|---|
| Purpose | Edit-agnostic safety evaluation framework for ML model weight modifications. |
| Audience | Developers extending InvarLock, operators debugging pipelines, security reviewers. |
| Core components | CLI shells, Core/runtime policy layer, Guard chain, Reporting/artifact subsystem. |
| Design goals | Torch-independent core, edit-agnostic guards, deterministic evaluation, explicit artifact contracts, full provenance. |
| Source of truth | src/invarlock/core/*.py, src/invarlock/reporting/*.py, src/invarlock/runtime_provenance.py, src/invarlock/runtime_verify.py, src/invarlock/cli/commands/*.py, src/invarlock/cli/run_*.py, src/invarlock/guards/*.py. |
See the Glossary for definitions of terms such as the canonical guard chain, policy digest, and measurement contract.
Contents
- Quick Reference
- High-Level Architecture
- Component Layers
- Pipeline Flow
- Guard Chain Architecture
- Report Generation Flow
- Architecture Guardrails
- Key Design Decisions
- Module Dependencies
- Extension Points
- Related Documentation
Quick Reference
┌─────────────────────────────────────────────────────────────────────────────┐
│ INVARLOCK SYSTEM OVERVIEW │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ USER INPUT PROCESSING OUTPUT │
│ ───────── ────────── ────── │
│ │
│ ┌──────────┐ ┌────────────────────────────────┐ ┌──────────────┐ │
│ │ Config │───▶│ CLI LAYER │───▶│ Report │ │
│ │ (YAML) │ │ evaluate | verify | report ... │ │ (JSON) │ │
│ └──────────┘ └───────────────┬────────────────┘ └──────────────┘ │
│ │ │
│ ┌──────────┐ ┌───────────────▼────────────────┐ ┌──────────────┐ │
│ │ Model │───▶│ CORE RUNTIME │───▶│ Report │ │
│ │ (HF ID) │ │ runner.py + adapters + edits │ │ (JSON) │ │
│ └──────────┘ └───────────────┬────────────────┘ └──────────────┘ │
│ │ │
│ ┌──────────┐ ┌───────────────▼────────────────┐ ┌──────────────┐ │
│ │ Dataset │───▶│ GUARD CHAIN │───▶│ Events │ │
│ │(provider)│ │ inv(pre)→spectral→rmt→var→post │ │ (JSONL) │ │
│ └──────────┘ └────────────────────────────────┘ └──────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
High-Level Architecture
InvarLock follows a layered architecture with clear separation of concerns:
┌─────────────────────────────────────────────────────────────────────────────┐
│ CLI SHELL LAYER │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │ evaluate │ │ verify │ │ report │ │ doctor │ │ advanced │ │
│ └────┬─────┘ └────┬─────┘ └────┬─────┘ └────┬─────┘ └────┬─────┘ │
│ │ │ │ │ │ │
├───────┴────────────┴────────────┴────────────┴────────────┴─────────────────┤
│ CORE POLICY / CONTRACT LAYER │
│ ┌─────────────────────────────────────────────────────────────────┐ │
│ │ evaluate_plan · report_inputs · doctor_findings │ │
│ │ verify_contract · run_retry_policy · run_snapshot_contract │ │
│ │ run_guard_overhead_policy · run_provenance_contract │ │
│ └─────────────────────────────────────────────────────────────────┘ │
│ │
├─────────────────────────────────────────────────────────────────────────────┤
│ CORE RUNTIME / SERVICES │
│ ┌─────────────────────────────────────────────────────────────────┐ │
│ │ runner.py + runner_* │ │
│ │ ┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐ │ │
│ │ │prepare │─▶│ guards │─▶│ edit │─▶│ guards │─▶│ eval │ │ │
│ │ │ model │ │(before)│ │ apply │ │(after) │ │ final │ │ │
│ │ └────────┘ └────────┘ └────────┘ └────────┘ └────────┘ │ │
│ └─────────────────────────────────────────────────────────────────┘ │
│ │
├─────────────────────────────────────────────────────────────────────────────┤
│ GUARD / MODEL LAYER │
│ ┌────────────┐ ┌────────────┐ ┌────────────┐ ┌────────────┐ │
│ │ invariants │ │ spectral │ │ rmt │ │ variance │ │
│ │ (integrity)│ │ (weights) │ │(activation)│ │ (A/B) │ │
│ └────────────┘ └────────────┘ └────────────┘ └────────────┘ │
│ │
├─────────────────────────────────────────────────────────────────────────────┤
│ REPORTING / FILES LAYER │
│ ┌──────────────┐ ┌──────────────┐ ┌────────────┐ ┌────────────┐ │
│ │ report_make │ │ report_files │ │ render │ │ manifest │ │
│ │ + console │ │ + evidence │ │ (MD/HTML) │ │ (JSON) │ │
│ └──────────────┘ └──────────────┘ └────────────┘ └────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
Component Layers
CLI Layer (src/invarlock/cli/)
Typer-based command shells providing user-facing entry points. The command modules should stay thin: parse arguments, call core/reporting owners, render output, and map failures to exit codes.
Shell support modules such as cli/config_execution.py, cli/run_execution.py,
cli/run_config.py, cli/run_pairing.py, cli/run_overhead.py, and
cli/run_artifacts.py belong to this boundary layer as well. They can perform
CLI-facing adaptation and console/event rendering, but they must not become
policy owners.
| Command | Purpose | Primary Output |
|---|---|---|
| evaluate | Compare baseline vs subject with pinned windows | Report JSON + MD |
| verify | Validate report against schema and pairing | Exit code + messages |
| report | Render and compare reports | MD/HTML/JSON artifacts |
| doctor | Environment diagnostics | Health check output |
| advanced | Maintenance workflows such as evidence packs, policy packs, plugins, and calibration | Exit code + workflow-specific artifacts |
| version | Emit package and schema version information | Version string |
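The "thin shell" rule above can be illustrated with a minimal sketch. It uses the standard-library `argparse` instead of Typer so that it runs without extra dependencies, and every name in it (`verify_report`, `ReportError`, the exit-code values) is a hypothetical stand-in, not InvarLock's actual API.

```python
import argparse
import json
import sys
from pathlib import Path
from typing import Optional


class ReportError(Exception):
    """Hypothetical failure type raised by the core-side owner."""


def verify_report(report: dict) -> list:
    """Stand-in for a core owner: pure logic, no printing, no exit codes."""
    if "validation" not in report:
        raise ReportError("report has no 'validation' section")
    return [name for name, ok in report["validation"].items() if not ok]


def main(argv: Optional[list] = None) -> int:
    # 1. Parse arguments.
    parser = argparse.ArgumentParser(prog="verify-sketch")
    parser.add_argument("report", type=Path)
    args = parser.parse_args(argv)

    # 2. Call the owner, 3. render output, 4. map failures to exit codes.
    try:
        failed = verify_report(json.loads(args.report.read_text()))
    except (OSError, ValueError, ReportError) as exc:
        print(f"error: {exc}", file=sys.stderr)
        return 2
    for name in failed:
        print(f"FAIL {name}")
    return 1 if failed else 0
```

Because `verify_report` neither prints nor exits, the same function stays usable from non-CLI callers, which is the point of keeping policy out of the shell.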
Core Policy / Contracts (src/invarlock/core/, src/invarlock/reporting/)
Deterministic policy, artifact-contract, and report-verification owners shared by the CLI and non-CLI entrypoints.
| Module | Responsibility |
|---|---|
| evaluate_contract.py | Baseline-report validation and emitted run-artifact contract enforcement for evaluate |
| evaluate_plan.py | Evaluation result policy, degradation classification, and emitted outcome shaping |
| report_inputs.py | Canonical report path resolution and JSON-object validation |
| doctor_findings.py | Structured doctor findings and optional report cross-check analysis |
| verify_contract.py | Structured report-verification service used by verify and evidence-pack flows |
| runtime_manifest_verify.py + runtime_provenance.py | Authoritative runtime-manifest verification and runtime-provenance ownership for report verification |
| run_policy.py | Shared run policy helpers such as split choice, PM thresholds, and overhead policy |
| run_retry_policy.py | Retry-attempt summaries and retry state transitions |
| run_snapshot_contract.py + run_snapshot_policy.py | Snapshot planning, restore behavior, and retry transitions |
| run_guard_overhead_policy.py | Guard-overhead normalization, summary building, and report shaping |
| run_provenance_contract.py + run_report_contract.py | Run provenance and run-report assembly contracts |
| run_report_payload_policy.py | Deterministic payload shaping for context, metrics, guards, and flags |
Runtime Provenance Verification Ownership
Runtime provenance uses a single verifier implementation:
- core/runtime_manifest_verify.py is the authoritative verifier for runtime.manifest.json plus report-digest binding checks.
- runtime_verify.py and cli/runtime_verify.py are the programmatic and CLI entrypoints for that verifier.
- runtime_provenance.py calls the same verifier when invarlock verify enforces runtime provenance on container-backed reports.
- Product behavior does not depend on finding an external verifier binary on PATH; verifier semantics are package-native and deterministic across installs.
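To make "report-digest binding" concrete, the sketch below shows one way such a check could work: hash a canonical JSON encoding of the report and compare it with a digest stored in the manifest. The field name `report_digest` and the canonicalization rules are assumptions for illustration; the actual manifest layout is defined by core/runtime_manifest_verify.py and is not reproduced here.

```python
import hashlib
import json


def canonical_digest(payload: dict) -> str:
    """SHA-256 over a canonical JSON encoding (sorted keys, compact separators)."""
    encoded = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def check_binding(manifest: dict, report: dict) -> list:
    """Return human-readable problems; an empty list means the binding holds."""
    problems = []
    expected = manifest.get("report_digest")  # hypothetical field name
    if expected is None:
        problems.append("manifest does not declare a report digest")
    elif expected != canonical_digest(report):
        problems.append("report digest does not match the manifest")
    return problems
```

A check of this kind needs only the Python standard library, which fits the stated goal that verification should not depend on host tooling.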
Core Runtime (src/invarlock/core/)
Pipeline orchestration without direct torch imports (torch-independent coordination).
| Module | Responsibility |
|---|---|
| runner.py + runner_*.py | Pipeline phases: prepare → guards → edit → eval → finalize |
| api.py | Protocol definitions for ModelAdapter, ModelEdit, Guard |
| bootstrap.py | BCa bootstrap CI computation for paired metrics |
| checkpoint.py | Snapshot/restore primitives for retry loops |
| registry.py | Plugin discovery and registration |
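One way to picture torch-independent coordination is a runner that treats the model as an opaque object and talks to guards and edits only through protocols. The sketch below is illustrative: the method names (`prepare`, `validate`, `apply`) and the `run_pipeline` signature are assumptions, and the real protocols in core/api.py may differ.

```python
from typing import Any, Callable, Protocol


class Guard(Protocol):
    name: str

    def prepare(self, model: Any) -> None: ...
    def validate(self, model: Any) -> bool: ...


class ModelEdit(Protocol):
    def apply(self, model: Any) -> Any: ...


def run_pipeline(
    model: Any,
    edit: ModelEdit,
    guards: list,
    evaluate: Callable[[Any], float],
) -> dict:
    """prepare -> guards (before) -> edit -> guards (after) -> eval."""
    for guard in guards:
        guard.prepare(model)  # capture pre-edit reference state
    edited = edit.apply(model)
    results = {guard.name: guard.validate(edited) for guard in guards}
    return {"guards": results, "metric": evaluate(edited)}
```

Nothing in this module imports torch; any tensor-specific work would live behind the objects passed in.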
Guard Layer (src/invarlock/guards/)
Four-guard pipeline for edit safety validation.
| Guard | Focus | Key Metric |
|---|---|---|
| invariants | Structural integrity, NaN/Inf checks | validation.invariants_pass |
| spectral | Weight matrix spectral norm stability | κ-threshold violations |
| rmt | Activation edge-risk via Random Matrix Theory | ε-band compliance |
| variance | Variance equalization with A/B gate | Predictive gain |
Reporting Layer (src/invarlock/reporting/)
Report generation, validation, persistence, and rendering.
| Module | Responsibility |
|---|---|
| report_schema.py | Evaluation report schema and structural validation |
| report_validation.py | Canonical validation-flag computation |
| report_make.py | Public evaluation-report entrypoint that coordinates the split report-making owners |
| report_make_inputs.py | Input normalization, baseline reference building, and build-section extraction |
| report_make_assembly.py | Policy/provenance/guard assembly and report build-context composition |
| report_make_output.py | Final evaluation-report shaping and output payload construction |
| report_bundle.py | Evaluation-bundle persistence, manifest writing, and evidence attachment |
| report_contract.py | Input loading and report-generation planning |
| report_console.py | Console/report validation summary helpers used by CLI/reporting surfaces |
| report_summary.py | Shared executive-summary/view-model derivation for reporting surfaces |
| render.py | Markdown rendering for evaluation reports |
| html.py | HTML export with styling |
| report_files.py | Raw run-report JSON/Markdown/HTML persistence |
| evidence.py | Evidence file normalization and attachment helpers |
| telemetry.py | Performance metrics collection |
Pipeline Flow
┌─────────────────────────────────────────────────────────────────────────────┐
│ EVALUATION PIPELINE │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ PHASE 1: BASELINE RUN │
│ ───────────────────── │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │ Load │───▶│ Evaluate │───▶│ Record │───▶│ Save │ │
│ │ Model │ │ Windows │ │ Guards │ │ Report │ │
│ └──────────┘ └──────────┘ └──────────┘ └──────────┘ │
│ │
│ PHASE 2: SUBJECT RUN (with baseline window pinning) │
│ ─────────────────────────────────────────────── │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │ Load │───▶│ Apply │───▶│ Evaluate │───▶│ Record │ │
│ │ Model │ │ Edit │ │ Paired │ │ Guards │ │
│ └──────────┘ └──────────┘ └──────────┘ └──────────┘ │
│ │
│ PHASE 3: EVALUATION REPORT GENERATION │
│ ────────────────────────────────────── │
│ ┌────────────┐ ┌────────────┐ ┌────────────┐ ┌────────────┐ │
│ │ Normalize │─▶│ Compare │─▶│ Apply │─▶│ Persist + │ │
│ │ inputs │ │ metrics │ │ policy │ │ render │ │
│ └────────────┘ └────────────┘ └────────────┘ └────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
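The idea behind baseline window pinning in Phase 2 can be sketched as follows: the baseline run draws its evaluation windows deterministically from a seed and records them, and the subject run reuses the recorded windows instead of sampling again. The report layout (`windows.ids`) and both function names are hypothetical.

```python
import random


def choose_windows(dataset_size: int, n_windows: int, seed: int) -> list:
    """Baseline run: draw window indices deterministically from a seed."""
    return sorted(random.Random(seed).sample(range(dataset_size), n_windows))


def pinned_windows(baseline_report: dict) -> list:
    """Subject run: reuse the baseline's windows instead of re-sampling."""
    return list(baseline_report["windows"]["ids"])  # hypothetical layout


baseline_report = {"windows": {"ids": choose_windows(1000, 4, seed=7)}}
subject_ids = pinned_windows(baseline_report)
```

Since both runs see exactly the same windows, per-window metrics can be compared pairwise, which is what the paired statistics in Phase 3 rely on.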
Guard Chain Architecture
┌─────────────────────────────────────────────────────────────────────────────┐
│ GUARD CHAIN EXECUTION │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ CANONICAL ORDER: invariants → spectral → rmt → variance → invariants │
│ │
│ ┌─────────────────────────────────────────────────────────────────┐ │
│ │ BEFORE EDIT │ │
│ │ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │ │
│ │ │ INVARIANTS │ │ SPECTRAL │ │ RMT │ │ │
│ │ │ prepare() │ │ prepare() │ │ prepare() │ │ │
│ │ │ ────────── │ │ ────────── │ │ ────────── │ │ │
│ │ │ • NaN check │ │ • Baseline σ │ │ • Baseline ε │ │ │
│ │ │ • Shape check│ │ • Family caps│ │ • Activation │ │ │
│ │ │ • Tying check│ │ • z-scores │ │ • Calibration│ │ │
│ │ └──────────────┘ └──────────────┘ └──────────────┘ │ │
│ └─────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────┐ │
│ │ EDIT APPLIED │ │
│ │ (quant_rtn, noop, or external BYOE checkpoint) │ │
│ └─────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────┐ │
│ │ AFTER EDIT │ │
│ │ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │ │
│ │ │ INVARIANTS │ │ SPECTRAL │ │ RMT │ │ │
│ │ │ validate() │ │ validate() │ │ validate() │ │ │
│ │ │ ────────── │ │ ────────── │ │ ────────── │ │ │
│ │ │ • Post-edit │ │ • κ-check │ │ • ε-band │ │ │
│ │ │ integrity │ │ • Caps count │ │ compliance │ │ │
│ │ │ • NaN detect │ │ • Stability │ │ • Δ tracking │ │ │
│ │ └──────────────┘ └──────────────┘ └──────────────┘ │ │
│ │ │ │
│ │ ┌──────────────┐ │ │
│ │ │ VARIANCE │ (A/B test: bare vs VE-enabled) │ │
│ │ │ validate() │ │ │
│ │ │ ────────── │ │ │
│ │ │ • Gain check │ │ │
│ │ │ • CI overlap │ │ │
│ │ │ • Enable/skip│ │ │
│ │ └──────────────┘ │ │
│ └─────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────┐ │
│ │ GUARD RESULTS │ │
│ │ │ │
│ │ • validation.invariants_pass: bool │ │
│ │ • validation.spectral_stable: bool │ │
│ │ • validation.rmt_stable: bool │ │
│ │ • measurement_contract_hash: str (CI/Release verification) │ │
│ │ │ │
│ └─────────────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
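The prepare()/validate() lifecycle in the diagram can be sketched with a toy guard. This is not the real invariants guard: it operates on a plain dict of float lists standing in for model weights, and the result keys are assumptions chosen for illustration.

```python
import math


class InvariantsGuardSketch:
    """Toy guard over {name: list of floats}; not the real invariants guard."""

    name = "invariants"

    def prepare(self, weights: dict) -> None:
        # Before the edit: remember each tensor's size as the reference.
        self._sizes = {key: len(values) for key, values in weights.items()}

    def validate(self, weights: dict) -> dict:
        # After the edit: same tensors, same sizes, and no NaN/Inf values.
        same_shape = {k: len(v) for k, v in weights.items()} == self._sizes
        finite = all(math.isfinite(x) for v in weights.values() for x in v)
        return {"passed": same_shape and finite, "same_shape": same_shape, "finite": finite}
```

The guard never asks which edit was applied. It only compares post-edit state against the reference captured in `prepare()`, which is what makes the approach edit-agnostic.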
Report Generation Flow
┌─────────────────────────────────────────────────────────────────────────────┐
│ REPORT GENERATION │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ INPUTS │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ │ Baseline │ │ Subject │ │ Policy │ │ Profile │ │
│ │ report │ │ report │ │ (tiers.yaml)│ │ (ci/release)│ │
│ └──────┬──────┘ └──────┬──────┘ └──────┬──────┘ └──────┬──────┘ │
│ └────────────────┴────────────────┴────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ REPORT BUILDER │ │
│ │ 1. Pair baseline/subject windows │ │
│ │ 2. Compute paired ΔlogNLL + BCa bootstrap │ │
│ │ 3. Apply policy gates (PM ratio, drift, guard checks) │ │
│ │ 4. Emit validation flags + state │ │
│ │ 5. Attach provenance (seeds) │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ OUTPUTS │
│ ┌────────────────────┐ ┌───────────────────┐ ┌────────────────────┐ │
│ │ evaluation.report │ │ evaluation_report │ │ evaluation.html │ │
│ │ .json │ │ .md │ │ │ │
│ └────────────────────┘ └───────────────────┘ └────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
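The builder steps in the diagram can be sketched in a few lines, leaving out the bootstrap interval. The sketch assumes each run stores a per-window mean negative log-likelihood under a hypothetical `nll` key, so that the exponential of the mean paired difference gives a perplexity-style ratio. The flag name and the `max_ratio` gate are illustrative, not the actual policy.

```python
import math


def build_report(baseline: dict, subject: dict, max_ratio: float) -> dict:
    # 1. Pair windows by id; both runs must cover the same windows.
    ids = sorted(baseline["nll"])
    if ids != sorted(subject["nll"]):
        raise ValueError("baseline and subject windows do not pair")

    # 2. Paired difference in log space, then the implied ratio.
    deltas = [subject["nll"][i] - baseline["nll"][i] for i in ids]
    mean_delta = sum(deltas) / len(deltas)
    ratio = math.exp(mean_delta)

    # 3-5. Apply the gate, emit a validation flag, attach provenance.
    return {
        "primary_metric": {"delta_log_nll": mean_delta, "ratio": ratio},
        "validation": {"primary_metric_acceptable": ratio <= max_ratio},
        "meta": {"seeds": subject.get("seeds")},
    }
```

Refusing to build a report when the windows do not pair reflects the note under Troubleshooting that baseline and subject reports need compatible window structures.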
Architecture Guardrails
The shell/core split is enforced by design and by targeted architecture guard tests. The intended invariants are:
- No lazy exports in package roots such as adapters/__init__.py or guards/__init__.py. Package roots should expose only explicit canonical exports.
- No rmt_legacy references in production source. RMT ownership lives in rmt.py, rmt_analysis.py, rmt_detection.py, and rmt_math.py.
- No dependency-map orchestration in command shells. Public command owners must stay thin and must not rebuild giant deps dictionaries or inject callables to recreate removed indirection.
- No compatibility-only command signatures once a canonical owner contract exists. Example: lens-metric calculation takes a required MetricsConfig instead of deprecated per-call overrides.
- No CLI imports inside owner layers. Modules under src/invarlock/core/ and src/invarlock/reporting/ must stay callable without importing invarlock.cli.
These guardrails keep the CLI as an imperative shell while policy, contracts, and verdict computation remain reusable from non-CLI flows such as evidence-pack verification and programmatic execution.
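An architecture guard test for the "no CLI imports inside owner layers" rule could be built on the standard-library `ast` module. The helper below is a sketch of that idea, not the project's actual test; the function name is hypothetical, and relative imports are deliberately skipped for brevity.

```python
import ast


def cli_imports(source: str, forbidden: str = "invarlock.cli") -> list:
    """Return absolute imports in `source` that resolve under `forbidden`."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # "from a.b import c" is treated as a reference to "a.b.c".
            names = [f"{node.module}.{alias.name}" for alias in node.names]
        else:
            continue
        found += [n for n in names if n == forbidden or n.startswith(forbidden + ".")]
    return found
```

A test could apply this to every `*.py` file under src/invarlock/core/ and src/invarlock/reporting/ and assert that the result is empty for each file.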
Key Design Decisions
| Decision | Rationale | Implementation |
|---|---|---|
| Torch-independent core | runner.py coordinates without importing torch; adapters encapsulate torch-specific logic. | Adapter protocol in core/api.py |
| Edit-agnostic guards | Guards work with any weight modification (quantization, pruning, LoRA merge). | Guard protocol validates model state, not edit type |
| Tier-based policies | Calibrated thresholds in tiers.yaml for balanced/conservative/aggressive safety profiles. | Policy resolution in guards/policies.py |
| Deterministic evaluation | Seed bundle + window pairing schedules ensure reproducible metrics. | meta.seeds, dataset.windows.stats tracking |
| Functional-core / imperative-shell split | Keep policy, artifact contracts, and verdict computation reusable outside the CLI while CLI modules stay thin. | core/*.py + reporting/*.py owners called from cli/commands/*.py |
| Single verifier ownership | Runtime-manifest verification should not vary with host tooling, so it must use one product implementation. | core/runtime_manifest_verify.py, runtime_verify.py, runtime_provenance.py |
| Plugin architecture | Entry points for guards, adapters, edits enable extension without core changes. | importlib.metadata discovery in core/registry.py |
| Log-space primary metrics | Paired ΔlogNLL with BCa bootstrap avoids ratio math bias. | core/bootstrap.py implementation |
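To illustrate the paired, log-space approach, the sketch below bootstraps the mean of per-window differences. It uses a plain percentile bootstrap, which is simpler than the BCa variant named in the table (BCa additionally corrects for bias and skew); the function name and defaults are assumptions, not the interface of core/bootstrap.py.

```python
import random
import statistics


def paired_bootstrap_ci(baseline, subject, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean paired difference (subject - baseline)."""
    if len(baseline) != len(subject):
        raise ValueError("paired metrics need equal-length inputs")
    deltas = [s - b for b, s in zip(baseline, subject)]
    rng = random.Random(seed)  # fixed seed keeps the interval reproducible
    means = sorted(
        statistics.fmean(rng.choices(deltas, k=len(deltas)))
        for _ in range(n_resamples)
    )
    lo = means[int((alpha / 2) * n_resamples)]
    hi = means[int((1 - alpha / 2) * n_resamples) - 1]
    return statistics.fmean(deltas), (lo, hi)
```

Resampling the differences, not the two runs independently, keeps each window's baseline and subject values together, so window-to-window difficulty cancels out of the comparison.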
Module Dependencies
┌─────────────────────────────────────────────────────────────────────────────┐
│ MODULE DEPENDENCY GRAPH │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────┐ │
│ │ CLI │ │
│ │ commands/* │ │
│ └──────┬──────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────────────────┐ │
│ │ cli shell support modules │ │
│ │ run_config/run_pairing/ │ │
│ │ run_overhead/run_artifacts │ │
│ └─────────────┬────────────────┘ │
│ │ │
│ ▼ │
│ ┌───────────────────────────┐ │
│ │ core/reporting contracts │ │
│ │ evaluate_plan, │ │
│ │ report_inputs, │ │
│ │ doctor_findings, │ │
│ │ verify_contract, │ │
│ │ run_policy, run_retry, │ │
│ │ run_snapshot, run_report │ │
│ └─────────────┬─────────────┘ │
│ │ │
│ ┌────────────────────┼────────────────────┐ │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ │ core/runner │────▶│ guards/* │────▶│ reporting/* │ │
│ │ + services │ │ │ │ build/files │ │
│ └──────┬──────┘ └──────┬──────┘ └─────────────┘ │
│ │ │ │
│ ▼ ▼ │
│ ┌─────────────┐ ┌─────────────┐ │
│ │ adapters/ │ │ edits/ │ │
│ │ hf_*.py │ │ quant_rtn.py│ │
│ └──────┬──────┘ └─────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────┐ │
│ │ eval/ │ (metrics, datasets, tasks) │
│ │ *.py │ │
│ └─────────────┘ │
│ │
│ KEY: ───▶ imports/depends on │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
Extension Points
InvarLock supports extension via entry points without modifying core code.
| Extension Type | Entry Point Group | Example |
|---|---|---|
| Adapters | invarlock.adapters | hf_causal, hf_mlm |
| Guards | invarlock.guards | invariants, spectral, rmt, variance |
| Edits | invarlock.edits | quant_rtn, noop |
Custom Adapter Example
```python
# my_adapter.py
from torch import nn  # adapters are where torch-specific code belongs

from invarlock.core.api import ModelAdapter


class MyAdapter(ModelAdapter):
    name = "my_custom_adapter"

    def load(self, model_id: str, device: str) -> nn.Module:
        # Custom loading logic
        ...

    def describe(self, model: nn.Module) -> dict:
        # Return model metadata
        ...
```
```toml
# pyproject.toml
[project.entry-points."invarlock.adapters"]
my_custom_adapter = "my_adapter:MyAdapter"
```
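Once a package declaring such an entry point is installed, it becomes discoverable through `importlib.metadata`. The sketch below shows the general mechanism only; the `discover` helper is hypothetical and is not the code in core/registry.py. It handles both the selectable API of newer Python versions and the older dict-style return value.

```python
from importlib.metadata import entry_points


def discover(group: str) -> dict:
    """Map entry-point names in `group` to their (lazy) load callables."""
    eps = entry_points()
    if hasattr(eps, "select"):  # Python 3.10+ selectable API
        selected = eps.select(group=group)
    else:  # older versions return a dict keyed by group
        selected = eps.get(group, [])
    return {ep.name: ep.load for ep in selected}


adapters = discover("invarlock.adapters")
# With the package above installed, adapters["my_custom_adapter"]() would
# import my_adapter and return the MyAdapter class.
```

Storing `ep.load` without calling it defers the import until an adapter is actually requested, so listing plugins does not pull in heavy dependencies.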
Troubleshooting
- Import errors in torch-free context: ensure invarlock.core imports stay torch-independent; use adapters for torch operations.
- Guard preparation failures: check tier policy compatibility; use context.run.strict_guard_prepare: false for debugging.
- Report generation errors: verify baseline and subject reports exist and have compatible window structures.
Observability
- Pipeline phases emit timing via print_timing_summary() in the CLI.
- Guard results are recorded in report.guards[] and in the report's validation.* flags.
- Telemetry fields include memory_mb_peak, latency_ms_*, duration_s.
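Per-phase timing of the kind described above can be collected with a small context manager. The `PhaseTimer` class below is a hypothetical sketch and does not reproduce `print_timing_summary()` or the telemetry module.

```python
import time
from contextlib import contextmanager


class PhaseTimer:
    """Collect wall-clock durations per named pipeline phase."""

    def __init__(self):
        self.durations = {}

    @contextmanager
    def phase(self, name: str):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record the duration even if the phase raised.
            self.durations[name] = time.perf_counter() - start

    def summary(self) -> str:
        return "\n".join(f"{name}: {secs:.3f}s" for name, secs in self.durations.items())
```

Usage would look like `with timer.phase("guards"): ...`, followed by `print(timer.summary())` once the run completes.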
Related Documentation
- CLI Reference — Command usage and options
- Guards Reference — Guard configuration and evidence
- Configuration Schema — YAML config structure
- reports — report schema and verification
- Assurance Case Overview — Assurance claims and evidence