
System Architecture

Overview

Aspect | Details
Purpose | Edit-agnostic safety evaluation framework for ML model weight modifications.
Audience | Developers extending InvarLock, operators debugging pipelines, security reviewers.
Core components | CLI shells, Core/runtime policy layer, Guard chain, Reporting/artifact subsystem.
Design goals | Torch-independent core, edit-agnostic guards, deterministic evaluation, explicit artifact contracts, full provenance.
Source of truth | src/invarlock/core/*.py, src/invarlock/reporting/*.py, src/invarlock/runtime_provenance.py, src/invarlock/runtime_verify.py, src/invarlock/cli/commands/*.py, src/invarlock/cli/run_*.py, src/invarlock/guards/*.py.

See the Glossary for definitions of terms such as the canonical guard chain, policy digest, and measurement contract.

Contents

  1. Quick Reference
  2. High-Level Architecture
  3. Component Layers
  4. Pipeline Flow
  5. Guard Chain Architecture
  6. Report Generation Flow
  7. Architecture Guardrails
  8. Key Design Decisions
  9. Module Dependencies
  10. Extension Points
  11. Related Documentation

Quick Reference

┌─────────────────────────────────────────────────────────────────────────────┐
│                        INVARLOCK SYSTEM OVERVIEW                            │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  USER INPUT                    PROCESSING                      OUTPUT       │
│  ─────────                     ──────────                      ──────       │
│                                                                             │
│  ┌──────────┐    ┌────────────────────────────────┐    ┌──────────────┐     │
│  │  Config  │───▶│            CLI LAYER           │───▶│    Report    │     │
│  │  (YAML)  │    │ evaluate | verify | report ... │    │    (JSON)    │     │
│  └──────────┘    └───────────────┬────────────────┘    └──────────────┘     │
│                                  │                                          │
│  ┌──────────┐    ┌───────────────▼────────────────┐    ┌──────────────┐     │
│  │  Model   │───▶│          CORE RUNTIME          │───▶│    Report    │     │
│  │  (HF ID) │    │   runner.py + adapters + edits │    │    (JSON)    │     │
│  └──────────┘    └───────────────┬────────────────┘    └──────────────┘     │
│                                  │                                          │
│  ┌──────────┐    ┌───────────────▼────────────────┐    ┌──────────────┐     │
│  │ Dataset  │───▶│          GUARD CHAIN           │───▶│    Events    │     │
│  │(provider)│    │ inv(pre)→spectral→rmt→var→post │    │    (JSONL)   │     │
│  └──────────┘    └────────────────────────────────┘    └──────────────┘     │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘

High-Level Architecture

InvarLock follows a layered architecture with clear separation of concerns:

┌─────────────────────────────────────────────────────────────────────────────┐
│                            CLI SHELL LAYER                                  │
│  ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐           │
│  │ evaluate │ │  verify  │ │  report  │ │  doctor  │ │ advanced │           │
│  └────┬─────┘ └────┬─────┘ └────┬─────┘ └────┬─────┘ └────┬─────┘           │
│       │            │            │            │            │                 │
├───────┴────────────┴────────────┴────────────┴────────────┴─────────────────┤
│                     CORE POLICY / CONTRACT LAYER                            │
│  ┌─────────────────────────────────────────────────────────────────┐        │
│  │ evaluate_plan · report_inputs · doctor_findings                │        │
│  │ verify_contract · run_retry_policy · run_snapshot_contract     │        │
│  │ run_guard_overhead_policy · run_provenance_contract            │        │
│  └─────────────────────────────────────────────────────────────────┘        │
│                                                                             │
├─────────────────────────────────────────────────────────────────────────────┤
│                       CORE RUNTIME / SERVICES                               │
│  ┌─────────────────────────────────────────────────────────────────┐        │
│  │ runner.py + runner_*                                          │        │
│  │  ┌────────┐  ┌────────┐  ┌────────┐  ┌────────┐  ┌────────┐     │        │
│  │  │prepare │─▶│ guards │─▶│  edit  │─▶│ guards │─▶│  eval  │     │        │
│  │  │ model  │  │(before)│  │ apply  │  │(after) │  │ final  │     │        │
│  │  └────────┘  └────────┘  └────────┘  └────────┘  └────────┘     │        │
│  └─────────────────────────────────────────────────────────────────┘        │
│                                                                             │
├─────────────────────────────────────────────────────────────────────────────┤
│                            GUARD / MODEL LAYER                              │
│  ┌────────────┐  ┌────────────┐  ┌────────────┐  ┌────────────┐             │
│  │ invariants │  │  spectral  │  │    rmt     │  │  variance  │             │
│  │ (integrity)│  │  (weights) │  │(activation)│  │   (A/B)    │             │
│  └────────────┘  └────────────┘  └────────────┘  └────────────┘             │
│                                                                             │
├─────────────────────────────────────────────────────────────────────────────┤
│                          REPORTING / FILES LAYER                            │
│  ┌──────────────┐ ┌──────────────┐ ┌────────────┐ ┌────────────┐            │
│  │ report_make  │ │ report_files │ │   render   │ │  manifest  │            │
│  │ + console    │ │ + evidence   │ │  (MD/HTML) │ │   (JSON)   │            │
│  └──────────────┘ └──────────────┘ └────────────┘ └────────────┘            │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘

Component Layers

CLI Layer (src/invarlock/cli/)

Typer-based command shells providing user-facing entry points. The command modules should stay thin: parse arguments, call core/reporting owners, render output, and map failures to exit codes.

Shell support modules such as cli/config_execution.py, cli/run_execution.py, cli/run_config.py, cli/run_pairing.py, cli/run_overhead.py, and cli/run_artifacts.py belong to this boundary layer as well. They can perform CLI-facing adaptation and console/event rendering, but they must not become policy owners.
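
The thin-shell rule can be illustrated with a minimal sketch. The names below (VerifyOutcome, verify_report_core, verify_command) are hypothetical and are not InvarLock's actual API; the real commands are Typer-based, while this sketch uses plain functions so it runs without dependencies. The point is only the division of labor: the core function decides, the shell renders and maps the result to an exit code.

```python
from dataclasses import dataclass


# Hypothetical core-side result type; not InvarLock's actual API.
@dataclass(frozen=True)
class VerifyOutcome:
    ok: bool
    messages: tuple[str, ...]


def verify_report_core(report: dict) -> VerifyOutcome:
    """Core owner: pure policy, no printing and no exit codes."""
    problems = []
    if "validation" not in report:
        problems.append("missing 'validation' section")
    return VerifyOutcome(ok=not problems, messages=tuple(problems))


def verify_command(report: dict, echo=print) -> int:
    """CLI shell: call the owner, render its messages, map failure to an exit code."""
    outcome = verify_report_core(report)
    for message in outcome.messages:
        echo(f"error: {message}")
    return 0 if outcome.ok else 1
```

Because the shell holds no policy, a non-CLI caller can use verify_report_core directly and get the same verdict.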

Command | Purpose | Primary Output
evaluate | Compare baseline vs subject with pinned windows | Report JSON + MD
verify | Validate report against schema and pairing | Exit code + messages
report | Render/compare reports | MD/HTML/JSON artifacts
doctor | Environment diagnostics | Health check output
advanced | Maintenance workflows such as evidence packs, policy packs, plugins, and calibration | Exit code + workflow-specific artifacts
version | Emit package and schema version information | Version string

Core Policy / Contracts (src/invarlock/core/, src/invarlock/reporting/)

Deterministic policy, artifact-contract, and report-verification owners shared by the CLI and non-CLI entrypoints.

Module | Responsibility
evaluate_contract.py | Baseline-report validation and emitted run-artifact contract enforcement for evaluate
evaluate_plan.py | Evaluation result policy, degradation classification, and emitted outcome shaping
report_inputs.py | Canonical report path resolution and JSON-object validation
doctor_findings.py | Structured doctor findings and optional report cross-check analysis
verify_contract.py | Structured report-verification service used by verify and evidence-pack flows
runtime_manifest_verify.py + runtime_provenance.py | Authoritative runtime-manifest verification and runtime-provenance ownership for report verification
run_policy.py | Shared run policy helpers such as split choice, PM thresholds, and overhead policy
run_retry_policy.py | Retry-attempt summaries and retry state transitions
run_snapshot_contract.py + run_snapshot_policy.py | Snapshot planning, restore behavior, and retry transitions
run_guard_overhead_policy.py | Guard-overhead normalization, summary building, and report shaping
run_provenance_contract.py + run_report_contract.py | Run provenance and run-report assembly contracts
run_report_payload_policy.py | Deterministic payload shaping for context, metrics, guards, and flags

Runtime Provenance Verification Ownership

Runtime provenance uses a single verifier implementation:

  • core/runtime_manifest_verify.py is the authoritative verifier for runtime.manifest.json plus report-digest binding checks.
  • runtime_verify.py and cli/runtime_verify.py are the programmatic and CLI entrypoints for that verifier.
  • runtime_provenance.py calls the same verifier when invarlock verify enforces runtime provenance on container-backed reports.
  • Product behavior does not depend on finding an external verifier binary on PATH; verifier semantics are package-native and deterministic across installs.
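
A report-digest binding check of the kind described above might look roughly like the following. This is a sketch under assumptions: the manifest field name report_sha256 and the function names are hypothetical, and the real verifier in core/runtime_manifest_verify.py may bind the digest differently.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 of the raw report bytes."""
    return hashlib.sha256(data).hexdigest()


def check_report_digest_binding(manifest: dict, report_bytes: bytes) -> bool:
    """Return True only if the manifest's recorded digest matches the report.

    "report_sha256" is a hypothetical field name used for illustration.
    """
    expected = manifest.get("report_sha256")
    if not isinstance(expected, str):
        return False
    # Constant-time comparison of the two hex strings.
    return hmac.compare_digest(expected, sha256_hex(report_bytes))
```

The check uses only the standard library, which matches the stated goal that verification does not depend on an external binary on PATH.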

Core Runtime (src/invarlock/core/)

Pipeline orchestration without direct torch imports (torch-independent coordination).

Module | Responsibility
runner.py + runner_*.py | Pipeline phases: prepare → guards → edit → eval → finalize
api.py | Protocol definitions for ModelAdapter, ModelEdit, Guard
bootstrap.py | BCa bootstrap CI computation for paired metrics
checkpoint.py | Snapshot/restore primitives for retry loops
registry.py | Plugin discovery and registration
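
The snapshot/restore idea behind retry loops can be sketched as follows. This is an illustration only: run_with_retries and the dict-based state are hypothetical stand-ins, and checkpoint.py may snapshot model weights in a different way.

```python
import copy
from typing import Callable


def run_with_retries(
    state: dict,
    attempt: Callable[[dict, int], bool],
    max_attempts: int = 3,
) -> int:
    """Run attempt(state, n) until it succeeds, restoring state after each failure.

    Returns the 1-based number of the successful attempt.
    """
    snapshot = copy.deepcopy(state)  # taken once, before any attempt mutates state
    for n in range(1, max_attempts + 1):
        if attempt(state, n):
            return n
        # Restore in place so callers holding a reference see the original values.
        state.clear()
        state.update(copy.deepcopy(snapshot))
    raise RuntimeError(f"all {max_attempts} attempts failed; state restored")
```

Restoring in place, rather than rebinding a new object, keeps every attempt starting from the same pre-edit state.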

Guard Layer (src/invarlock/guards/)

Four-guard pipeline for edit safety validation.

Guard | Focus | Key Metric
invariants | Structural integrity, NaN/Inf checks | validation.invariants_pass
spectral | Weight matrix spectral norm stability | κ-threshold violations
rmt | Activation edge-risk via Random Matrix Theory | ε-band compliance
variance | Variance equalization with A/B gate | Predictive gain

Reporting Layer (src/invarlock/reporting/)

Report generation, validation, persistence, and rendering.

Module | Responsibility
report_schema.py | Evaluation report schema and structural validation
report_validation.py | Canonical validation-flag computation
report_make.py | Public evaluation-report entrypoint that coordinates the split report-making owners
report_make_inputs.py | Input normalization, baseline reference building, and build-section extraction
report_make_assembly.py | Policy/provenance/guard assembly and report build-context composition
report_make_output.py | Final evaluation-report shaping and output payload construction
report_bundle.py | Evaluation-bundle persistence, manifest writing, and evidence attachment
report_contract.py | Input loading and report-generation planning
report_console.py | Console/report validation summary helpers used by CLI/reporting surfaces
report_summary.py | Shared executive-summary/view-model derivation for reporting surfaces
render.py | Markdown rendering for evaluation reports
html.py | HTML export with styling
report_files.py | Raw run-report JSON/Markdown/HTML persistence
evidence.py | Evidence file normalization and attachment helpers
telemetry.py | Performance metrics collection

Pipeline Flow

┌─────────────────────────────────────────────────────────────────────────────┐
│                        EVALUATION PIPELINE                                  │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│   PHASE 1: BASELINE RUN                                                     │
│   ─────────────────────                                                     │
│   ┌──────────┐    ┌──────────┐    ┌──────────┐    ┌──────────┐              │
│   │  Load    │───▶│ Evaluate │───▶│  Record  │───▶│  Save    │              │
│   │  Model   │    │  Windows │    │  Guards  │    │  Report  │              │
│   └──────────┘    └──────────┘    └──────────┘    └──────────┘              │
│                                                                             │
│   PHASE 2: SUBJECT RUN (with baseline window pinning)                       │
│   ───────────────────────────────────────────────                           │
│   ┌──────────┐    ┌──────────┐    ┌──────────┐    ┌──────────┐              │
│   │  Load    │───▶│  Apply   │───▶│ Evaluate │───▶│  Record  │              │
│   │  Model   │    │  Edit    │    │  Paired  │    │  Guards  │              │
│   └──────────┘    └──────────┘    └──────────┘    └──────────┘              │
│                                                                             │
│   PHASE 3: EVALUATION REPORT GENERATION                                     │
│   ──────────────────────────────────────                                    │
│   ┌────────────┐  ┌────────────┐  ┌────────────┐  ┌────────────┐            │
│   │ Normalize  │─▶│  Compare   │─▶│  Apply     │─▶│ Persist +  │            │
│   │ inputs     │  │  metrics   │  │  policy    │  │ render     │            │
│   └────────────┘  └────────────┘  └────────────┘  └────────────┘            │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
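
Baseline window pinning in Phase 2 means the subject run is scored on exactly the windows the baseline used. A minimal sketch of the pairing check, assuming each run exposes a mapping from a window ID to its log-NLL (the function name and data shape are hypothetical):

```python
def pair_windows(
    baseline: dict[str, float],
    subject: dict[str, float],
) -> list[tuple[str, float, float]]:
    """Pair per-window values by window ID; refuse to pair mismatched window sets."""
    if baseline.keys() != subject.keys():
        # Symmetric difference: windows present in only one of the two runs.
        unpaired = sorted(baseline.keys() ^ subject.keys())
        raise ValueError(f"unpaired windows: {unpaired}")
    return [(wid, baseline[wid], subject[wid]) for wid in sorted(baseline)]
```

Failing loudly on a mismatch, instead of silently intersecting the two sets, keeps the paired comparison honest.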

Guard Chain Architecture

┌─────────────────────────────────────────────────────────────────────────────┐
│                         GUARD CHAIN EXECUTION                               │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│   CANONICAL ORDER: invariants → spectral → rmt → variance → invariants      │
│                                                                             │
│   ┌─────────────────────────────────────────────────────────────────┐       │
│   │                    BEFORE EDIT                                  │       │
│   │  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐           │       │
│   │  │  INVARIANTS  │  │   SPECTRAL   │  │     RMT      │           │       │
│   │  │   prepare()  │  │   prepare()  │  │   prepare()  │           │       │
│   │  │  ──────────  │  │  ──────────  │  │  ──────────  │           │       │
│   │  │ • NaN check  │  │ • Baseline σ │  │ • Baseline ε │           │       │
│   │  │ • Shape check│  │ • Family caps│  │ • Activation │           │       │
│   │  │ • Tying check│  │ • z-scores   │  │ • Calibration│           │       │
│   │  └──────────────┘  └──────────────┘  └──────────────┘           │       │
│   └─────────────────────────────────────────────────────────────────┘       │
│                               │                                             │
│                               ▼                                             │
│   ┌─────────────────────────────────────────────────────────────────┐       │
│   │                      EDIT APPLIED                               │       │
│   │          (quant_rtn, noop, or external BYOE checkpoint)         │       │
│   └─────────────────────────────────────────────────────────────────┘       │
│                               │                                             │
│                               ▼                                             │
│   ┌─────────────────────────────────────────────────────────────────┐       │
│   │                     AFTER EDIT                                  │       │
│   │  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐           │       │
│   │  │  INVARIANTS  │  │   SPECTRAL   │  │     RMT      │           │       │
│   │  │  validate()  │  │  validate()  │  │  validate()  │           │       │
│   │  │  ──────────  │  │  ──────────  │  │  ──────────  │           │       │
│   │  │ • Post-edit  │  │ • κ-check    │  │ • ε-band     │           │       │
│   │  │   integrity  │  │ • Caps count │  │   compliance │           │       │
│   │  │ • NaN detect │  │ • Stability  │  │ • Δ tracking │           │       │
│   │  └──────────────┘  └──────────────┘  └──────────────┘           │       │
│   │                                                                 │       │
│   │  ┌──────────────┐                                               │       │
│   │  │   VARIANCE   │  (A/B test: bare vs VE-enabled)               │       │
│   │  │  validate()  │                                               │       │
│   │  │  ──────────  │                                               │       │
│   │  │ • Gain check │                                               │       │
│   │  │ • CI overlap │                                               │       │
│   │  │ • Enable/skip│                                               │       │
│   │  └──────────────┘                                               │       │
│   └─────────────────────────────────────────────────────────────────┘       │
│                               │                                             │
│                               ▼                                             │
│   ┌─────────────────────────────────────────────────────────────────┐       │
│   │                    GUARD RESULTS                                │       │
│   │                                                                 │       │
│   │  • validation.invariants_pass: bool                             │       │
│   │  • validation.spectral_stable: bool                             │       │
│   │  • validation.rmt_stable: bool                                  │       │
│   │  • measurement_contract_hash: str (CI/Release verification)     │       │
│   │                                                                 │       │
│   └─────────────────────────────────────────────────────────────────┘       │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
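
The prepare-before, validate-after pattern in the diagram can be sketched with a toy guard. Everything here is illustrative: the Guard protocol below is a simplified stand-in for the one in core/api.py, and the "model" is just a list of floats.

```python
import math
from typing import Callable, Protocol


class Guard(Protocol):
    """Simplified, hypothetical guard protocol: prepare before the edit, validate after."""

    name: str

    def prepare(self, weights: list[float]) -> None: ...

    def validate(self, weights: list[float]) -> bool: ...


class FiniteWeightsGuard:
    """Toy invariants-style guard: shape must be unchanged and all values finite."""

    name = "invariants"

    def __init__(self) -> None:
        self._expected_len: int | None = None

    def prepare(self, weights: list[float]) -> None:
        self._expected_len = len(weights)  # record the pre-edit shape

    def validate(self, weights: list[float]) -> bool:
        same_shape = len(weights) == self._expected_len
        return same_shape and all(math.isfinite(w) for w in weights)


def run_guarded_edit(
    weights: list[float],
    edit: Callable[[list[float]], list[float]],
    guards: list[Guard],
) -> tuple[list[float], dict[str, bool]]:
    """Prepare every guard, apply the edit, then validate every guard in order."""
    for guard in guards:
        guard.prepare(weights)
    edited = edit(weights)
    return edited, {guard.name: guard.validate(edited) for guard in guards}
```

The guard never inspects which edit ran; it only compares model state before and after, which is what makes the chain edit-agnostic.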

Report Generation Flow

┌─────────────────────────────────────────────────────────────────────────────┐
│                         REPORT GENERATION                                   │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│   INPUTS                                                                    │
│   ┌─────────────┐  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐        │
│   │  Baseline   │  │   Subject   │  │   Policy    │  │   Profile   │        │
│   │   report    │  │   report    │  │ (tiers.yaml)│  │ (ci/release)│        │
│   └──────┬──────┘  └──────┬──────┘  └──────┬──────┘  └──────┬──────┘        │
│          └────────────────┴────────────────┴────────────────┘               │
│                                    │                                        │
│                                    ▼                                        │
│   ┌─────────────────────────────────────────────────────────────────────┐   │
│   │                         REPORT BUILDER                              │   │
│   │  1. Pair baseline/subject windows                                   │   │
│   │  2. Compute paired ΔlogNLL + BCa bootstrap                          │   │
│   │  3. Apply policy gates (PM ratio, drift, guard checks)              │   │
│   │  4. Emit validation flags + state                                   │   │
│   │  5. Attach provenance (seeds)                                       │   │
│   └─────────────────────────────────────────────────────────────────────┘   │
│                                    │                                        │
│                                    ▼                                        │
│   OUTPUTS                                                                   │
│   ┌────────────────────┐  ┌───────────────────┐  ┌────────────────────┐     │
│   │ evaluation.report  │  │ evaluation_report │  │  evaluation.html   │     │
│   │ .json              │  │ .md               │  │                    │     │
│   └────────────────────┘  └───────────────────┘  └────────────────────┘     │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
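
Step 2 of the report builder (paired ΔlogNLL with a bootstrap interval) can be approximated in a few lines. Note the substitution: this sketch uses a plain percentile bootstrap, not the BCa variant that core/bootstrap.py is described as implementing, and the sign convention (subject minus baseline) and function name are assumptions.

```python
import random
import statistics


def paired_delta_ci(
    baseline_nll: list[float],
    subject_nll: list[float],
    n_boot: int = 2000,
    alpha: float = 0.05,
    seed: int = 0,
) -> tuple[float, float, float]:
    """Mean paired delta (subject - baseline) with a percentile bootstrap CI."""
    if not baseline_nll or len(baseline_nll) != len(subject_nll):
        raise ValueError("inputs must be non-empty and paired one-to-one")
    deltas = [s - b for b, s in zip(baseline_nll, subject_nll)]
    rng = random.Random(seed)  # fixed seed keeps the interval reproducible
    n = len(deltas)
    # Resample the per-window deltas (not the two runs separately) to keep pairing.
    means = sorted(
        statistics.fmean(rng.choices(deltas, k=n)) for _ in range(n_boot)
    )
    lo = means[int((alpha / 2) * n_boot)]
    hi = means[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return statistics.fmean(deltas), lo, hi
```

Working in log space means a ratio-style metric can be recovered afterwards as exp(mean delta), without averaging ratios directly.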

Architecture Guardrails

The shell/core split is enforced by design and by targeted architecture guard tests. The intended invariants are:

  • No lazy exports in package roots such as adapters/__init__.py or guards/__init__.py. Package roots should expose only explicit canonical exports.
  • No rmt_legacy references in production source. RMT ownership lives in rmt.py, rmt_analysis.py, rmt_detection.py, and rmt_math.py.
  • No dependency-map orchestration in command shells. Public command owners must stay thin and must not rebuild giant deps dictionaries or inject callables to recreate removed indirection.
  • No compatibility-only command signatures once a canonical owner contract exists. Example: lens-metric calculation takes a required MetricsConfig instead of deprecated per-call overrides.
  • No CLI imports inside owner layers. Modules under src/invarlock/core/ and src/invarlock/reporting/ must stay callable without importing invarlock.cli.

These guardrails keep the CLI as an imperative shell while policy, contracts, and verdict computation remain reusable from non-CLI flows such as evidence-pack verification and programmatic execution.
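
A guard test for the "no CLI imports inside owner layers" rule could be built on the standard ast module. The helper below is a hypothetical sketch, not the project's actual test; it checks absolute imports only and ignores relative ones.

```python
import ast


def imports_cli(source: str, forbidden: str = "invarlock.cli") -> bool:
    """Return True if the source imports the forbidden package or any submodule of it."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # Cover both "from invarlock.cli import x" and "from invarlock import cli".
            names = [node.module] + [f"{node.module}.{a.name}" for a in node.names]
        else:
            continue
        if any(n == forbidden or n.startswith(forbidden + ".") for n in names):
            return True
    return False
```

A test suite could run this over every file under src/invarlock/core/ and src/invarlock/reporting/ and assert that it returns False for each.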

Key Design Decisions

Decision | Rationale | Implementation
Torch-independent core | runner.py coordinates without importing torch; adapters encapsulate torch-specific logic. | Adapter protocol in core/api.py
Edit-agnostic guards | Guards work with any weight modification (quantization, pruning, LoRA merge). | Guard protocol validates model state, not edit type
Tier-based policies | Calibrated thresholds in tiers.yaml for balanced/conservative/aggressive safety profiles. | Policy resolution in guards/policies.py
Deterministic evaluation | Seed bundle + window pairing schedules ensure reproducible metrics. | meta.seeds, dataset.windows.stats tracking
Functional-core / imperative-shell split | Keep policy, artifact contracts, and verdict computation reusable outside the CLI while CLI modules stay thin. | core/*.py + reporting/*.py owners called from cli/commands/*.py
Single verifier ownership | Runtime-manifest verification should not vary with host tooling, so it must use one product implementation. | core/runtime_manifest_verify.py, runtime_verify.py, runtime_provenance.py
Plugin architecture | Entry points for guards, adapters, edits enable extension without core changes. | importlib.metadata discovery in core/registry.py
Log-space primary metrics | Paired ΔlogNLL with BCa bootstrap avoids ratio math bias. | core/bootstrap.py implementation

Module Dependencies

┌─────────────────────────────────────────────────────────────────────────────┐
│                         MODULE DEPENDENCY GRAPH                             │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│                           ┌─────────────┐                                   │
│                           │     CLI     │                                   │
│                           │  commands/* │                                   │
│                           └──────┬──────┘                                   │
│                                  │                                          │
│                                  ▼                                          │
│                   ┌──────────────────────────────┐                           │
│                   │ cli shell support modules    │                           │
│                   │ run_config/run_pairing/      │                           │
│                   │ run_overhead/run_artifacts   │                           │
│                   └─────────────┬────────────────┘                           │
│                                 │                                            │
│                                 ▼                                            │
│                     ┌───────────────────────────┐                            │
│                     │ core/reporting contracts  │                            │
│                     │ evaluate_plan,            │                            │
│                     │ report_inputs,            │                            │
│                     │ doctor_findings,          │                            │
│                     │ verify_contract,          │                            │
│                     │ run_policy, run_retry,    │                            │
│                     │ run_snapshot, run_report  │                            │
│                     └─────────────┬─────────────┘                            │
│                                   │                                          │
│              ┌────────────────────┼────────────────────┐                     │
│              │                    │                    │                     │
│              ▼                    ▼                    ▼                     │
│       ┌─────────────┐     ┌─────────────┐     ┌─────────────┐                │
│       │ core/runner │────▶│  guards/*   │────▶│ reporting/* │                │
│       │  + services │     │             │     │ build/files │                │
│       └──────┬──────┘     └──────┬──────┘     └─────────────┘                │
│              │                   │                                           │
│              ▼                   ▼                                           │
│       ┌─────────────┐     ┌─────────────┐                                    │
│       │  adapters/  │     │   edits/    │                                    │
│       │   hf_*.py   │     │ quant_rtn.py│                                    │
│       └──────┬──────┘     └─────────────┘                                    │
│              │                                                               │
│              ▼                                                               │
│       ┌─────────────┐                                                        │
│       │    eval/    │  (metrics, datasets, tasks)                            │
│       │  *.py       │                                                        │
│       └─────────────┘                                                        │
│                                                                             │
│   KEY: ───▶ imports/depends on                                              │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘

Extension Points

InvarLock supports extension via entry points without modifying core code.

Extension Type | Entry Point Group | Example
Adapters | invarlock.adapters | hf_causal, hf_mlm
Guards | invarlock.guards | invariants, spectral, rmt, variance
Edits | invarlock.edits | quant_rtn, noop

Custom Adapter Example

# my_adapter.py
from torch import nn  # torch stays in the adapter, outside invarlock.core

from invarlock.core.api import ModelAdapter


class MyAdapter(ModelAdapter):
    name = "my_custom_adapter"

    def load(self, model_id: str, device: str) -> nn.Module:
        # Custom loading logic: build or fetch the model and place it on `device`
        ...

    def describe(self, model: nn.Module) -> dict:
        # Return model metadata
        ...

# pyproject.toml
[project.entry-points."invarlock.adapters"]
my_custom_adapter = "my_adapter:MyAdapter"
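
Once an entry point like the one above is installed, it can be found through importlib.metadata. The helper below is a minimal sketch of that lookup and is not the code in core/registry.py; the name discover is hypothetical.

```python
from importlib.metadata import entry_points


def discover(group: str) -> dict[str, object]:
    """Map entry-point names to EntryPoint objects for one group, without loading them."""
    eps = entry_points()
    # Newer Python versions expose .select(); older ones return a plain dict of groups.
    selected = eps.select(group=group) if hasattr(eps, "select") else eps.get(group, [])
    return {ep.name: ep for ep in selected}
```

Calling discover("invarlock.adapters") returns an empty dict when no matching plugins are installed; each returned entry point can be imported on demand with its load() method, so discovery alone does not import torch-dependent adapter code.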

Troubleshooting

  • Import errors in torch-free context: ensure invarlock.core imports stay torch-independent; use adapters for torch operations.
  • Guard preparation failures: check tier policy compatibility; use context.run.strict_guard_prepare: false for debugging.
  • Report generation errors: verify that the baseline and subject reports exist and have compatible window structures.

Observability

  • Pipeline phases emit timing via print_timing_summary() in the CLI.
  • Guard results are recorded in report.guards[] and in the report's validation.* flags.
  • Telemetry fields include memory_mb_peak, latency_ms_*, duration_s.
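
When debugging, the validation.* flags can be summarized straight from a loaded report. This is a sketch that assumes the report is a dict with a top-level "validation" mapping of boolean flags; the helper name is hypothetical.

```python
def failed_validation_flags(report: dict) -> list[str]:
    """Return the names of validation flags that are explicitly False, sorted."""
    validation = report.get("validation", {})
    # Only an explicit False counts as a failure; missing or None flags are skipped.
    return sorted(name for name, value in validation.items() if value is False)
```

For example, a report whose validation section has spectral_stable set to False and the other flags set to True yields ["spectral_stable"].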