
Example Reports

Overview

| Aspect | Details |
| --- | --- |
| Purpose | Show how to generate and interpret InvarLock reports. |
| Audience | Users learning the evaluation workflow. |
| Outputs | evaluation.report.json, evaluation_report.md, report.json, and runtime.manifest.json for container-backed outputs. |
| Requires | invarlock[hf] for HF adapter workflows. |

InvarLock emits both machine-readable reports and human-friendly summaries. Use the steps below to reproduce representative artifacts from this repository version.

Read the Bundle First

For most reviewers, the primary artifact is evaluation.report.json, not the lower-level run reports. Use it as the front door:

invarlock verify reports/quant8_demo/evaluation.report.json
invarlock report html -i reports/quant8_demo/evaluation.report.json -o reports/quant8_demo/evaluation.html
invarlock report explain --evaluation-report reports/quant8_demo/evaluation.report.json

Artifact model:

| Artifact | What it contains | Typical next step |
| --- | --- | --- |
| evaluation.report.json | Paired evaluation outcome, validation block, policy/provenance summary | verify, report html, report explain --evaluation-report |
| report.json | One run's raw metrics, guard telemetry, and execution artifacts | report generate, explicit report explain --subject-report ... --baseline-report ... |
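As a sketch of how a reviewer might triage the front-door artifact programmatically, the snippet below summarizes a loaded evaluation.report.json. Note that the "status" and "gates" field names are illustrative assumptions, not the documented InvarLock schema; adapt them to the keys present in your bundle.

```python
import json

def summarize_bundle(report: dict) -> str:
    """Return a one-line triage summary from an evaluation report dict.

    The "status" and "gates" keys are hypothetical stand-ins for the
    real schema; inspect your own evaluation.report.json for the
    actual field names.
    """
    status = report.get("status", "UNKNOWN")
    gates = report.get("gates", {})
    failing = [name for name, verdict in gates.items() if verdict != "PASS"]
    if failing:
        return f"{status}: failing gates -> {', '.join(sorted(failing))}"
    return f"{status}: all {len(gates)} gates passed"

# A minimal stand-in for reports/quant8_demo/evaluation.report.json;
# real bundles carry far more detail. In practice you would json.load()
# the file produced by `invarlock evaluate`.
sample = {"status": "PASS", "gates": {"drift": "PASS", "rmt": "PASS"}}
print(summarize_bundle(sample))
```

This kind of one-liner is useful in CI dashboards where only the overall verdict matters; the HTML and explain commands above remain the right tools for detailed review.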

1. Generate a Report Bundle

The command below follows the default runtime-container path and writes a container-backed runtime.manifest.json next to evaluation.report.json. Public host-side workflows instead pass --execution-mode host and should verify the resulting report with invarlock verify --runtime-provenance host .... This reproduction uses repo-owned preset and overlay files so it matches the example artifacts checked into this repository version; wheel-only installs should start with Getting Started for their first evaluation run, then return here once an evaluation bundle exists.

INVARLOCK_DEDUP_TEXTS=1 invarlock evaluate --allow-network \
  --baseline sshleifer/tiny-gpt2 \
  --subject  sshleifer/tiny-gpt2 \
  --adapter auto \
  --profile release \
  --tier balanced \
  --preset configs/presets/causal_lm/wikitext2_512.yaml \
  --edit-config configs/overlays/edits/quant_rtn/8bit_full.yaml \
  --out runs/quant8_demo \
  --report-out reports/quant8_demo

The command writes evaluation.report.json, evaluation_report.md, and runtime.manifest.json under reports/quant8_demo/. Together these artifacts capture:

  • Model and edit metadata (model id, adapter, commit hash, edit plan)
  • Drift / perplexity / RMT verdicts with paired bootstrap confidence intervals
  • Guard diagnostics (spectral, variance, invariants) including predictive-gate notes
  • Policy digest capturing tier thresholds and calibration choices
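The paired bootstrap confidence intervals mentioned above can be sketched as a generic percentile bootstrap over per-segment perplexity deltas. This is an illustration of the statistic, not InvarLock's actual estimator (which may differ in resampling scheme and interval type), and the sample numbers below are made up:

```python
import random
import statistics

def paired_bootstrap_ci(baseline, edited, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean paired difference (edited - baseline).

    Generic sketch of a paired bootstrap; InvarLock's own estimator
    may use a different resampling scheme or interval construction.
    """
    rng = random.Random(seed)
    deltas = [e - b for b, e in zip(baseline, edited)]
    means = []
    for _ in range(n_boot):
        resample = [rng.choice(deltas) for _ in deltas]
        means.append(statistics.fmean(resample))
    means.sort()
    lo = means[int(alpha / 2 * n_boot)]
    hi = means[int((1 - alpha / 2) * n_boot) - 1]
    return statistics.fmean(deltas), (lo, hi)

# Hypothetical per-segment perplexities for baseline vs edited model.
baseline = [12.1, 11.8, 12.4, 12.0, 11.9, 12.2]
edited   = [12.2, 11.9, 12.5, 12.1, 12.0, 12.3]
mean_delta, (lo, hi) = paired_bootstrap_ci(baseline, edited)
print(f"mean delta {mean_delta:+.3f}, 95% CI [{lo:+.3f}, {hi:+.3f}]")
```

Pairing the segments before resampling is what lets the interval reflect the edit's effect rather than between-segment variance, which is why the report compares baseline and edited runs on the same evaluation windows.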

2. Create a Narrative Summary

# The report already includes a markdown summary:
cat reports/quant8_demo/evaluation_report.md

# To regenerate markdown from run reports, pass edited + baseline:
invarlock report generate \
  --run <edited_report.json> \
  --baseline-run-report <baseline_report.json> \
  --format markdown

The markdown summary mirrors the canonical report content but highlights:

  • Baseline vs edited perplexity series
  • Guard outcomes with links to supporting metrics
  • Checklist of gates (PASS/FAIL) suitable for change-control review
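The PASS/FAIL checklist can be illustrated with a small renderer. The gate names and the boolean verdict shape here are hypothetical stand-ins for the verdicts in your generated report; the real markdown is produced by invarlock report generate:

```python
def render_gate_checklist(gates: dict) -> str:
    """Render a markdown PASS/FAIL table like the checklist in
    evaluation_report.md.

    The dict shape (gate name -> bool) is an assumption for
    illustration; take real verdicts from your generated report.
    """
    lines = ["| Gate | Verdict |", "| --- | --- |"]
    for name in sorted(gates):
        mark = "PASS" if gates[name] else "FAIL"
        lines.append(f"| {name} | {mark} |")
    return "\n".join(lines)

# Hypothetical verdicts for the three verdict families in this guide.
print(render_gate_checklist({"drift": True, "perplexity": True, "rmt": False}))
```

A flat table like this is easy to paste into a change-control ticket when the full markdown report is more detail than the reviewer needs.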

3. Shareable Attachments

HTML report chrome:

Header -> Summary chips -> Quick links rail -> Canonical report body

That layout is intentional: reviewers should be able to confirm overall status, jump directly to the gate or provenance section they care about, and still read the unchanged canonical report content underneath.

For audits, collect the following files:

| File | Purpose |
| --- | --- |
| runs/<name>/**/report.json | Execution log, metrics, and guard telemetry |
| reports/<name>/evaluation.report.json | Machine-readable evaluation report |
| reports/<name>/runtime.manifest.json | Runtime provenance for container-backed outputs |
| reports/<name>/evaluation_report.md | Human-friendly summary for reviewers |
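Before sharing a bundle, a quick presence check over the attachments listed above might look like the sketch below. It only confirms that files exist; validity still comes from invarlock verify.

```python
from pathlib import Path

# Expected audit attachments from the table above; report.json lives
# under runs/<name>/ and may sit in a nested subdirectory, hence the glob.
REPORT_FILES = [
    "evaluation.report.json",
    "runtime.manifest.json",
    "evaluation_report.md",
]

def missing_audit_files(runs_dir: Path, reports_dir: Path) -> list:
    """Return the audit attachments absent from a run's output dirs.

    A convenience sketch for a pre-flight check before sharing a
    bundle; it tests file presence only, not report validity.
    """
    missing = [name for name in REPORT_FILES if not (reports_dir / name).exists()]
    if not any(runs_dir.glob("**/report.json")):
        missing.append("report.json")
    return missing

# Example: check the demo bundle produced earlier in this guide.
print(missing_audit_files(Path("runs/quant8_demo"), Path("reports/quant8_demo")))
```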

Reports remain valid only for the same baseline reference, pairing assumptions, dataset/tokenizer context, and scoped claim surface, and only while invarlock verify --json reports/<name>/evaluation.report.json continues to pass against the adjacent runtime.manifest.json.