SSM-AI – Appendix E — Vendor Bake-off Protocol (E1–E3)

Fair, bounded, reproducible comparisons across vendors/models.

Purpose. Standardize cross-vendor/model evaluations using the same observation-only math and stamped ledgers. Classical numbers remain untouched (phi((m,a)) = m). Selection and reporting use bounded alignment and order-invariant fusion.

E1) Scope & prerequisites (freeze before you run)

Manifest freeze (non-negotiable). Fix: eps_a, eps_w, weights policy, combine_policy="M2", division policy, band thresholds, lens params (Unit, c), gate mode ("mul" or "u_scale"). Publish knobs_hash.
Traffic & sets. Choose exactly one: shadow_traffic (live mirror) or frozen_eval_set (static prompts/queries/docs).
Randomness. Fix decoding seeds; log seeds in stamps.
Stamping. Every decision emits a one-line stamp and a ledger row (per Appendix C).
Observation-only. No post-hoc calibration per vendor. Classical m is never altered inside SSM-AI (phi((m,a)) = m).
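
One way to produce a publishable knobs_hash is sketched below. It is a minimal illustration, not the scheme SSM-AI prescribes: the choice of SHA-256 over canonical JSON, the key spellings, and every value in the manifest dict are assumptions made for the example.

```python
import hashlib
import json

# Hypothetical manifest: key names mirror the knobs listed in E1;
# every value is a placeholder, not a recommended setting.
manifest = {
    "eps_a": 1e-6,
    "eps_w": 1e-12,
    "weights_policy": "uniform",
    "combine_policy": "M2",
    "division_policy": "strict",
    "band_thresholds": [0.75, 0.50, 0.25, 0.10],
    "lens": {"Unit": 1.0, "c": 1.0},
    "gate_mode": "mul",
}


def knobs_hash(manifest: dict) -> str:
    """Hash a canonical JSON rendering so key order cannot change the digest."""
    canonical = json.dumps(manifest, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


print(knobs_hash(manifest))
```

Sorting keys before hashing means two teams that write the same knobs in a different order still publish the same digest, while any changed knob yields a different one.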

E2) What to log per decision (minimum row)

iso_utc, svc, req_id, item_id, RSI, w, g, RSI_env, band,
U_dec := atanh(RSI), W_dec := w, manifest := knobs_hash, seed, stamp
# Optional overlays: tokens, lat_ms, cost_unit, and any classical metric m your stack already emits
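
The minimum row can be assembled as in the sketch below. The field names come from the list above; the helper names (clamp_a, make_row), the use of eps_a as a clamp margin before atanh, and the sample values are assumptions. The stamp string is passed through untouched because its format is defined in Appendix C, not here.

```python
import math
from datetime import datetime, timezone


def clamp_a(a: float, eps_a: float = 1e-6) -> float:
    """Keep a strictly inside (-1, 1) so atanh stays finite (assumed role of eps_a)."""
    return max(-1.0 + eps_a, min(1.0 - eps_a, a))


def make_row(svc, req_id, item_id, rsi, w, g, rsi_env, band,
             knobs_hash, seed, stamp, eps_a=1e-6):
    """Build one ledger row with the minimum fields from E2."""
    return {
        "iso_utc": datetime.now(timezone.utc).isoformat(),
        "svc": svc,
        "req_id": req_id,
        "item_id": item_id,
        "RSI": rsi,
        "w": w,
        "g": g,
        "RSI_env": rsi_env,
        "band": band,
        "U_dec": math.atanh(clamp_a(rsi, eps_a)),  # U_dec := atanh(RSI)
        "W_dec": w,                                # W_dec := w
        "manifest": knobs_hash,
        "seed": seed,
        "stamp": stamp,  # opaque here; format per Appendix C
    }


# Placeholder values for illustration only.
row = make_row("vendor_a", "req-1", "item-1", rsi=0.5, w=1.0, g=1.0,
               rsi_env=0.5, band="A+", knobs_hash="<knobs_hash>",
               seed=1234, stamp="<stamp>")
```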

E3) How to aggregate per vendor (bounded, order-invariant)

Ungated pool (intrinsic capability).

U_pool := SUM U_dec
W_pool := SUM W_dec
RSI_pool := tanh( U_pool / max(W_pool, eps_w) )
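
The three lines above translate directly into code. The sketch below assumes ledger rows are dicts carrying U_dec and W_dec as logged in E2; the function name pool_ungated and the default eps_w are illustrative, and the sample rows use w = 1.

```python
import math


def pool_ungated(rows, eps_w=1e-12):
    """Return (U_pool, W_pool, RSI_pool) for one vendor's ledger rows."""
    u_pool = math.fsum(r["U_dec"] for r in rows)  # U_pool := SUM U_dec
    w_pool = math.fsum(r["W_dec"] for r in rows)  # W_pool := SUM W_dec
    rsi_pool = math.tanh(u_pool / max(w_pool, eps_w))
    return u_pool, w_pool, rsi_pool


# Two decisions with RSI = 0.5 and w = 1 pool back to 0.5.
rows = [
    {"U_dec": math.atanh(0.5), "W_dec": 1.0},
    {"U_dec": math.atanh(0.5), "W_dec": 1.0},
]
u_pool, w_pool, rsi_pool = pool_ungated(rows)
```

Because tanh maps any real number into (-1, 1), RSI_pool stays bounded regardless of how many decisions are pooled, and max(W_pool, eps_w) keeps the division defined for an empty ledger.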

Gated pool (live readiness). Apply the gate per decision, then fuse in u-space.

# Gate each decision
"mul"     : RSI_env := g * RSI
"u_scale" : RSI_env := tanh( g * atanh(RSI) )

# Then pool the gated decisions
U_env := SUM atanh(RSI_env)
RSI_pool_env := tanh( U_env / max(W_pool, eps_w) )
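
A minimal sketch of the gate-then-pool sequence follows. It assumes each decision is a (RSI, w, g) tuple with g in [0, 1], and that eps_a is used to clamp values before atanh; the function names are illustrative.

```python
import math


def clamp_a(a, eps_a=1e-6):
    """Keep a strictly inside (-1, 1) so atanh stays finite (assumed role of eps_a)."""
    return max(-1.0 + eps_a, min(1.0 - eps_a, a))


def gate(rsi, g, mode, eps_a=1e-6):
    """Apply the frozen gate mode to one decision."""
    a = clamp_a(rsi, eps_a)
    if mode == "mul":
        return g * a                            # RSI_env := g * RSI
    if mode == "u_scale":
        return math.tanh(g * math.atanh(a))     # RSI_env := tanh(g * atanh(RSI))
    raise ValueError(f"unknown gate mode: {mode!r}")


def pool_gated(decisions, mode, eps_w=1e-12, eps_a=1e-6):
    """decisions: iterable of (RSI, w, g). Returns RSI_pool_env."""
    decisions = list(decisions)
    u_env = math.fsum(
        math.atanh(clamp_a(gate(rsi, g, mode, eps_a), eps_a))
        for rsi, _w, g in decisions
    )
    w_pool = math.fsum(w for _rsi, w, _g in decisions)
    return math.tanh(u_env / max(w_pool, eps_w))
```

With g = 1 on every decision both modes reduce to the ungated pool, and with g = 0 the gated pool collapses to 0, which is a quick sanity check before a bake-off run.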

Band distribution. Count/fraction of A++/A+/A0/A-/A-- over RSI_env.
Cost/latency overlays. Report medians/means alongside RSI_pool_env (never mix them into the bounded index).
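
The band report and overlays can be produced side by side, as in this sketch. The cut points in PLACEHOLDER_CUTS are hypothetical stand-ins: the real band thresholds come from the frozen manifest. The function names and the sample latencies are also illustrative.

```python
import statistics
from collections import Counter

BANDS = ["A++", "A+", "A0", "A-", "A--"]

# Hypothetical lower bounds for A++, A+, A0, A-; use the manifest's thresholds in practice.
PLACEHOLDER_CUTS = (0.75, 0.50, 0.25, 0.10)


def band_of(rsi_env, cuts):
    """Return the first band whose lower bound rsi_env meets; otherwise A--."""
    for label, lower in zip(BANDS, cuts):
        if rsi_env >= lower:
            return label
    return BANDS[-1]


def band_distribution(rsi_env_values, cuts):
    """Map each band to (count, fraction) over the gated values."""
    values = list(rsi_env_values)
    counts = Counter(band_of(v, cuts) for v in values)
    n = len(values)
    return {b: (counts[b], counts[b] / n if n else 0.0) for b in BANDS}


dist = band_distribution([0.9, 0.6, 0.6, 0.0], PLACEHOLDER_CUTS)

# Overlays are reported next to the index, never folded into it.
lat_ms = [120.0, 95.0, 210.0, 100.0]  # placeholder latencies
overlay = {"lat_ms_median": statistics.median(lat_ms),
           "lat_ms_mean": statistics.fmean(lat_ms)}
```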

Notes (do these, avoid those).

Always pool in u-space (atanh) to preserve order/shard invariance.
Never average directly in a-space; only (U,W) may be merged across shards.
Collapse parity holds at every step: phi((m,a)) = m.
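
The first two notes can be checked numerically. The sketch below pools four illustrative RSI values (w = 1 each) as a whole, as two unequal shards merged by (U,W), and by naively averaging the per-shard results in a-space. The helper names and sample values are assumptions made for the demonstration.

```python
import math


def shard_totals(rsis, ws):
    """Per-shard (U, W): the only quantities that may cross shard boundaries."""
    return math.fsum(math.atanh(a) for a in rsis), math.fsum(ws)


def merge(totals):
    """Merge shards by summing their (U, W) pairs."""
    return (math.fsum(u for u, _ in totals),
            math.fsum(w for _, w in totals))


def finish(u, w, eps_w=1e-12):
    """Map pooled (U, W) back to a bounded index."""
    return math.tanh(u / max(w, eps_w))


rsis = [0.9, 0.1, 0.2, 0.3]
ws = [1.0, 1.0, 1.0, 1.0]

whole = finish(*shard_totals(rsis, ws))
reordered = finish(*shard_totals(rsis[::-1], ws[::-1]))

shard_a = shard_totals(rsis[:1], ws[:1])   # one decision
shard_b = shard_totals(rsis[1:], ws[1:])   # three decisions
merged = finish(*merge([shard_a, shard_b]))

# Anti-pattern: averaging per-shard indices in a-space ignores shard sizes.
naive = (finish(*shard_a) + finish(*shard_b)) / 2

print(whole, reordered, merged, naive)
```

The whole, reordered, and merged results agree, while the naive a-space average drifts because it gives the single-decision shard the same say as the three-decision shard.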

One-line takeaway. Freeze the manifest, stamp every decision, and compare vendors on the same bounded chooser: pool (U,W) in u-space to get intrinsic capability (RSI_pool) and gated readiness (RSI_pool_env). The result is order-invariant, shard-safe, and reproducible, and m stays pristine.
