From stamps to results: a five-step replay
Inputs (minimal).manifest.json, decisions.csv (stamped logs), and task gold labels. All numbers are ASCII decimals; classical outputs remain unchanged by design (phi((m,a)) = m).
Step 1 — Clamp & map (per decision / per signal).
Map signed contrasts to bounded alignments; clamp before atanh.
# lane clamps
a_in := tanh(-c * e_in)
a_out := tanh(+c * e_out)
# guards
a_in := clamp(a_in, -1+eps_a, +1-eps_a)
a_out := clamp(a_out, -1+eps_a, +1-eps_a)
# u-space
u_in := atanh(a_in)
u_out := atanh(a_out)
Step 2 — Chooser (RSI).
Fuse in u-space; order/shard invariance by construction.
U_in := sum_i w_i * u_in_i
V_out := sum_j w_j * u_out_j
W_in := sum_i w_i
RSI := tanh( (V_out - U_in) / max(W_in, eps_w) )
# zero-evidence guard
if W_in == 0: RSI := 0
Step 3 — Calm gate (alignment-only).
Apply gate to alignment only; m stays pristine.
# either simple multiply
RSI_env := g_t * RSI
# or curvature-preserving mode
RSI_env := tanh( g_t * atanh(RSI) )
Step 4 — Banding (A++/A+/A0/A-/A–).
Use fixed thresholds (with optional hysteresis); band on RSI_env if declared.
A++: x >= +0.90
A+ : +0.60 <= x < +0.90
A0 : -0.60 < x < +0.60
A- : -0.90 < x <= -0.60
A--: x <= -0.90
Step 5 — Record & aggregate.
Keep replay deterministic; compute quality/efficiency/stability vs gold.
# per decision (example fields; ASCII only)
ts, run_id, item_id, U, W, RSI, RSI_env, band, g_t,
lanes(F,D,L,E,V,Q), lens_id, Unit, c, eps_a, eps_w,
weights_policy, division_policy, dtype, knobs_hash,
file_sha256_in, file_sha256_out
# aggregates
- accuracy / F1 (task-level)
- retries, tokens, tool calls per solved task
- band histogram for RSI or RSI_env
- batch vs stream vs shuffled RSI difference (~0 within dtype eps)
- corr(RSI, correctness)
CSV column hints (compact).
For step-level experiments, a minimal long-form CSV works well:
item_id, e_in, e_out, w, g_t, band_prev
For doc-wise RAG pooling:
query_id, doc_id, e_out, e_in, w_doc, g_t, stamp
Verifier skeleton (pseudo).
load manifest.json
for each decision in decisions.csv:
compute a_in, a_out -> clamp
u_in := atanh(a_in); u_out := atanh(a_out)
RSI := tanh((sum w*u_out - sum w*u_in)/max(sum w, eps_w))
RSI_env := gate(RSI, g_t, mode)
band := thresholds(RSI_env)
aggregate metrics, print tables
QA invariants (must pass).
- Collapse parity:
phi((m,a)) = meverywhere. - Boundedness:
|a| < 1,|RSI| < 1,|RSI_env| < 1. - Determinism: same manifest + same CSV ⇒ identical outputs (within dtype tolerance).
- Zero-evidence: if
W_in == 0, thenRSI = 0,band = "A0"with reasoninsufficient_evidence. - Invariance: batch == stream == shuffled within numeric epsilon.
One-line takeaway. Replay is five lines of math: clamp → map to u → fuse → gate → band. Publish ASCII stamps and a fixed manifest so anyone can reproduce deltas while classical values remain identical via phi((m,a)) = m.
Navigation
Previous: SSM-AI – Empirical Validation & Mini Benchmarks —Results (6.4)
Next: SSM-AI – Scalability & Numerical Precision (7)
Directory of Pages
SSM-AI — Table of Contents