Bound one lane; plug any confidence cue; decide by RSI (values unchanged).
H4) Pros/cons by method (engineering quick view)
- Entropy / margins — Pros: simple, cheap, ubiquitous. Cons: scale varies by model/vendor. Use: normalize to a dimensionless e; band on RSI, not raw probs.
- MC-dropout — Pros: captures epistemic cheaply (no retrain). Cons: extra passes; depends on dropout config. Use: map variance/entropy as an e_in penalty.
- Deep ensembles — Pros: strong uncertainty in practice. Cons: costly to serve. Use: pool each model's a in u-space; same manifest across vendors.
- Conformal — Pros: finite-sample coverage (exchangeability). Cons: set size coarse; calibration drift. Use: penalize large |S| or high nonconformity via e_in.
- Evidential — Pros: single pass; separates aleatoric/epistemic proxies. Cons: loss-sensitive. Use: map evidence strength to e_out, penalties to e_in.
- Calibrated prob (Platt/Isotonic) — Pros: better-behaved than raw softmax. Cons: can drift. Use: treat calibrated top-1 as a support lens.
Invariant: Classical numbers remain untouched: phi((m,a)) = m.
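To make the "Use" column concrete, here is a minimal Python sketch of two of these mappings, with thresholds taken from the worked minis below; the helper names entropy_contrast and set_size_penalty are illustrative, not part of the framework.

import math

def entropy_contrast(H, K, tau=0.3, unit=1.0):
    # Entropy lens: normalize to a dimensionless confidence, then subtract the band threshold tau.
    conf = 1.0 - H / math.log(K)
    return (conf - tau) / unit

def set_size_penalty(set_size, unit=2.0):
    # Conformal lens: larger prediction sets S -> larger e_in penalty.
    return (set_size - 1) / unit

a_support = math.tanh(entropy_contrast(H=0.9, K=5))    # enters the lane as support
a_penalty = math.tanh(-set_size_penalty(set_size=3))   # enters the lane as a penalty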
H5) Worked minis (calculator-fast)
H5.1 Entropy lens (K = 5)
H = 0.900000
H_max = log(5) ≈ 1.609438
conf_H = 1 - H/H_max ≈ 0.440799
tau = 0.300000, Unit = 1.000000, c = 1.000000
e = (conf_H - tau)/Unit = 0.140799
a = tanh( c * e ) ≈ 0.139876 # → band A0
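A quick Python check of this mini (a sketch; the names mirror the lines above):

import math

H, K = 0.9, 5
tau, unit, c = 0.3, 1.0, 1.0
conf_H = 1 - H / math.log(K)   # ≈ 0.440799
e = (conf_H - tau) / unit      # ≈ 0.140799
a = math.tanh(c * e)           # ≈ 0.139876 -> band A0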
H5.2 MC-dropout variance penalty
var_mc = 0.250000, target s = 0.100000, Unit = 0.200000, c = 1.000000
e = (s - var_mc)/Unit = -0.750000
a = tanh(-0.750000) ≈ -0.635149 # penalty pushes negative
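The same check for this mini (sketch):

import math

var_mc, s, unit = 0.25, 0.10, 0.20
e = (s - var_mc) / unit        # = -0.75: observed variance overshoots the target
a = math.tanh(e)               # ≈ -0.635149, a pure penalty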
H5.3 Ensemble margin (3 models)
margins = [0.600000, 0.400000, 0.200000]
e_k = margin - 0.300000
a_k = tanh(e_k) = [0.291313, 0.099668, -0.099668]
U = sum(atanh(a_k)) = sum(e_k) = 0.300000 + 0.100000 - 0.100000 = 0.300000
W = 3
a_pool = tanh(U/W) = tanh(0.100000) = 0.099668 # → band A0
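In code, the pooling step uses the identity atanh(tanh(e)) = e, so U is just the sum of the per-model contrasts (sketch):

import math

margins, tau = [0.6, 0.4, 0.2], 0.3
e_k = [m - tau for m in margins]            # [0.3, 0.1, -0.1]
a_k = [math.tanh(e) for e in e_k]
U = sum(math.atanh(a) for a in a_k)         # = sum(e_k) = 0.3
a_pool = math.tanh(U / len(a_k))            # tanh(0.1) ≈ 0.099668 -> band A0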
H5.4 Conformal set-size penalty
|S| = 3, Unit = 2
e_in = (|S| - 1)/Unit = 1.000000
a_in = tanh(-1.000000) = -0.761594
a_out = tanh(0.800000) = 0.664037
U_in = 1.000000; V_out = 0.800000; W_in = 1
RSI = tanh( (V_out - U_in)/W_in ) = tanh(-0.200000) = -0.197375 # → band A-
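And the conformal mini (sketch; U_in is taken here as the magnitude of the penalty in u-space, matching the values above):

import math

set_size, unit = 3, 2
e_in = (set_size - 1) / unit                 # = 1.0
a_in = math.tanh(-e_in)                      # ≈ -0.761594
a_out = math.tanh(0.8)                       # ≈ 0.664037
U_in = abs(math.atanh(a_in))                 # = 1.0
V_out, W_in = math.atanh(a_out), 1           # = 0.8, 1
RSI = math.tanh((V_out - U_in) / W_in)       # tanh(-0.2) ≈ -0.197375 -> band A-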
H6) How to combine multiple methods cleanly (no double counting)
- Bundle by surface (e.g., bundle_decode = {entropy, margin, calibrated_prob}).
- De-correlate highly related cues: either downweight (w_small) or pre-combine e := w1*z1 + w2*z2 - w3*z3 before mapping to a (sketched after this list).
- Declare weights once for comparability: w := 1 (uniform) or w := |m|^gamma.
- Fuse only in u-space and choose by a bounded index (RSI or RSI_env).
- Acceptance: increasing a penalty must not raise alignment; shuffling inputs or shards must leave RSI unchanged.
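A minimal sketch of the pre-combination option, assuming three already-standardized cue scores z1..z3 and declared weights w1..w3 (the values are placeholders, not from the source):

import math

z1, z2, z3 = 0.8, 0.6, 0.5       # correlated decode-surface cues (hypothetical)
w1, w2, w3 = 0.5, 0.3, 0.2
e = w1*z1 + w2*z2 - w3*z3        # pre-combine, so the bundle enters the lane once
a = math.tanh(e)                 # one bounded entry per surface; no double counting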
# Clean fusion template (copy-paste)
import math

def fuse(cues, c=1.0, eps_a=1e-9, eps_w=1e-9):
    # cues: iterable of (e_cue, w_cue); each e_cue is a dimensionless contrast
    U = W = 0.0
    for e_cue, w_cue in cues:
        a = math.tanh(c * e_cue)                  # or two-channel mapping
        x = max(-1 + eps_a, min(1 - eps_a, a))    # clamp away from ±1 so atanh stays finite
        U += w_cue * math.atanh(x)                # accumulate in u-space
        W += w_cue
    return math.tanh(U / max(W, eps_w))           # a_pool

RSI = fuse(cues)                                  # cues and g are supplied by the caller
RSI_env = g * RSI                                 # or math.tanh(g * math.atanh(RSI))
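For example, feeding the H5.3 ensemble contrasts through this template with uniform weights reproduces the pooled value:

cues = [(0.3, 1.0), (0.1, 1.0), (-0.1, 1.0)]     # (e_cue, w_cue) from H5.3
g = 1.0
print(fuse(cues))                                # ≈ 0.099668, matching a_pool in H5.3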
One-line takeaway. Treat entropy, margins, MC-dropout, ensembles, conformal, and evidential signals as simple lenses feeding a single bounded lane; fuse fairly in u-space, decide by RSI/RSI_env, and keep all classical values pristine via phi((m,a)) = m.