SSM-AI – Limits & Failure Modes (1.6)

Read this before piloting — what can go wrong, how to detect it, and how to recover (without touching m)

Low-signal lenses (weak or noisy evidence).
When the chosen lens is weak, a ~ 0 and bands hover near A0. This is not an error; it is a “not enough evidence” signal.
Mitigation: strengthen or replace the lens; keep phi((m,a)) = m and route by classical logic when |a| < a_min.

evidence_ok := (|a| >= a_min)            # e.g., a_min = 0.15
route_by_lane := evidence_ok
fallback := phi((m,a))                   # always equals m
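
The routing rule above can be sketched in Python. This is a minimal illustration, not a reference implementation: the `Lane` tuple and the `route` helper are hypothetical names, and `A_MIN = 0.15` is only the example value from the text.

```python
from typing import NamedTuple, Tuple

A_MIN = 0.15  # example threshold from the text; tune per lens


class Lane(NamedTuple):
    m: float  # classical value, never modified
    a: float  # alignment, expected in (-1, +1)


def phi(x: Lane) -> float:
    """Collapse: always returns the classical value m unchanged."""
    return x.m


def route(x: Lane, a_min: float = A_MIN) -> Tuple[bool, float]:
    """Return (route_by_lane, fallback).

    route_by_lane is True only when the evidence is strong enough;
    fallback is always the classical m.
    """
    evidence_ok = abs(x.a) >= a_min
    return evidence_ok, phi(x)
```

With a weak lens such as `Lane(m=3.0, a=0.02)`, `route` reports that the lane should not be used and still hands back m = 3.0 for the classical path.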

Division near zero (ratios).
Ratios with tiny denominators are a control risk, not a value risk: lane composition stays stable in u-space, but policy must guard actuation.
Policy (default): division_policy = "strict". If |m_den| <= eps_div, do not actuate from the lane (fall back to classical), log and stamp the event, and continue observing.

if |m_den| <= eps_div:
    use_lane_for_actuation := False      # strict
    note := "DIV_GUARD"
# lane math may still be logged: a_div := tanh(atanh(a_num) - atanh(a_den))
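
A minimal Python sketch of the strict guard, under stated assumptions: `guarded_divide`, the returned dict layout, and the `EPS_DIV` value are hypothetical, and returning `m = None` on a guarded division is one possible way to force the caller onto the classical path.

```python
import math

EPS_DIV = 1e-9  # hypothetical guard width; declare the real one in the manifest
EPS_A = 1e-6    # clamp margin (example value from the text)


def clamp(a, eps_a=EPS_A):
    """Keep a strictly inside (-1, +1) so atanh stays finite."""
    return max(-1.0 + eps_a, min(1.0 - eps_a, a))


def guarded_divide(m_num, a_num, m_den, a_den, eps_div=EPS_DIV):
    """Strict division policy: lane math is always computed for logging,
    but actuation is blocked when the denominator is near zero."""
    a_div = math.tanh(math.atanh(clamp(a_num)) - math.atanh(clamp(a_den)))
    if abs(m_den) <= eps_div:
        return {
            "use_lane_for_actuation": False,
            "note": "DIV_GUARD",
            "m": None,       # assumption: caller falls back to the classical path
            "a_div": a_div,  # still available for the log
        }
    return {
        "use_lane_for_actuation": True,
        "note": "",
        "m": m_num / m_den,
        "a_div": a_div,
    }
```

The lane value stays well defined even when the guard fires, which is why it can be logged while actuation is blocked.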

Order/partition mistakes (averaging in a-space).
Naive averages like a_out := SUM(w*a)/SUM(w) will not be order/shard invariant and may saturate near edges.
Mitigation: always use U/W fuse in u-space.

U += w*atanh(a_c)         # a_c := clamp(a, -1+eps_a, +1-eps_a)
W += w
a_out := tanh( U / max(W, eps_w) )
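
The U/W fuse can be sketched in Python as follows. The function names and the `EPS_W` value are assumptions for illustration. Because U and W are plain sums, per-shard partials can be added in any order before the final tanh, which is what makes batch, stream, and shuffled runs agree up to floating-point rounding.

```python
import math
import random

EPS_A = 1e-6   # clamp margin (example value from the text)
EPS_W = 1e-12  # hypothetical floor for the weight sum


def clamp(a, eps_a=EPS_A):
    return max(-1.0 + eps_a, min(1.0 - eps_a, a))


def accumulate(pairs, eps_a=EPS_A):
    """Fold (a, w) pairs into the partial sums (U, W)."""
    U = W = 0.0
    for a, w in pairs:
        U += w * math.atanh(clamp(a, eps_a))
        W += w
    return U, W


def finish(U, W, eps_w=EPS_W):
    """Map the accumulated sums back to a-space."""
    return math.tanh(U / max(W, eps_w))


def fuse(pairs):
    """Single-pass (batch or stream) fuse."""
    return finish(*accumulate(pairs))


def fuse_sharded(shards):
    """Fuse per-shard partials; the shard order does not matter."""
    U = W = 0.0
    for shard in shards:
        u, w = accumulate(shard)
        U += u
        W += w
    return finish(U, W)
```

An empty input yields a_out = 0 because `max(W, eps_w)` keeps the division defined.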

Gate misuse (accidentally mutating m).
g_t scales alignment only. If a pipeline accidentally multiplies or rewrites m, parity breaks.
Guardrail: enforce and test collapse parity on every hop.

assert phi((m,a)) == m     # must hold everywhere
RSI_env := g_t * RSI       # never g_t * m
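
One way to enforce parity on every hop is to wrap each pipeline stage in a checker. The sketch below is illustrative only: `checked_hop` and `gate_rsi` are hypothetical helpers, and a lane is modeled as a plain `(m, a)` tuple.

```python
from typing import Callable, Tuple

Lane = Tuple[float, float]  # (m, a)


def phi(lane: Lane) -> float:
    """Collapse: returns m."""
    return lane[0]


def checked_hop(hop: Callable[[Lane], Lane], lane: Lane) -> Lane:
    """Run one pipeline stage and fail loudly if it changed m."""
    out = hop(lane)
    if phi(out) != phi(lane):
        raise ValueError(
            f"collapse parity broken: m changed from {phi(lane)!r} to {phi(out)!r}"
        )
    return out


def gate_rsi(rsi: float, g_t: float) -> float:
    """RSI_env := g_t * RSI. The gate scales alignment only, never m."""
    return g_t * rsi
```

A stage that only rewrites a passes through `checked_hop`; a stage that multiplies m raises immediately, so the bug surfaces at the hop that introduced it.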

Manifest drift (silent knob changes).
Changing bands, clamps, or weights without bumping the manifest causes irreproducible results.
Mitigation: treat the manifest as a contract; compute a knobs hash and stamp it.

knobs_hash := sha256( ascii(canonical_json(knobs)) )    # bands, eps_a, eps_w, gamma, division_policy, lens params…
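
A possible Python rendering of the knobs hash. The text does not define canonical_json, so this sketch assumes sorted keys, compact separators, and ASCII-escaped output; any other fixed convention works as long as every service uses the same one.

```python
import hashlib
import json


def knobs_hash(knobs: dict) -> str:
    """SHA-256 over a canonical ASCII JSON encoding of the knobs.

    Assumed canonical form: sorted keys, no insignificant whitespace,
    non-ASCII characters escaped, NaN/Infinity rejected.
    """
    canonical = json.dumps(
        knobs,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=True,
        allow_nan=False,
    )
    return hashlib.sha256(canonical.encode("ascii")).hexdigest()
```

Key order in the input dict does not affect the hash, while any changed value does, which is the property the manifest contract relies on.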

Throughput hotspots (atanh/tanh at scale).
Very high QPS can stress scalar math. Semantics stay identical, but implementations may lag.
Mitigation: vectorize, use approximate LUTs with bounded error, or map to SSMH; do not change formulas.

# identical semantics; implementation choice only
u := atanh(a_c)  # vectorized / LUT
a := tanh(u)
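
As an illustration of a LUT with bounded error, the sketch below builds a linearly interpolated tanh table and measures its deviation from the scalar function. The table range, size, and helper names are assumptions chosen for the example; with eps_a = 1e-6 the clamped inputs satisfy |u| < 7.3, so a range of ±8 covers them.

```python
import math

U_MAX = 8.0                       # assumed table range; covers atanh(1 - 1e-6)
N = 4097                          # assumed table size
STEP = 2.0 * U_MAX / (N - 1)
TABLE = [math.tanh(-U_MAX + i * STEP) for i in range(N)]


def tanh_lut(u: float) -> float:
    """tanh via table lookup with linear interpolation; saturates outside the range."""
    if u <= -U_MAX:
        return TABLE[0]
    if u >= U_MAX:
        return TABLE[-1]
    pos = (u + U_MAX) / STEP
    i = min(int(pos), N - 2)      # pos >= 0 here, so int() is floor
    frac = pos - i
    return TABLE[i] + frac * (TABLE[i + 1] - TABLE[i])


def max_abs_error(samples) -> float:
    """Largest deviation of the LUT from the scalar tanh over the samples."""
    return max(abs(tanh_lut(u) - math.tanh(u)) for u in samples)
```

A parity test then compares `max_abs_error` over a dense sample grid against a declared tolerance, so the approximation stays an implementation detail and the formulas stay untouched.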

Lens scaling mistakes (unit/scale drift).
Forgetting to declare Unit or scale c yields inconsistent bands across services.
Mitigation: pin Unit and c in the manifest; add a per-release calibration page with worked vectors.

Band churn (flicker at thresholds).
When a hovers near a band threshold, labels can flicker.
Mitigation: use gentle hysteresis (promote/demote gates) and display confidence bands only after smoothing.

promote if delta_a >= +0.05
demote  if delta_a <= -0.05
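
A minimal two-band hysteresis sketch in Python. The text does not define delta_a, so this example assumes it is the distance of the (already smoothed) a from the band threshold; the labels "low" and "high", the threshold, and the helper names are hypothetical.

```python
MARGIN = 0.05  # promote/demote gate from the text


def step_label(label: str, a: float, threshold: float, margin: float = MARGIN) -> str:
    """Move between "low" and "high" only when a clears the threshold by the margin."""
    delta_a = a - threshold  # assumption: delta_a is measured from the threshold
    if label == "low" and delta_a >= margin:
        return "high"   # promote
    if label == "high" and delta_a <= -margin:
        return "low"    # demote
    return label        # inside the dead zone: keep the current label


def relabel(values, threshold: float, start: str = "low", margin: float = MARGIN):
    """Apply step_label along a sequence and return the label at each step."""
    labels = []
    label = start
    for a in values:
        label = step_label(label, a, threshold, margin)
        labels.append(label)
    return labels
```

For a sequence that wanders around a threshold of 0.5, the label changes only when a moves at least 0.05 past the threshold, so small oscillations no longer flip the band.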

Edge saturation (a too close to ±1).
Inputs from aggressive lenses can push a to ±1, where atanh diverges.
Mitigation: clamp first with a small margin; keep dtype-aware epsilons.

a_c := clamp(a, -1+eps_a, +1-eps_a)      # eps_a = 1e-6 (f32) or smaller for f64
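
A sketch of a dtype-aware clamp using only the standard library. The f64 margin of 1e-12 is an assumed value (the text only says "smaller"), and `to_f32` emulates float32 rounding via `struct` so the check works without NumPy.

```python
import math
import struct

EPS_A = {"f32": 1e-6, "f64": 1e-12}  # f64 value is an assumption


def to_f32(x: float) -> float:
    """Round a Python float to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def clamp(a: float, dtype: str = "f32", eps_a: float = None) -> float:
    """Clamp a into [-1 + eps_a, +1 - eps_a], checking the bound survives the dtype."""
    if math.isnan(a):
        raise ValueError("a is NaN")
    eps = EPS_A[dtype] if eps_a is None else eps_a
    hi = 1.0 - eps
    if dtype == "f32":
        hi = to_f32(hi)
    if not hi < 1.0:
        # The margin vanished after rounding, so atanh could still diverge.
        raise ValueError(f"eps_a={eps} is too small for {dtype}")
    return max(-hi, min(hi, a))
```

The explicit check matters because a margin that is fine in f64 (for example 1e-9) rounds away entirely in f32, leaving the upper bound at exactly 1.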

PII or unsafe lenses.
Lenses must not encode personal or sensitive signals.
Mitigation: restrict to aggregate/telemetry metrics; document sources; run a PII check in CI.

QA Checklist (paste-ready, pass/fail)

C1 Collapse parity:            for all rows, phi((m,a)) == m
C2 Clamp safety:               |clamp(a)| < 1 with eps_a declared
C3 Order invariance:           batch == stream == shuffled via U/W
C4 Division policy:            strict/meadow/soft declared; strict blocks actuation near zero
C5 Gate purity:                RSI_env == g_t * RSI; m untouched
C6 Manifest reproducibility:   knobs_hash changes iff any knob changes
C7 Bands sanity:               thresholds + hysteresis applied as declared
C8 No PII:                     lenses documented; static checks passed
C9 Performance parity:         vectorized/LUT impls match scalar within tolerance

Rollback Plan (keep m pristine)

if any(C1..C9 == FAIL):
    # Immediate rollback without user impact
    use_lane_for_actuation := False
    selection := classical_policy(m)      # e.g., argmax(m) or baseline heuristic
    log/stamp decision with reason
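
The rollback rule could look like this in Python. It is a sketch under assumptions: `select` and its result layout are hypothetical, the QA results arrive as a name-to-bool mapping, and argmax(m) stands in for whatever classical policy is in use.

```python
def select(m_scores, checks, lane_choice):
    """Pick an index into m_scores.

    m_scores:    classical scores, one per candidate (never modified here)
    checks:      mapping such as {"C1": True, ..., "C9": True}; False means FAIL
    lane_choice: index the lane-aware policy would have picked
    """
    failed = [name for name, ok in checks.items() if not ok]
    if failed:
        # Rollback: ignore the lane and fall back to the classical argmax(m).
        classical = max(range(len(m_scores)), key=m_scores.__getitem__)
        return {
            "use_lane_for_actuation": False,
            "selection": classical,
            "reason": "QA_FAIL:" + ",".join(failed),
        }
    return {
        "use_lane_for_actuation": True,
        "selection": lane_choice,
        "reason": "OK",
    }
```

The returned `reason` string is what would be logged and stamped alongside the decision.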

One-line takeaway.
SSM-AI is observation-first: when anything looks uncertain (weak lens, tiny denominators, order bugs, or gate misuse), fall back to classical m, which stays unchanged because phi((m,a)) = m; log the lane for learning, and fix knobs under a stamped manifest.
