SSM-AI – Scalability & Precision: Long Paths + Dtype Guardrails (7.1-7.3)

Carry (U,W), clamp before atanh, keep m untouched.

7.1 Long-Path Guidance (agents, streams, shards)
Carry only (U, W); never just a_out. Merge shards by summing U and W, then invert once.
Checkpoint/rollback with additive deltas. Store per-step ΔU := w*u and ΔW := w; subtracting them undoes the step, exact up to floating-point rounding (see earlier checkpoint logic).
Chunking does not change the result. Order/shard invariance holds, up to floating-point rounding, because composition is additive in u := atanh(a).
Precision tip. For very long runs, pairwise or Kahan-style (compensated) summation on U reduces accumulated rounding error; W can use standard summation.
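
The precision tip above can be sketched in Python as follows. This is a minimal illustration, not the reference implementation: `kahan_add` and `fuse_kahan` are hypothetical helper names, and the epsilons are assumed to be the float64 defaults from 7.2.

```python
import math

def kahan_add(total, comp, value):
    """One Kahan (compensated) summation step; returns (new_total, new_comp)."""
    y = value - comp            # apply the correction carried from earlier steps
    t = total + y               # low-order bits of y may be lost here
    comp = (t - total) - y      # recover what was lost (zero in exact arithmetic)
    return t, comp

def fuse_kahan(pairs, eps_a=1e-12, eps_w=1e-12):
    """Fuse (a, w) pairs; U uses compensated summation, W a plain sum."""
    U, comp, W = 0.0, 0.0, 0.0
    for a, w in pairs:
        a_c = max(-1.0 + eps_a, min(1.0 - eps_a, a))   # clamp before atanh
        U, comp = kahan_add(U, comp, w * math.atanh(a_c))
        W += w
    return math.tanh(U / max(W, eps_w)), U, W

pairs = [(0.5, 1.0)] * 1000      # 1000 identical readings, unit weight
a_out, U, W = fuse_kahan(pairs)  # a_out stays at 0.5 up to rounding
```

Only U gets the compensated sum here, matching the tip: the weights are usually well-scaled, while w*atanh(a) can mix large and small terms near the edges.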

# Streaming fuse (order/shard invariant)
U += w*atanh(a)            # with a pre-clamped
W += w
a_out := tanh( U / max(W, eps_w) )
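
As a hedged Python sketch of the streaming fuse above: the class name `StreamFuse` and its method names are assumptions for illustration, and the default epsilons are the float64 values from 7.2.

```python
import math

class StreamFuse:
    """Running (U, W) accumulator for the streaming fuse."""

    def __init__(self, eps_a=1e-12, eps_w=1e-12):
        self.U = 0.0
        self.W = 0.0
        self.eps_a = eps_a
        self.eps_w = eps_w

    def update(self, a, w=1.0):
        # Clamp first so atanh never sees +/-1.
        a_c = max(-1.0 + self.eps_a, min(1.0 - self.eps_a, a))
        self.U += w * math.atanh(a_c)
        self.W += w

    def value(self):
        # Invert once, guarding the denominator.
        return math.tanh(self.U / max(self.W, self.eps_w))

items = [(0.2, 1.0), (-0.6, 0.5), (0.9, 2.0)]
forward, backward = StreamFuse(), StreamFuse()
for a, w in items:
    forward.update(a, w)
for a, w in reversed(items):
    backward.update(a, w)
# The two orders agree up to floating-point rounding.
order_gap = abs(forward.value() - backward.value())
```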

# Shard merge
U_total := SUM_shards U_k
W_total := SUM_shards W_k
a_out := tanh( U_total / max(W_total, eps_w) )
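
The shard merge above, and the reason for never carrying just a_out, can be illustrated with a small Python sketch. The helper names `shard_sums` and `merge_shards` and the sample values are assumptions chosen for the example.

```python
import math

def shard_sums(pairs, eps_a=1e-12):
    """Reduce one shard of (a, w) pairs to its (U_k, W_k) sums."""
    U = W = 0.0
    for a, w in pairs:
        a_c = max(-1.0 + eps_a, min(1.0 - eps_a, a))
        U += w * math.atanh(a_c)
        W += w
    return U, W

def merge_shards(sums, eps_w=1e-12):
    """Sum per-shard (U_k, W_k), then invert once."""
    U_total = sum(U for U, _ in sums)
    W_total = sum(W for _, W in sums)
    return math.tanh(U_total / max(W_total, eps_w))

shard_a = [(0.9, 1.0)]                           # one strong reading
shard_b = [(0.0, 1.0), (0.0, 1.0), (0.0, 1.0)]   # three neutral readings
merged = merge_shards([shard_sums(shard_a), shard_sums(shard_b)])
single = merge_shards([shard_sums(shard_a + shard_b)])
# Averaging per-shard outputs drops the weights and gives a different answer.
naive = (merge_shards([shard_sums(shard_a)]) + merge_shards([shard_sums(shard_b)])) / 2
```

Here `merged` and `single` agree, while `naive` overstates the result because shard_a's single reading is given the same say as shard_b's three.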

# Checkpoint-friendly deltas
ΔU := w*atanh(a)
ΔW := w
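
A minimal Python sketch of rollback with the deltas above, assuming an in-memory log. `UndoableFuse` and its methods are hypothetical names, and the earlier checkpoint logic referenced in the text may differ. Subtracting deltas restores (U, W) up to floating-point rounding; a bit-exact restore would need a stored snapshot of (U, W) instead.

```python
import math

class UndoableFuse:
    """Streaming fuse that logs per-step (dU, dW) so steps can be undone."""

    def __init__(self, eps_a=1e-12, eps_w=1e-12):
        self.U = 0.0
        self.W = 0.0
        self.eps_a = eps_a
        self.eps_w = eps_w
        self.log = []                      # per-step (dU, dW) deltas

    def apply(self, a, w=1.0):
        a_c = max(-1.0 + self.eps_a, min(1.0 - self.eps_a, a))
        dU, dW = w * math.atanh(a_c), w
        self.U += dU
        self.W += dW
        self.log.append((dU, dW))

    def rollback(self, steps=1):
        # Undo the most recent steps by subtracting their logged deltas.
        for _ in range(min(steps, len(self.log))):
            dU, dW = self.log.pop()
            self.U -= dU
            self.W -= dW

    def value(self):
        return math.tanh(self.U / max(self.W, self.eps_w))

run = UndoableFuse()
for a, w in [(0.4, 1.0), (-0.2, 2.0), (0.95, 0.5)]:
    run.apply(a, w)
run.rollback(1)                            # drop the last step

ref = UndoableFuse()                       # same path without the last step
for a, w in [(0.4, 1.0), (-0.2, 2.0)]:
    ref.apply(a, w)
```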

7.2 Dtype & Epsilon (recommended defaults)
Clamp margin for alignment eps_a: float32 → 1e-6, float64 → 1e-12.
Denominator guard for means eps_w: float32 → ≥ 1e-8, float64 → 1e-12.
Gate epsilon eps_g (all dtypes): 1e-12.
Safe atanh input (always clamp). Keep |a| < 1 - eps_a in all lanes and choosers.
When to prefer float64. Paths longer than 10^3 steps, weights w with a wide dynamic range, cross-vendor bake-offs, or CPU offline analytics.
Zero-evidence guard. If W_in == 0, set RSI := 0 and band := "A0", with reason insufficient_evidence.
Collapse parity. phi((m,a)) = m under all dtype settings.

# Dtype-aware clamps
a_c := clamp(a, -1+eps_a, +1-eps_a)   # eps_a: 1e-6 (f32), 1e-12 (f64)
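
To make the effect of the clamp margin concrete, here is a small Python sketch. The `EPS_A` table and `clamp_alignment` are hypothetical names; the arithmetic runs in Python's native float64, so the dtype label only selects which margin is applied.

```python
import math

# Clamp margins from the recommended defaults above.
EPS_A = {"float32": 1e-6, "float64": 1e-12}

def clamp_alignment(a, dtype="float64"):
    """Clamp a into [-1 + eps_a, 1 - eps_a] for the given dtype label."""
    eps_a = EPS_A[dtype]
    return max(-1.0 + eps_a, min(1.0 - eps_a, a))

# The margin caps how large u = atanh(a) can get:
u_max_f32 = math.atanh(clamp_alignment(1.0, "float32"))   # roughly 7.25
u_max_f64 = math.atanh(clamp_alignment(1.0, "float64"))   # roughly 14.16
```

Without the clamp, `math.atanh(1.0)` raises `ValueError` in Python, and array libraries typically return infinity instead, which would poison U for the rest of the path.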

# Streaming fuse recall (calculator-fast)
U += w*atanh(a_c)
W += w
a_out := tanh( U / max(W, eps_w) )    # eps_w: ≥1e-8 (f32), 1e-12 (f64)

# Zero-evidence handling
if W == 0:
  RSI  := 0.0
  band := "A0"   # insufficient_evidence
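
One way to wire the zero-evidence guard in Python is sketched below. This section does not define how RSI and band are computed when evidence exists, so the sketch takes that step as a caller-supplied function; `guard_zero_evidence`, the `compute` argument, and the record layout are assumptions for illustration.

```python
def guard_zero_evidence(W_in, compute):
    """Return the zero-evidence record when W_in == 0, else defer to compute().

    compute is a hypothetical zero-argument callable returning (rsi, band).
    """
    if W_in == 0:
        return {"RSI": 0.0, "band": "A0", "reason": "insufficient_evidence"}
    rsi, band = compute()
    return {"RSI": rsi, "band": band, "reason": None}

# With no evidence, the normal computation is never called.
empty = guard_zero_evidence(0.0, lambda: (0.7, "placeholder"))
```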

# Collapse parity (must always hold)
phi((m,a)) = m

7.3 Stability Near Edges (a → ±1)
Never feed raw ±1. Always clamp first: a_c := clamp(a, -1+eps_a, +1-eps_a).
Curvature awareness. atanh(a) grows rapidly near edges; keep lenses in the responsive band (typical |c*e| in [0.3, 1.2]).
Lane mul/div policy (M2). Division policy acts on routing only; never on m.

# Lane M2 (lane-only; magnitudes are classical)
a_mul := tanh( atanh(a1) + atanh(a2) )
a_div := tanh( atanh(a1) - atanh(a2) )
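
The lane M2 rules above translate directly to Python. In this hedged sketch, `lane_mul` and `lane_div` are illustrative names and the clamp uses the float64 margin from 7.2; magnitudes m are not touched.

```python
import math

def _clamp(a, eps_a=1e-12):
    return max(-1.0 + eps_a, min(1.0 - eps_a, a))

def lane_mul(a1, a2, eps_a=1e-12):
    """Lane for a product: add in u-space, then map back."""
    return math.tanh(math.atanh(_clamp(a1, eps_a)) + math.atanh(_clamp(a2, eps_a)))

def lane_div(a1, a2, eps_a=1e-12):
    """Lane for a quotient: subtract in u-space, then map back."""
    return math.tanh(math.atanh(_clamp(a1, eps_a)) - math.atanh(_clamp(a2, eps_a)))

a_mul = lane_mul(0.5, 0.3)        # equals (0.5 + 0.3) / (1 + 0.5 * 0.3)
a_back = lane_div(a_mul, 0.3)     # undoes the multiply, up to rounding
```

The closed form in the comment is the tanh addition identity. Near the edges the output of tanh can round to exactly ±1 in floating point, so the result should be clamped again before it is fed back into atanh.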

# Division policy (control, not value)
division_policy := "strict"  # default
# Near-zero denominators: fall back to classical actuation; still stamp the lane.

# High-confidence gating with curvature preservation
# mode "u_scale":
RSI_env := tanh( g_t * atanh(RSI) )
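
A short Python sketch of the "u_scale" mode above. The function name `gate_u_scale` is hypothetical, and the example assumes g_t lies in [0, 1], which this section does not state explicitly.

```python
import math

def gate_u_scale(rsi, g_t, eps_a=1e-12):
    """Scale RSI in u-space by the gate g_t, then map back with tanh."""
    rsi_c = max(-1.0 + eps_a, min(1.0 - eps_a, rsi))   # clamp before atanh
    return math.tanh(g_t * math.atanh(rsi_c))

half = gate_u_scale(0.8, 0.5)     # atanh(0.8) = ln(3), so this is tanh(ln(3)/2) = 0.5
open_gate = gate_u_scale(0.8, 1.0)
closed_gate = gate_u_scale(0.8, 0.0)
```

Scaling in u-space keeps the sign and ordering of RSI values and keeps the output inside (-1, 1), which a plain product g_t * RSI would also do but with a different curve near the edges.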

One-line takeaway. Scale confidently: sum in u, carry (U,W), clamp before atanh, and keep epsilons dtype-appropriate — long, sharded paths remain bounded, reproducible, and fast, with phi((m,a)) = m always preserved.

