Deterministic containment with additive Δ stack; never mutate m
Goal. Contain bad steps quickly, recover to the last known-good state, and branch safely, all within the alignment channel only. Classical magnitudes remain pristine via phi((m,a)) = m.
Triggers (declare once). Band breach, sharp drop ΔRSI_path <= delta_thr, gate shock g < g_min, policy hit, or budget guard.
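Sketch (trigger check, assumptions flagged). The trigger list above can be read as one ordered test, so the same inputs always report the same cause. This is a minimal illustration, not the reference implementation. Assumptions: `TriggerConfig`, `first_trigger`, and `band_floor` are hypothetical names; the numeric lower edge of the band is not given in the text; every cause label other than `band_breach` is invented for illustration; and the sharp-drop test is read as "RSI_path fell by at least a positive delta_thr", which is one possible sign convention.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TriggerConfig:
    """Hypothetical container for the declared trigger thresholds."""
    band_floor: float  # ASSUMPTION: numeric lower edge of band_min
    delta_thr: float   # ASSUMPTION: positive size of a "sharp" drop
    g_min: float       # minimum acceptable gate value


def first_trigger(rsi_prev: float, rsi_now: float, g: float,
                  cfg: TriggerConfig,
                  policy_hit: bool = False,
                  budget_exceeded: bool = False) -> Optional[str]:
    """Return the first trigger that fires, checked in a fixed order, else None."""
    if rsi_now < cfg.band_floor:
        return "band_breach"
    if (rsi_now - rsi_prev) <= -cfg.delta_thr:  # assumed sign convention
        return "sharp_drop"
    if g < cfg.g_min:
        return "gate_shock"
    if policy_hit:
        return "policy_hit"
    if budget_exceeded:
        return "budget_guard"
    return None


cfg = TriggerConfig(band_floor=0.2, delta_thr=0.25, g_min=0.50)
cause = first_trigger(rsi_prev=0.376388, rsi_now=0.102696, g=0.9, cfg=cfg)
```

Checking in a fixed order is a design choice: when several triggers fire at once, the stamped cause stays reproducible.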
Mechanism (additive Δ stack). Each step contributes a bounded increment in u-space. On failure, pop until safe.
# per-step contribution (after any gate/priors)
u'_s := atanh( clamp(RSI'_s, -1+eps_a, +1-eps_a) )
ΔU_s := w_s * u'_s
ΔW_s := w_s
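# apply (and remember the contribution so it can be undone)
U := U + ΔU_s
W := W + ΔW_s
push (ΔU_s, ΔW_s) onto stack
RSI_path := tanh( U / max(W, eps_w) )
# rollback loop on breach
while RSI_path < band_min and stack not empty:
    pop (ΔU_top, ΔW_top) from stack
    U := U - ΔU_top
    W := W - ΔW_top
    RSI_path := tanh( U / max(W, eps_w) )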
Defaults: eps_a = 1e-6, eps_w = 1e-12. Choose band_min (e.g., "A0").
Branching after rollback. Compute an alternative step (u'_alt), push (ΔU_alt, ΔW_alt), and pick the branch with higher RSI_path (or RSI_plan_env if gating).
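Sketch (branch selection). One way to realize the branching rule is to score every alternative from the same restored (U, W) and keep the highest RSI_path. The helper names `step_delta`, `path_rsi`, and `best_branch` are hypothetical, as is the label `alt_4B`. Scoring by RSI_plan_env under gating is not covered here.

```python
from math import atanh, tanh

EPS_A = 1e-6   # clamp margin (stated default)
EPS_W = 1e-12  # weight floor (stated default)


def step_delta(rsi_used, w=1.0):
    """Return (dU, dW) for one step: clamp, map to u-space, apply the weight."""
    a = max(-1 + EPS_A, min(1 - EPS_A, rsi_used))
    return w * atanh(a), w


def path_rsi(U, W):
    """Pooled path score from the running sums."""
    return tanh(U / max(W, EPS_W))


def best_branch(U, W, candidates, w=1.0):
    """Score each (label, rsi_used) from the same restored (U, W).

    Returns (label, score, dU, dW) for the highest RSI_path. Ties keep the
    earliest candidate, so a fixed input order gives a fixed choice.
    """
    best = None
    for label, rsi_used in candidates:
        dU, dW = step_delta(rsi_used, w)
        score = path_rsi(U + dU, W + dW)
        if best is None or score > best[1]:
            best = (label, score, dU, dW)
    return best


# Restored state after the rollback, then two hypothetical alternatives.
label, score, dU, dW = best_branch(1.187535, 3.0,
                                   [("alt_4A", 0.55), ("alt_4B", 0.10)])
```

Only the winning (dU, dW) would then be pushed, so losing branches leave no trace in U or W.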
Stamp (include rollback fields).
...|U_path=...|W_path=...|RSI_path=...|band=A0|rollback=2|cause=band_breach|last_ok=step_3|try=alt_4A|...
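Sketch (stamp fragment, round trip). The rollback fields shown in the example stamp can be written and read back with plain string handling. This covers only the fields visible in the example; the elided parts of the stamp are left alone. The function names and the number formatting (six decimals, compact W) are assumptions of this sketch, not a declared stamp grammar.

```python
def format_rollback_stamp(U, W, rsi, band, pops, cause, last_ok, attempt):
    """Join the rollback-related fields with '|' in the order of the example.

    `attempt` fills the `try=` field (`try` is a reserved word in Python).
    """
    fields = [
        ("U_path", f"{U:.6f}"),
        ("W_path", f"{W:g}"),
        ("RSI_path", f"{rsi:.6f}"),
        ("band", band),
        ("rollback", str(pops)),
        ("cause", cause),
        ("last_ok", last_ok),
        ("try", attempt),
    ]
    return "|".join(f"{key}={value}" for key, value in fields)


def parse_stamp_fields(fragment):
    """Recover a key -> string mapping from a '|'-separated fragment."""
    return dict(item.split("=", 1) for item in fragment.split("|") if "=" in item)


fragment = format_rollback_stamp(1.805916, 4.0, 0.423114, "A0", 1,
                                 "band_breach", "step_3", "alt_4A")
audit = parse_stamp_fields(fragment)
```

Reading the fragment back into `audit` is the property an auditor relies on: U, W, RSI_path, the pop count, and the cause are all recoverable from the stamp alone.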
Worked numbers (continuing 5.2; six decimals, unit weights w_s = 1).
Before step 4: U=1.187535, W=3, RSI=0.376388 (A0)
Bad step 4: RSI'_4=-0.650000 -> u'_4=-0.775299 -> U=0.412236, W=4, RSI=0.102696 # breach
Rollback: pop step 4 -> U=1.187535, W=3, RSI=0.376388 # restored
Alt 4': RSI'_4'=+0.550000 -> u'_4'=0.618381 -> U=1.805916, W=4, RSI=0.423114 # higher
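Check (reproducing the four rows). The figures above can be recomputed with the standard library alone. The starting U and W are taken from the text as given; unit weights are assumed, which matches W moving from 3 to 4.

```python
from math import atanh, tanh

U, W = 1.187535, 3.0                     # state before step 4, as given
rsi_before = tanh(U / W)

u_bad = atanh(-0.65)                     # bad step 4 in u-space
U_bad, W_bad = U + u_bad, W + 1.0
rsi_bad = tanh(U_bad / W_bad)

U_rb, W_rb = U_bad - u_bad, W_bad - 1.0  # rollback: subtract the same delta
rsi_rb = tanh(U_rb / W_rb)

u_alt = atanh(0.55)                      # alternative step 4'
U_alt, W_alt = U_rb + u_alt, W_rb + 1.0
rsi_alt = tanh(U_alt / W_alt)

for name, value in [("before", rsi_before), ("bad", rsi_bad),
                    ("rollback", rsi_rb), ("alt", rsi_alt)]:
    print(f"{name:>8}: RSI_path = {value:.6f}")
```

Note that subtracting a float delta restores U only up to rounding in the last bits. That is invisible at six decimals; if bit-exact restoration matters, storing a snapshot of (U, W) with each stack entry is one alternative.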
Pseudocode (deterministic, budget-aware).
from math import atanh, tanh

class RollbackPath:
    def __init__(self, band_min="A0", eps_a=1e-6, eps_w=1e-12, max_pops=3):
        self.U = 0.0; self.W = 0.0; self.stack = []
        self.band_min = band_min; self.eps_a = eps_a
        self.eps_w = eps_w; self.max_pops = max_pops

    def _clamp(self, x):
        # keep x strictly inside (-1, +1) so atanh stays finite
        return max(-1 + self.eps_a, min(1 - self.eps_a, x))

    def rsi(self):
        # an empty path reads as neutral 0.0
        return 0.0 if self.W <= 0 else tanh(self.U / max(self.W, self.eps_w))

    def add_step(self, RSI_used, w=1.0, budget_ms=None, budget_tokens=None):
        # budget_ms / budget_tokens are accepted but not enforced in this sketch
        a = self._clamp(RSI_used)
        u = atanh(a)
        dU, dW = w * u, w
        self.U += dU; self.W += dW; self.stack.append((dU, dW))

    def rollback_until_safe(self, band_ok_fn):
        # pop newest-first until band_ok_fn accepts RSI_path,
        # the stack empties, or max_pops is reached
        pops = 0
        while self.stack and not band_ok_fn(self.rsi()) and pops < self.max_pops:
            dU, dW = self.stack.pop(); self.U -= dU; self.W -= dW; pops += 1
        return pops
Policies (manifest sketch).
"rollback": {
  "band_min": "A0",
  "delta_thr": 0.25,
  "g_min": 0.50,
  "budget": {"tokens": 2.0e6, "ms": 15000},
  "max_pops": 3,
  "on_fail": "fallback_classical"   # e.g., pick by highest m
}
Invariant: fallback reverts to classical logic (e.g., highest m); never edits m.
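Sketch (classical fallback). A minimal reading of the invariant: candidates are (m, a) pairs, phi drops the alignment channel, and the fallback ranks by m alone. The function name `fallback_classical` mirrors the manifest value; the tuple representation and the tie rule are assumptions of this sketch.

```python
def phi(pair):
    """Collapse (m, a) to the classical magnitude m."""
    m, _a = pair
    return m


def fallback_classical(candidates):
    """Pick the candidate with the highest m, ignoring alignment.

    Python's max() returns the first maximal item, so ties resolve to the
    earliest candidate. Pairs are tuples, so m cannot be edited in place.
    """
    return max(candidates, key=phi)


candidates = [(3.0, 0.10), (5.0, -0.90), (5.0, 0.80)]
chosen = fallback_classical(candidates)
```

Here `chosen` is `(5.0, -0.90)`: the poor alignment value plays no part in the choice, and the candidate list is left untouched.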
Acceptance checklist (pass/fail).
• Determinism: same inputs + manifest ⇒ same pops and result
• Boundedness: all RSIs strictly in (-1, +1)
• Auditability: stamps reconstruct U,W,RSI_path, pops, and cause
• Fallback purity: on ultimate fail, choose by classical policy; phi((m,a)) = m holds
• Budget adherence: pops and alternatives respect declared limits
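Sketch (checking three items). Determinism, boundedness, and the pop limit can be exercised with a compact stand-alone run. The input values and the numeric band floor of 0.2 are invented for illustration; `run` and `path_rsi` are hypothetical helpers that restate the mechanism with unit weights.

```python
from math import atanh, tanh


def path_rsi(U, W, eps_w=1e-12):
    """Pooled score; an empty path reads as neutral 0.0."""
    return 0.0 if W <= 0 else tanh(U / max(W, eps_w))


def run(rsis, floor, max_pops=3, eps_a=1e-6):
    """Push every step, then pop until RSI_path >= floor. Returns (pops, rsi)."""
    U, W, stack = 0.0, 0.0, []
    for r in rsis:
        a = max(-1 + eps_a, min(1 - eps_a, r))  # clamp before atanh
        dU, dW = atanh(a), 1.0
        U, W = U + dU, W + dW
        stack.append((dU, dW))
    pops = 0
    while stack and path_rsi(U, W) < floor and pops < max_pops:
        dU, dW = stack.pop()
        U, W = U - dU, W - dW
        pops += 1
    return pops, path_rsi(U, W)


inputs = [0.5, 0.4, 0.3, -0.65]    # hypothetical per-step RSI values
first = run(inputs, floor=0.2)
second = run(inputs, floor=0.2)
extreme = run([1.0], floor=-1.0)   # saturated input, clamped before atanh

checks = {
    "determinism": first == second,
    "boundedness": -1.0 < first[1] < 1.0 and -1.0 < extreme[1] < 1.0,
    "budget": first[0] <= 3,
}
```

Auditability and fallback purity depend on the stamp format and the classical policy, so they are not covered by this run.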
One-line takeaway. Treat each step as an additive Δ in u-space; on breach, pop back to safety and try an alternate — deterministic containment with m unchanged and every move stamped.