7.4 — SSACC tile (streaming, order-invariant)
• State. U, W.
• Update (per sample). U += w*atanh(clamp(a)), W += w.
• Flush. a_out := tanh( U / max(W, eps_w) ).
• Reset. Zero U,W (or snapshot for rolling windows).
• Guarantee. batch == stream == shuffled for the lane under pure {U,W} fusion, because U and W are plain weighted sums and do not depend on sample order.
• Optional decay (time-weighted window). U := rho*U + w*atanh(clamp(a)), W := rho*W + w, with fixed rho in (0,1).
Note: decay is time-weighted and order-sensitive by design; do not use it for E5 conformance.
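The tile above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the class name SSACC mirrors the section title, the values of EPS_A and EPS_W are assumed (the source names the knobs but gives no values), and treating rho=1.0 as "no decay" is a convention of this sketch only. With floating-point accumulators the batch, stream, and shuffled results agree up to rounding, so the comparison uses a tolerance; bit-identical equality would depend on the accumulator arithmetic chosen for the build.

```python
import math
import random

EPS_A = 1e-6   # assumed clamp margin (value not given in the source)
EPS_W = 1e-12  # assumed floor for W at flush time (value not given in the source)


def clamp(a, eps_a=EPS_A):
    """Keep a strictly inside (-1, 1) so atanh stays finite."""
    return max(-1.0 + eps_a, min(1.0 - eps_a, a))


class SSACC:
    """Streaming accumulator holding only the state {U, W}."""

    def __init__(self):
        self.U = 0.0
        self.W = 0.0

    def update(self, a, w=1.0, rho=1.0):
        # rho=1.0 reproduces the pure update; rho in (0,1) is the optional decay.
        self.U = rho * self.U + w * math.atanh(clamp(a))
        self.W = rho * self.W + w

    def flush(self):
        return math.tanh(self.U / max(self.W, EPS_W))

    def reset(self):
        self.U = 0.0
        self.W = 0.0


def run(samples, rho=1.0):
    """Feed (a, w) pairs one at a time and flush once at the end."""
    acc = SSACC()
    for a, w in samples:
        acc.update(a, w, rho)
    return acc.flush()


samples = [(0.2, 1.0), (-0.5, 2.0), (0.9, 0.5), (0.0, 1.5)]
shuffled = samples[:]
random.Random(0).shuffle(shuffled)

# Batch form: compute both sums in one pass, then apply the flush formula.
batch_U = sum(w * math.atanh(clamp(a)) for a, w in samples)
batch_W = sum(w for _, w in samples)
a_batch = math.tanh(batch_U / max(batch_W, EPS_W))

a_stream = run(samples)
a_shuffled = run(shuffled)

# With decay, reversing the sample order changes the result, as the note says.
a_decay_fwd = run(samples, rho=0.5)
a_decay_rev = run(list(reversed(samples)), rho=0.5)
```

Here a_batch, a_stream, and a_shuffled agree to within rounding, while a_decay_fwd and a_decay_rev differ, which is why decay is excluded from order-invariance conformance.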
7.5 — ISA-like operations (scalar first, then SIMD)
Core scalar ops (semantics).
• Mapping. SATANH a->u, STANH u->a, SCLAMP a->a_clamped.
• Compose. SUADD u1,u2->u, SUSUB u1,u2->u, SUSCALE u,r->u.
• Arithmetic pairs.
SMUL_SYM (m1,a1),(m2,a2) -> (m1*m2, a_out) with a_out := tanh(atanh(a1)+atanh(a2)).
SDIV_SYM (m_f,a_f),(m_g,a_g) -> (m_f/m_g, a_out) with a_out := tanh(atanh(a_f)-atanh(a_g)).
• Streaming. SSUM_UW a,w -> {U+=w*atanh(clamp(a)); W+=w}, SFLUSH_UW -> a_out := tanh(U/max(W,eps_w)).
• Utility. SCOLLAPSE (m,a)->m, SSCALE (m,a),k -> (k*m, sign(k)*a).
Immediate forms. SUSCALEI u,imm_r->u, SSUM_UWI a,imm_w->{U,W}.
SIMD extensions. V* forms with width N in {4,8,16} and fused patterns for dot/fold.
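The scalar semantics listed above can be modeled as plain Python functions to check them before committing to hardware. This is a sketch under stated assumptions: the function names are lowercase stand-ins for the mnemonics, EPS_A is an assumed value, inputs are clamped before atanh (the source lists SCLAMP as a separate op and does not say where it is applied), and division is plain Python division with no DIVISION_POLICY handling.

```python
import math

EPS_A = 1e-6  # assumed clamp margin (value not given in the source)


def sclamp(a, eps_a=EPS_A):
    """SCLAMP: keep a strictly inside (-1, 1)."""
    return max(-1.0 + eps_a, min(1.0 - eps_a, a))


def satanh(a):
    """SATANH: a -> u. Clamping first is an assumption of this sketch."""
    return math.atanh(sclamp(a))


def stanh(u):
    """STANH: u -> a."""
    return math.tanh(u)


def smul_sym(x, y):
    """SMUL_SYM: magnitudes multiply, alignments add in u-space."""
    (m1, a1), (m2, a2) = x, y
    return (m1 * m2, stanh(satanh(a1) + satanh(a2)))


def sdiv_sym(f, g):
    """SDIV_SYM: magnitudes divide, alignments subtract in u-space.

    A zero m_g raises ZeroDivisionError here; policy handling is out of scope.
    """
    (m_f, a_f), (m_g, a_g) = f, g
    return (m_f / m_g, stanh(satanh(a_f) - satanh(a_g)))


def sscale(x, k):
    """SSCALE: (m, a), k -> (k*m, sign(k)*a)."""
    m, a = x
    sign = (k > 0) - (k < 0)
    return (k * m, sign * a)


def scollapse(x):
    """SCOLLAPSE: (m, a) -> m."""
    return x[0]


x = (2.0, 0.3)
y = (4.0, 0.5)
product = smul_sym(x, y)
recovered = sdiv_sym(product, y)  # dividing by y should undo the multiply
```

Multiplying by y and then dividing by y recovers x up to rounding, which is a quick sanity check that the two ops are inverse in u-space.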
Tiny encoding sketch (example).
[31:28] OPCODE | [27:24] SUB | [23:16] RD | [15:8] RS1 | [7:0] RS2/IMM8
OPCODES: 0x1 SATANH, 0x2 STANH, 0x3 SUADD, 0x4 SUSUB, 0x5 SUSCALE(I),
0x6 SMUL_SYM, 0x7 SDIV_SYM, 0x8 SSUM_UW(I), 0x9 SFLUSH_UW,
0xA SCOLLAPSE, 0xB SCLAMP
Optional rational fast-path (manifested). SMULR_SYM uses a_out := (a1+a2)/(1+a1*a2) and SDIVR_SYM uses a_out := (a_f-a_g)/(1-a_f*a_g) after clamping.
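The 32-bit field layout in the encoding sketch can be packed and unpacked as follows. The helper names encode and decode are hypothetical, and the SUB field is passed through uninterpreted because the source does not say how it selects immediate or variant forms.

```python
OPCODES = {
    "SATANH": 0x1, "STANH": 0x2, "SUADD": 0x3, "SUSUB": 0x4,
    "SUSCALE": 0x5, "SMUL_SYM": 0x6, "SDIV_SYM": 0x7,
    "SSUM_UW": 0x8, "SFLUSH_UW": 0x9, "SCOLLAPSE": 0xA, "SCLAMP": 0xB,
}
MNEMONICS = {code: name for name, code in OPCODES.items()}


def encode(op, sub=0, rd=0, rs1=0, rs2_imm8=0):
    """Pack fields as [31:28] OPCODE | [27:24] SUB | [23:16] RD | [15:8] RS1 | [7:0] RS2/IMM8."""
    if not 0 <= sub <= 0xF:
        raise ValueError("SUB must fit in 4 bits")
    for name, value in (("RD", rd), ("RS1", rs1), ("RS2/IMM8", rs2_imm8)):
        if not 0 <= value <= 0xFF:
            raise ValueError(f"{name} must fit in 8 bits")
    return (OPCODES[op] << 28) | (sub << 24) | (rd << 16) | (rs1 << 8) | rs2_imm8


def decode(word):
    """Unpack a 32-bit word into its fields; unknown opcodes raise KeyError."""
    return {
        "op": MNEMONICS[(word >> 28) & 0xF],
        "sub": (word >> 24) & 0xF,
        "rd": (word >> 16) & 0xFF,
        "rs1": (word >> 8) & 0xFF,
        "rs2_imm8": word & 0xFF,
    }


word = encode("SUADD", rd=3, rs1=1, rs2_imm8=2)
fields = decode(word)
```

For example, SUADD with RD=3, RS1=1, RS2=2 packs to 0x30030102, and decoding that word returns the same fields.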
Manifest knob: lane_compose_mode = uspace|m2_rational. Pick one for a build; do not mix.
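The rational forms follow from the tanh addition identity, tanh(x+y) = (tanh x + tanh y)/(1 + tanh x * tanh y), so the two modes agree mathematically. A short numeric check, with hypothetical function names and an assumed EPS_A, illustrates this. The results match only up to rounding, not bit for bit, which is consistent with the rule to pick one mode per build.

```python
import math

EPS_A = 1e-6  # assumed clamp margin (value not given in the source)


def clamp(a, eps_a=EPS_A):
    return max(-1.0 + eps_a, min(1.0 - eps_a, a))


def mul_uspace(a1, a2):
    """lane_compose_mode = uspace: add in u-space, map back with tanh."""
    return math.tanh(math.atanh(clamp(a1)) + math.atanh(clamp(a2)))


def mul_rational(a1, a2):
    """lane_compose_mode = m2_rational: (a1+a2)/(1+a1*a2) after clamping."""
    a1, a2 = clamp(a1), clamp(a2)
    return (a1 + a2) / (1.0 + a1 * a2)


def div_uspace(a_f, a_g):
    return math.tanh(math.atanh(clamp(a_f)) - math.atanh(clamp(a_g)))


def div_rational(a_f, a_g):
    """(a_f-a_g)/(1-a_f*a_g) after clamping; the denominator stays positive for |a| < 1."""
    a_f, a_g = clamp(a_f), clamp(a_g)
    return (a_f - a_g) / (1.0 - a_f * a_g)


pairs = [(0.3, 0.5), (-0.7, 0.2), (0.9, -0.9), (0.0, 0.0)]
max_mul_gap = max(abs(mul_uspace(a, b) - mul_rational(a, b)) for a, b in pairs)
max_div_gap = max(abs(div_uspace(a, b) - div_rational(a, b)) for a, b in pairs)
```

Both gaps are at rounding level for these inputs, so the fast path trades transcendental evaluations for one division without changing the intended result.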
7.6 — Register map (MMIO, minimal)
Numerical knobs. EPS_A, EPS_W, DENOM_SOFT_MIN, GAMMA.
Policy. DIVISION_POLICY in {strict, soft, meadow}, GATE_ENABLE, GATE_GAIN.
Bands. BAND_TBL[5] thresholds for A++, A+, A0, A-, A--.
LUT IDs. LUT_TANH_ID, LUT_ATANH_ID, with LUT_MAX_ERR_A.
SSACC. U, W, CMD in {ACCUM, FLUSH, RESET}.
Manifest. BUILD_ID, KNOBS_HASH, CONFORMANCE_CSUM.
Example offsets (byte; split across lines for readability).
0x00 EPS_A 0x04 EPS_W 0x08 DENOM_SOFT_MIN 0x0C GAMMA
0x10 DIVISION_POLICY 0x14 GATE_ENABLE 0x18 GATE_GAIN
0x20 BAND_TBL0..4 (5 * 4 bytes)
0x40 U 0x44 W 0x48 CMD 0x4C LANE_MODE
0x50 LUT_TANH_ID 0x54 LUT_ATANH_ID 0x58 LUT_MAX_ERR_A
0x60 BUILD_ID 0x70 KNOBS_HASH 0x90 CONFORMANCE_CSUM
Rule. Changing any knob or LUT invalidates bit-identical replay; publish a new manifest.
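For driver or test-bench work, the example offsets can be captured as a table and exercised against an in-memory stand-in for the device. This is a sketch only: FakeMMIO is a hypothetical test double, 32-bit little-endian register access is an assumption, and the source does not specify value formats (float or fixed-point), so the helpers move raw 32-bit words.

```python
import struct

# Byte offsets copied from the example map above.
REG = {
    "EPS_A": 0x00, "EPS_W": 0x04, "DENOM_SOFT_MIN": 0x08, "GAMMA": 0x0C,
    "DIVISION_POLICY": 0x10, "GATE_ENABLE": 0x14, "GATE_GAIN": 0x18,
    "BAND_TBL": 0x20,
    "U": 0x40, "W": 0x44, "CMD": 0x48, "LANE_MODE": 0x4C,
    "LUT_TANH_ID": 0x50, "LUT_ATANH_ID": 0x54, "LUT_MAX_ERR_A": 0x58,
    "BUILD_ID": 0x60, "KNOBS_HASH": 0x70, "CONFORMANCE_CSUM": 0x90,
}
BAND_COUNT = 5  # BAND_TBL0..4, 4 bytes each


def band_tbl_offset(i):
    """Byte offset of band-table entry i (0..4)."""
    if not 0 <= i < BAND_COUNT:
        raise IndexError("band index out of range")
    return REG["BAND_TBL"] + 4 * i


class FakeMMIO:
    """In-memory stand-in for the register file (hypothetical, for tests only)."""

    def __init__(self, size=0x100):
        self.mem = bytearray(size)

    def write32(self, offset, value):
        # Little-endian 32-bit access is an assumption of this sketch.
        struct.pack_into("<I", self.mem, offset, value & 0xFFFFFFFF)

    def read32(self, offset):
        return struct.unpack_from("<I", self.mem, offset)[0]


mmio = FakeMMIO()
mmio.write32(band_tbl_offset(2), 0x12345678)
readback = mmio.read32(band_tbl_offset(2))

# Basic sanity checks on the map: word alignment and no duplicate offsets.
all_aligned = all(off % 4 == 0 for off in REG.values())
no_duplicates = len(set(REG.values())) == len(REG)
```

A table like this also gives one place to hash when computing a knobs fingerprint, though the actual KNOBS_HASH construction is not defined here.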