SSM-AI – Appendix B — Symbolic Search Lens: Definition & Setup (B1–B5)

A single, bounded scoring lens for search — shard-proof, comparable, and audit-ready.

Purpose. Provide one published lens to rank internet/intranet/local search results with bounded, comparable scores. Classical retrieval numbers remain intact: phi((m,a)) = m. The lens turns observable features into contrasts e, maps them to alignments a, and selects with a bounded chooser RSI ∈ (-1,+1), with optional gating.

B1) Feature normalization (to [0,1])
Declare and normalize once (no PII). Keep names, weights, and lambda in the manifest.

# Hit quality (per-engine quantiles or a small logistic)
hit_quality := clamp((score - p10) / max(p90 - p10, eps), 0, 1)

# Freshness (days → [0,1] via exponential decay)
freshness := exp(-lambda * age_days)   # choose lambda s.t. 7–30 days → ~0.3–0.6

# Semantic match (cosine ∈ [-1,1] → [0,1])
semantic_match := clamp((cosine + 1) / 2, 0, 1)

# Risk penalty (aggregate)
risk_penalty := clamp(w_tox*tox + w_pii*pii + w_outl*outlier + ..., 0, 1)
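
The four normalizers above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function signatures, the `EPS` default, and the dictionary layout for risk signals are assumptions introduced here, and only the three risk terms named above are covered.

```python
import math

EPS = 1e-12  # assumed guard for a zero p90 - p10 spread (value not given in the text)


def clamp(x, lo=0.0, hi=1.0):
    """Restrict x to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))


def hit_quality(score, p10, p90, eps=EPS):
    """Quantile min-max: per-engine p10 maps to 0, p90 maps to 1."""
    return clamp((score - p10) / max(p90 - p10, eps))


def freshness(age_days, lam):
    """Exponential decay of document age; equals 1.0 at age 0."""
    return math.exp(-lam * age_days)


def semantic_match(cosine):
    """Shift cosine similarity from [-1, 1] to [0, 1]."""
    return clamp((cosine + 1.0) / 2.0)


def risk_penalty(signals, weights):
    """Weighted sum of risk signals, clamped to [0, 1].

    signals and weights are dicts keyed by signal name (assumed layout);
    a signal missing from `signals` counts as 0.
    """
    return clamp(sum(weights[k] * signals.get(k, 0.0) for k in weights))


# Illustrative inputs only, not measured data.
hq = hit_quality(score=7.0, p10=2.0, p90=12.0)
fr = freshness(age_days=14, lam=0.05)
sm = semantic_match(cosine=0.6)
rp = risk_penalty({"tox": 0.2, "outlier": 0.4},
                  {"tox": 1.0, "pii": 1.0, "outlier": 0.5})
```

Every output lies in [0,1], so the features can be combined in the lens without further rescaling.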

B2) Lens (declare once; dimensionless)

# Combined contrast (weights > 0, Unit > 0)
e := (alpha*hit_quality + beta*freshness + gamma*semantic_match - delta*risk_penalty) / Unit

# Split channels (recommended)
e_out := alpha*hit_quality + beta*freshness + gamma*semantic_match
e_in  := delta*risk_penalty
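
As a rough sketch, the lens can be written as one function that returns both the combined contrast and the split channels. The function name, argument order, and default weights (copied from the B5 manifest) are assumptions made for illustration.

```python
def lens(hit_quality, freshness, semantic_match, risk_penalty,
         alpha=1.0, beta=0.5, gamma=0.7, delta=0.8, unit=1.0):
    """Return (e, e_out, e_in) for one result.

    All four features are expected in [0, 1]; weights and Unit must be > 0.
    """
    if min(alpha, beta, gamma, delta, unit) <= 0:
        raise ValueError("alpha, beta, gamma, delta and Unit must be > 0")
    e_out = alpha * hit_quality + beta * freshness + gamma * semantic_match
    e_in = delta * risk_penalty
    e = (e_out - e_in) / unit  # combined contrast, same terms as above
    return e, e_out, e_in


# Illustrative feature values only.
e, e_out, e_in = lens(hit_quality=0.5, freshness=0.6,
                      semantic_match=0.8, risk_penalty=0.25)
```

With Unit = 1 the combined contrast is simply e_out - e_in; the split channels keep supporting and penalizing evidence separate for the pooling step.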

B3) Map → align → choose (bounded, order-invariant)

# Alignments (clamp each a into the open interval (-1, 1) before any later atanh)
a_in  := tanh(-c * e_in)
a_out := tanh(+c * e_out)

# Pool in rapidity (weights w; default w := 1)
U_in  := SUM w * atanh(a_in)
V_out := SUM w * atanh(a_out)
W_in  := SUM w

# Bounded chooser
RSI := tanh( (V_out - U_in) / max(W_in, eps_w) )

# Optional gate (calm mode)
RSI_env := g_t * RSI
# or curvature-preserving:
RSI_env := tanh( g_t * atanh(RSI) )

# Invariant (everywhere)
phi((m,a)) = m
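
The map, pool, and choose steps can be sketched in Python as below. Several details are assumptions, since this section does not fix them: the clamp margin `eps_a`, the default `eps_w`, and the layout of the pooled items as `(e_in, e_out, w)` tuples.

```python
import math


def clamp_align(a, eps_a=1e-6):
    """Keep an alignment strictly inside (-1, 1); eps_a is an assumed margin."""
    return max(-1.0 + eps_a, min(1.0 - eps_a, a))


def rsi(items, c=1.0, eps_w=1e-12):
    """Bounded chooser over an iterable of (e_in, e_out, w) tuples."""
    u_in = v_out = w_in = 0.0
    for e_in, e_out, w in items:
        a_in = clamp_align(math.tanh(-c * e_in))
        a_out = clamp_align(math.tanh(+c * e_out))
        u_in += w * math.atanh(a_in)    # pooled in rapidity space
        v_out += w * math.atanh(a_out)
        w_in += w
    return math.tanh((v_out - u_in) / max(w_in, eps_w))


def gate(rsi_value, g_t, curvature_preserving=False):
    """Apply the optional gate g_t in either of the two forms above."""
    if curvature_preserving:
        return math.tanh(g_t * math.atanh(clamp_align(rsi_value)))
    return g_t * rsi_value


# Illustrative contrasts only.
items = [(0.2, 1.36, 1.0), (0.05, 0.9, 1.0), (0.4, 0.3, 1.0)]
score = rsi(items)
calm = gate(score, g_t=0.5)
calm_curved = gate(score, g_t=0.5, curvature_preserving=True)
```

Because the sums are plain additions, reordering `items` changes the result only by floating-point rounding, and an empty pool yields 0 through the `eps_w` guard.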

B4) Federated/shard-proof pooling (meta-search)

# Pool within each engine s
U_in^s  := SUM_i w_i^s * atanh(a_in_i^s)
V_out^s := SUM_i w_i^s * atanh(a_out_i^s)
W_in^s  := SUM_i w_i^s

# Merge across engines (order/shard invariant)
U_in  := SUM_s U_in^s
V_out := SUM_s V_out^s
W_in  := SUM_s W_in^s
RSI := tanh( (V_out - U_in) / max(W_in, eps_w) )
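
A small Python check illustrates why the merge is shard-invariant: pooling each engine separately and adding the partial sums gives the same RSI as pooling everything at once. The sketch applies each weight w to the atanh terms, following the B3 pooling rule; the tuple layout, `eps_a`, and the engine names are assumptions for illustration.

```python
import math


def pool(items, c=1.0, eps_a=1e-6):
    """Return (U_in, V_out, W_in) for one engine's (e_in, e_out, w) items."""
    def rapidity(x):
        # tanh, clamp into (-1, 1), then atanh; eps_a is an assumed margin
        a = max(-1.0 + eps_a, min(1.0 - eps_a, math.tanh(x)))
        return math.atanh(a)

    u = sum(w * rapidity(-c * e_in) for e_in, _, w in items)
    v = sum(w * rapidity(+c * e_out) for _, e_out, w in items)
    w_total = sum(w for _, _, w in items)
    return u, v, w_total


def merge(partials):
    """Add per-engine (U, V, W) triples component-wise."""
    u = sum(p[0] for p in partials)
    v = sum(p[1] for p in partials)
    w = sum(p[2] for p in partials)
    return u, v, w


def rsi_from(u, v, w, eps_w=1e-12):
    return math.tanh((v - u) / max(w, eps_w))


# Illustrative contrasts for two hypothetical engines.
engine_a = [(0.2, 1.36, 1.0), (0.05, 0.9, 2.0)]
engine_b = [(0.4, 0.3, 1.0)]

merged = rsi_from(*merge([pool(engine_a), pool(engine_b)]))
swapped = rsi_from(*merge([pool(engine_b), pool(engine_a)]))
single = rsi_from(*pool(engine_a + engine_b))
```

Only the three running sums need to cross engine boundaries, so each shard can keep its raw results local.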

B5) Manifest (copy-paste block)

"ssm_search": {
  "features": {
    "normalize": {
      "hit_quality": "quantile_minmax(p10,p90)",
      "freshness": "exp_decay(lambda)",
      "semantic_match": "cosine_to_unit",
      "risk_penalty": {"tox": 1.0, "pii": 1.0, "outlier": 0.5}
    },
    "params": {"lambda": 0.05}
  },
  "lens": {
    "alpha": 1.0, "beta": 0.5, "gamma": 0.7, "delta": 0.8,
    "Unit": 1.0, "c": 1.0
  },
  "weights": {"policy": "uniform"},
  "gate_ref": "gate_preset_A"
}
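
One way to consume the manifest is to wrap the fragment in braces, parse it as JSON, and check the positivity constraints from B2. The loader below is a hypothetical sketch: `load_manifest` and its checks are not part of any published tooling.

```python
import json
import math

# The manifest block from above, held as a string.
MANIFEST_FRAGMENT = '''
"ssm_search": {
  "features": {
    "normalize": {
      "hit_quality": "quantile_minmax(p10,p90)",
      "freshness": "exp_decay(lambda)",
      "semantic_match": "cosine_to_unit",
      "risk_penalty": {"tox": 1.0, "pii": 1.0, "outlier": 0.5}
    },
    "params": {"lambda": 0.05}
  },
  "lens": {
    "alpha": 1.0, "beta": 0.5, "gamma": 0.7, "delta": 0.8,
    "Unit": 1.0, "c": 1.0
  },
  "weights": {"policy": "uniform"},
  "gate_ref": "gate_preset_A"
}
'''


def load_manifest(fragment):
    """Parse the fragment and validate the B2 constraints (weights, Unit > 0)."""
    cfg = json.loads("{" + fragment + "}")["ssm_search"]
    lens = cfg["lens"]
    bad = [k for k in ("alpha", "beta", "gamma", "delta", "Unit") if lens[k] <= 0]
    if bad:
        raise ValueError(f"lens values must be > 0: {bad}")
    if cfg["features"]["params"]["lambda"] <= 0:
        raise ValueError("lambda must be > 0")
    return cfg


cfg = load_manifest(MANIFEST_FRAGMENT)
# Half-life implied by the decay rate: ln(2) / lambda.
half_life_days = math.log(2) / cfg["features"]["params"]["lambda"]
```

With lambda = 0.05 the freshness score halves roughly every 13.9 days, which is a quick sanity check when tuning the decay rate.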

One-line takeaway. Normalize a few transparent features, build a single contrast e, map with tanh, and pick via rapidity-pooled RSI — results are bounded, comparable, shard-proof, and classical scores stay untouched by phi((m,a)) = m.
