Shunyaya Symbolic Mathematical Chemistry – Data Modes & Multi-Step Policy (4.5–4.6)

Why this page. It gives two practical ways to build the contrast e, depending on how much data you have: quick, table-free counts (data-light) or explicit bond energies (data-rich). It also gives a single, consistent rule for multi-step routes: accumulate contrast first, then assign alignments once.


Data-light mode (fast, table-free direction)

Use when: detailed bond energies aren’t available but you still want sign-correct directionality and a bounded RSI.

  • Simple contrast from counts. Pick positive per-bond averages (k_form, k_break) and compute a one-line e_hat.
  • Direction you can trust. The inequality k_form * N_formed > k_break * N_broken alone sets the sign of e_hat; magnitudes are illustrative.
  • Discipline. Fix (k_form, k_break) once per study and publish them with the counting rules.
  • Same bounded pipeline. Map to alignments with the symmetric rule, then compute RSI with guards.

Good for: rapid triage, early scoping, pedagogy, and table-sparse domains.
Caveats: not a substitute for curated bond dictionaries; keep priors/gates separate from e_hat.
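
A minimal Python sketch of the data-light steps listed above, under stated assumptions: the function names, the default values c = 1.0 and eps_a = 1e-6, and the example counts are illustrative and not prescribed by this page. The check 0 < eps_a < 1 is also an assumption, added because the clamp would otherwise change the sign.

```python
import math

def clamp_alignment(a, eps_a):
    """Keep the sign of a but cap its magnitude at 1 - eps_a."""
    sign = 1.0 if a >= 0 else -1.0
    return sign * min(abs(a), 1.0 - eps_a)

def data_light_alignments(n_formed, n_broken, k_form, k_break, c=1.0, eps_a=1e-6):
    """Return (e_hat, a_r, a_p) from unit counts and fixed positive coefficients."""
    if k_form <= 0 or k_break <= 0 or c <= 0:
        raise ValueError("k_form, k_break and c must be positive")
    if not 0 < eps_a < 1:
        raise ValueError("eps_a must lie in (0, 1)")
    # One-line contrast from counts.
    e_hat = k_form * n_formed - k_break * n_broken
    # Symmetric map to bounded alignments, then clamp.
    a_r = clamp_alignment(math.tanh(-c * e_hat), eps_a)
    a_p = clamp_alignment(math.tanh(+c * e_hat), eps_a)
    return e_hat, a_r, a_p

# Illustrative counts only: 4 units formed, 3 units broken, equal coefficients.
e_hat, a_r, a_p = data_light_alignments(4, 3, k_form=1.0, k_break=1.0)
print(e_hat, a_r, a_p)
```

Only the sign of e_hat carries weight here. The magnitudes of a_r and a_p depend on the chosen coefficients and c, so they should be read as illustrative.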


Data-rich mode (explicit bonds, reproducible)

Use when: you have a consistent bond-energy library or formation-energy scheme.

  • One library per study. Publish library name/version and any unit conversions.
  • Canonical map. Tally E_broken and E_formed, form e = (E_formed - E_broken) / E_unit, then apply the same symmetric alignment rule and RSI as everywhere else.
  • Invariants hold. The map stays bounded and collapse-safe, the sign is invariant under positive rescalings, and robustness-grid checks should show no sign flips.

Good for: benchmarks, appendix-grade worked sets, and program decisions.
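
The data-rich tally can be sketched in Python as follows. This is a hedged illustration, not a reference implementation: the dictionary B_BOND holds rough placeholder energies (kJ/mol) that do not come from any curated library, and the function name, E_unit = 1000, and the bond-by-bond counting of methane combustion are assumptions made for the example.

```python
import math

# Placeholder per-bond energies in kJ/mol. Rough illustrative numbers only,
# NOT a curated library; a real study would substitute its published library.
B_BOND = {"C-H": 413.0, "O=O": 498.0, "C=O": 799.0, "O-H": 463.0}

def clamp_alignment(a, eps_a):
    """Keep the sign of a but cap its magnitude at 1 - eps_a."""
    sign = 1.0 if a >= 0 else -1.0
    return sign * min(abs(a), 1.0 - eps_a)

def data_rich_alignments(bonds_formed, bonds_broken, b_bond, e_unit, c=1.0, eps_a=1e-6):
    """Return (e, a_r, a_p); bonds_* are lists with one label per bond."""
    if e_unit <= 0 or c <= 0 or not 0 < eps_a < 1:
        raise ValueError("need e_unit > 0, c > 0 and 0 < eps_a < 1")
    e_formed = sum(b_bond[b] for b in bonds_formed)
    e_broken = sum(b_bond[b] for b in bonds_broken)
    e = (e_formed - e_broken) / e_unit
    a_r = clamp_alignment(math.tanh(-c * e), eps_a)
    a_p = clamp_alignment(math.tanh(+c * e), eps_a)
    return e, a_r, a_p

# CH4 + 2 O2 -> CO2 + 2 H2O, counted bond by bond.
formed = ["C=O"] * 2 + ["O-H"] * 4
broken = ["C-H"] * 4 + ["O=O"] * 2
e, a_r, a_p = data_rich_alignments(formed, broken, B_BOND, e_unit=1000.0)
print(e, a_r, a_p)
```

Swapping in a different library changes only B_BOND; the alignment map and the clamp are unchanged, which is the point of keeping one library per study.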


Choosing between modes (quick guide)

  • Start data-light to validate signs and workflow;
  • Upgrade to data-rich for calibration, publication, and acceptance thresholds;
  • Never mix lenses within a study; keep the chosen lens and scales fixed and exposed in the manifest.
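
One way to keep the lens and scales fixed and exposed is a read-only manifest object. The sketch below is a suggestion only: the class name StudyManifest, its field names, and the validation rules are hypothetical and are not defined anywhere in this document.

```python
from dataclasses import FrozenInstanceError, asdict, dataclass
from typing import Optional

@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class StudyManifest:
    mode: str                        # "data-light" or "data-rich"
    lens: str                        # lens name, fixed for the whole study
    c: float
    eps_a: float
    k_form: Optional[float] = None   # data-light coefficient
    k_break: Optional[float] = None  # data-light coefficient
    library: Optional[str] = None    # data-rich: library name + version
    e_unit: Optional[float] = None   # data-rich: energy unit

    def __post_init__(self):
        if self.c <= 0 or not 0 < self.eps_a < 1:
            raise ValueError("need c > 0 and 0 < eps_a < 1")
        if self.mode == "data-light":
            ok = (self.k_form or 0) > 0 and (self.k_break or 0) > 0
        elif self.mode == "data-rich":
            ok = bool(self.library) and (self.e_unit or 0) > 0
        else:
            raise ValueError("mode must be 'data-light' or 'data-rich'")
        if not ok:
            raise ValueError(f"missing or non-positive fields for mode {self.mode!r}")

manifest = StudyManifest(mode="data-light", lens="example-lens", c=1.0,
                         eps_a=1e-6, k_form=1.0, k_break=0.8)
print(asdict(manifest))  # plain dict, ready to publish alongside results

try:
    manifest.c = 2.0     # attempts to change a fixed scale are rejected
except FrozenInstanceError:
    print("manifest is read-only")
```

Freezing the object turns "keep the scales fixed" from a convention into something the code enforces for the lifetime of the study.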

Multi-step policy (accumulate then align)

Rule of thumb: Sum contrasts in energy space across steps, then apply the alignment map once to the total. This matches rapidity additivity and preserves the Sign Lemma and monotonicity.

  • Don’t assign per-step alignments and then re-assign at route level;
  • Do keep the same lens and Unit for every step in the route;
  • Within a step, use M1/M2 consistently for substructures; the route-level contrast still accumulates in energy space.

Publish: route definition (ordered steps), lens name, Unit, c, eps_a, sources for any estimated energies.
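
The additivity claim can be checked numerically. The sketch below assumes that "rapidity" means atanh of an unclamped alignment, which is an interpretation and is not stated on this page; the per-step contrasts and c = 0.5 are illustrative values.

```python
import math

def align(e, c=1.0):
    """Unclamped product-side alignment for contrast e."""
    return math.tanh(c * e)

# Illustrative per-step contrasts, all already divided by the same Unit.
step_contrasts = [0.6, -0.2, 0.9]
c = 0.5

# Route-level alignment: sum in energy space, then map once.
a_route = align(sum(step_contrasts), c)

# Per-step alignments agree with the route value only when they are
# combined through their rapidities atanh(a_s), which add across steps.
rapidity_sum = sum(math.atanh(align(e_s, c)) for e_s in step_contrasts)
assert math.isclose(math.tanh(rapidity_sum), a_route, rel_tol=1e-9)

# Averaging per-step alignments directly is NOT equivalent.
naive = sum(align(e_s, c) for e_s in step_contrasts) / len(step_contrasts)
print(a_route, naive)
```

The second step has a negative contrast on its own, yet the route-level alignment is positive. Assigning alignments per step and then re-assigning at route level would obscure this, whereas accumulating in energy space does not.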


Plain ASCII formulas & snippets (copy-ready)

# 4.5 DATA-LIGHT (table-free)
# Count formed/broken units with fixed positive coefficients.
e_hat = k_form * N_formed  -  k_break * N_broken      # k_form > 0, k_break > 0

# Direction rule:
# if N_formed > (k_break / k_form) * N_broken  =>  e_hat > 0 (forward favored)

# Map to bounded alignments (same policy as everywhere)
a_r = tanh(-c * e_hat)
a_p = tanh(+c * e_hat)
a_r = (1 if a_r>=0 else -1) * min(abs(a_r), 1 - eps_a)
a_p = (1 if a_p>=0 else -1) * min(abs(a_p), 1 - eps_a)

# 4.5 DATA-RICH (explicit bonds)
E_broken = sum_over_bonds_broken(B_bond)
E_formed = sum_over_bonds_formed(B_bond)
e        = (E_formed - E_broken) / E_unit            # E_unit > 0

a_r = tanh(-c * e)
a_p = tanh(+c * e)
a_r = (1 if a_r>=0 else -1) * min(abs(a_r), 1 - eps_a)
a_p = (1 if a_p>=0 else -1) * min(abs(a_p), 1 - eps_a)

# 4.5 Minimal pseudocode (both modes share the same bounded map)
# DATA-LIGHT
input: N_formed, N_broken, k_form>0, k_break>0, c>0, 0<eps_a<1
e_hat := k_form * N_formed - k_break * N_broken
a_r := tanh(-c * e_hat)
a_p := tanh(+c * e_hat)
a_r := (1 if a_r>=0 else -1) * min(abs(a_r), 1 - eps_a)
a_p := (1 if a_p>=0 else -1) * min(abs(a_p), 1 - eps_a)

# DATA-RICH
input: bonds_formed[], bonds_broken[], B_bond(.), E_unit>0, c>0, 0<eps_a<1
E_formed := sum( B_bond(b) for b in bonds_formed )
E_broken := sum( B_bond(b) for b in bonds_broken )
e := (E_formed - E_broken) / E_unit
a_r := tanh(-c * e)
a_p := tanh(+c * e)
a_r := (1 if a_r>=0 else -1) * min(abs(a_r), 1 - eps_a)
a_p := (1 if a_p>=0 else -1) * min(abs(a_p), 1 - eps_a)

# Manifest (publish)
# Data-light: k_form, k_break, rationale ("per-bond averages / Unit from pilot set"), counting rules.
# Data-rich : library/scheme name+version for B_bond, E_unit, rounding policy.
# In both: keep lens fixed; run robustness grid (no sign flips).
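
A robustness grid with a no-sign-flip check might look like the sketch below. The grid axes (E_unit, c, eps_a) are an assumption, since this page does not specify what the grid varies, and the function names and example energies are hypothetical.

```python
import itertools
import math

def product_alignment(e_formed, e_broken, e_unit, c, eps_a):
    """Clamped product-side alignment a_p for one parameter setting."""
    e = (e_formed - e_broken) / e_unit
    a = math.tanh(c * e)
    return (1.0 if a >= 0 else -1.0) * min(abs(a), 1.0 - eps_a)

def no_sign_flips(e_formed, e_broken, e_units, cs, eps_as):
    """True if a_p never takes the sign opposite to (E_formed - E_broken)."""
    raw = e_formed - e_broken
    for e_unit, c, eps_a in itertools.product(e_units, cs, eps_as):
        a_p = product_alignment(e_formed, e_broken, e_unit, c, eps_a)
        if a_p * raw < 0:
            return False
    return True

# Illustrative totals and grid values only.
stable = no_sign_flips(
    3450.0, 2648.0,
    e_units=[1.0, 100.0, 1000.0],
    cs=[0.1, 1.0, 10.0],
    eps_as=[1e-9, 1e-6, 1e-3],
)
print(stable)
```

With strictly positive E_unit and c and with 0 < eps_a < 1, the sign of a_p should always follow the sign of E_formed - E_broken, so a failure here would point to a bug or an invalid parameter, not to a property of the chemistry.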

# 4.6 MULTI-STEP (accumulate then align)
# Steps s = 1..S with the same lens and Unit:
e_total = ( sum_s E_formed[s]  -  sum_s E_broken[s] ) / E_unit

a_react = tanh( -c * e_total )
a_prod  = tanh( +c * e_total )

# Clamp
a_react = (1 if a_react >= 0 else -1) * min(abs(a_react), 1 - eps_a)
a_prod  = (1 if a_prod  >= 0 else -1) * min(abs(a_prod),  1 - eps_a)

# Minimal pseudocode
input:
  steps s = 1..S
  E_formed[s], E_broken[s] for each step (same lens and Unit)
  E_unit > 0, c > 0, 0 < eps_a < 1

E_formed_sum := 0
E_broken_sum := 0
for s in 1..S:
  E_formed_sum := E_formed_sum + E_formed[s]
  E_broken_sum := E_broken_sum + E_broken[s]

e_total := (E_formed_sum - E_broken_sum) / E_unit

a_react := tanh(-c * e_total)
a_prod  := tanh(+c * e_total)

a_react := (1 if a_react >= 0 else -1) * min(abs(a_react), 1 - eps_a)
a_prod  := (1 if a_prod  >= 0 else -1) * min(abs(a_prod),  1 - eps_a)

return a_react, a_prod
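
The multi-step pseudocode above translates almost line for line into runnable Python. In this sketch the function name, the representation of steps as (E_formed, E_broken) pairs, the default values, and the three-step example are assumptions made for illustration.

```python
import math

def route_alignments(steps, e_unit, c=1.0, eps_a=1e-6):
    """Accumulate then align.

    steps: iterable of (E_formed, E_broken) pairs, one per step,
           all expressed in the same lens and Unit.
    Returns (e_total, a_react, a_prod).
    """
    if e_unit <= 0 or c <= 0 or not 0 < eps_a < 1:
        raise ValueError("need e_unit > 0, c > 0 and 0 < eps_a < 1")

    # Accumulate in energy space across all steps.
    e_formed_sum = 0.0
    e_broken_sum = 0.0
    for e_formed, e_broken in steps:
        e_formed_sum += e_formed
        e_broken_sum += e_broken
    e_total = (e_formed_sum - e_broken_sum) / e_unit

    # Apply the alignment map once to the total, then clamp.
    def clamp(a):
        return (1.0 if a >= 0 else -1.0) * min(abs(a), 1.0 - eps_a)

    a_react = clamp(math.tanh(-c * e_total))
    a_prod = clamp(math.tanh(+c * e_total))
    return e_total, a_react, a_prod

# Illustrative three-step route; the first step is unfavorable on its own.
route = [(300.0, 350.0), (500.0, 420.0), (250.0, 200.0)]
e_total, a_react, a_prod = route_alignments(route, e_unit=100.0)
print(e_total, a_react, a_prod)
```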

