Shunyaya Symbolic Mathematics — Symbolic Information Theory (2.24)

Abstract
Information theory measures uncertainty and communication efficiency. In Shunyaya Symbolic Mathematics (SSM), every datum carries both magnitude m and alignment a, so we extend entropy, mutual information, and divergence to account for stability drift. Under collapse phi(m,a) = m, all results reduce to the classical theory.


Symbolic entropy (discrete)

For a symbolic random variable X with law P over pairs (m, a):

H_s(X) = - sum_{(m,a)} P(m,a) * log P(m,a)

Decomposing via the chain rule P(m,a) = P_m(m) * P_{a|m}(a|m) gives:

H_s(X) = H(P_m) + E_m[ H(P_{a|m}(.|m)) ]

Interpretation
• H(P_m): uncertainty in magnitudes.
• E_m[ H(P_{a|m}) ]: uncertainty in alignment drift, conditioned on m.
• Collapse: if all a = +1, then H_s(X) = H(P_m) (classical Shannon entropy).

Numeric example (logs base 2).
X takes (1,+1), (1,0), (2,+1) with probabilities 0.4, 0.2, 0.4.
P_m(1) = 0.6, P_m(2) = 0.4.
For m = 1: P_{a|1}(+1) = 2/3, P_{a|1}(0) = 1/3; for m = 2: P_{a|2}(+1) = 1.
H(P_m) ≈ 0.971; H(P_{a|1}) ≈ 0.918; H(P_{a|2}) = 0.
E_m[ H(P_{a|m}) ] = 0.6*0.918 + 0.4*0 = 0.551.
So H_s(X) ≈ 1.522 bits, of which about 0.551 bits come purely from alignment variability.
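
A minimal Python sketch of this computation (illustrative helper names, not part of any SSM library):

from collections import defaultdict
from math import log2

# Joint law over (m, a) pairs from the example above.
P = {(1, +1): 0.4, (1, 0): 0.2, (2, +1): 0.4}

def H(dist):
    # Shannon entropy in bits of a probability mapping.
    return -sum(p * log2(p) for p in dist.values() if p > 0)

# Symbolic entropy computed directly on the joint law.
H_s = H(P)

# Marginal over magnitudes and conditional alignment laws.
P_m = defaultdict(float)
for (m, a), p in P.items():
    P_m[m] += p
cond = {m: {a: p / P_m[m] for (mm, a), p in P.items() if mm == m}
        for m in P_m}

H_mag = H(P_m)
H_align = sum(P_m[m] * H(cond[m]) for m in P_m)

print(round(H_s, 3))              # 1.522
print(round(H_mag + H_align, 3))  # 1.522, matching the decomposition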

Continuous note (rapidity-aware differential entropy).
Let u = atanh(a), with the standard clamp in applications. Since du/da = 1/(1 - a^2), the densities relate by

p_{m,a}(m,a) = p_{m,u}(m,u) / (1 - a^2)

Hence the differential entropies satisfy (nats):

h_s(m,a) = h(m,u) + E[ log(1 - a^2) ]

This separates geometry in u from the alignment Jacobian.
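
A quick Monte Carlo sketch of the Jacobian term, assuming for illustration a standard normal rapidity u:

import math, random

random.seed(0)
N = 100_000
total = 0.0
for _ in range(N):
    # a = tanh(u), so 1 - a^2 = sech^2(u) > 0 and the log is always defined.
    a = math.tanh(random.gauss(0.0, 1.0))
    total += math.log(1 - a * a)
print(total / N)  # negative, so here h_s(m,a) < h(m,u) in nats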


Symbolic mutual information

For symbolic X and Y:

I_s(X;Y) = H_s(X) + H_s(Y) - H_s(X,Y)

Properties
• Non-negative and symmetric.
• Collapse: if all a = +1, reduces to classical mutual information.

Numeric example (alignment-only dependence).
Let X be uniform on {(1,+1), (1,-1)}; let Y copy the alignment bit only.
H_s(X) = 1, H_s(Y) = 1, H_s(X,Y) = 1, so I_s(X;Y) = 1 bit.
Two signals with identical magnitudes can share one full bit entirely through alignment.
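
A small sketch of this example via I_s(X;Y) = H_s(X) + H_s(Y) - H_s(X,Y) (helper names are illustrative):

from collections import defaultdict
from math import log2

def H(dist):
    return -sum(p * log2(p) for p in dist.values() if p > 0)

def I_s(joint):
    # joint maps (x, y) pairs to probabilities; x and y are symbolic symbols.
    Px, Py = defaultdict(float), defaultdict(float)
    for (x, y), p in joint.items():
        Px[x] += p
        Py[y] += p
    return H(Px) + H(Py) - H(joint)

# X uniform on {(1,+1), (1,-1)}; Y copies the alignment bit.
joint = {((1, +1), +1): 0.5, ((1, -1), -1): 0.5}
print(I_s(joint))  # 1.0 bit, carried entirely by alignment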

Decomposition (discrete).

I_s(X;Y) =
  I(m_X ; m_Y)
+ E_{m_X,m_Y}[ I( a_X ; a_Y | m_X, m_Y ) ]
+ E_{m_Y}[ I( m_X ; a_Y | m_Y ) ]
+ E_{m_X}[ I( a_X ; m_Y | m_X ) ]

This identity follows from the chain rule for mutual information and makes the pure-alignment and mixed magnitude–alignment contributions explicit.
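
A sketch that checks the four terms against I_s(X;Y) on a toy joint law, computing conditional mutual information as I(A;B|C) = H(A,C) + H(B,C) - H(A,B,C) - H(C) (helper names are illustrative):

from collections import defaultdict
from math import log2

def H(dist):
    return -sum(p * log2(p) for p in dist.values() if p > 0)

def marg(joint, idx):
    # Marginal over tuple-keyed outcomes, keeping the coordinates in idx.
    out = defaultdict(float)
    for key, p in joint.items():
        out[tuple(key[i] for i in idx)] += p
    return out

def cmi(joint, A, B, C):
    # I(A;B|C) in bits; A, B, C index coordinates of the outcome tuples.
    return (H(marg(joint, A + C)) + H(marg(joint, B + C))
            - H(marg(joint, A + B + C)) - H(marg(joint, C)))

# Toy joint law over (m_X, a_X, m_Y, a_Y).
J = {(1, +1, 1, +1): 0.35, (1, -1, 1, -1): 0.35,
     (2, +1, 2, -1): 0.15, (2, -1, 2, +1): 0.15}

total = cmi(J, [0, 1], [2, 3], [])      # I_s(X;Y)
parts = (cmi(J, [0], [2], [])           # I(m_X; m_Y)
         + cmi(J, [1], [3], [0, 2])     # pure alignment term
         + cmi(J, [0], [3], [2])        # mixed m_X, a_Y
         + cmi(J, [1], [2], [0]))       # mixed a_X, m_Y
print(round(total, 6), round(parts, 6))  # equal, as the chain rule requires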


Symbolic Kullback–Leibler divergence

For distributions P and Q over symbolic space:

D_s(P || Q) = sum_{(m,a)} P(m,a) * log( P(m,a) / Q(m,a) )

Factorization (discrete):

D_s(P||Q) = D( P_m || Q_m ) + E_{m~P_m}[ D( P_{a|m} || Q_{a|m} ) ]

So discrepancy is the sum of a magnitude part and a conditional alignment part.

Numeric example (same magnitudes, different alignments; bits).
Magnitudes identical: P_m = Q_m concentrated at m = 1.
P_{a|1}(+1)=0.9, P_{a|1}(-1)=0.1; Q_{a|1}(+1)=0.5, Q_{a|1}(-1)=0.5.

D_s(P||Q) = 0.9*log2(0.9/0.5) + 0.1*log2(0.1/0.5) ≈ 0.531 bits

A pure alignment shift yields positive divergence even with identical magnitudes.
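
A sketch of this computation (illustrative helper, not an SSM API):

from math import log2

def D_s(P, Q):
    # Symbolic KL divergence in bits over a shared support.
    return sum(p * log2(p / Q[k]) for k, p in P.items() if p > 0)

# Same magnitude everywhere (m = 1); only the alignment laws differ.
P = {(1, +1): 0.9, (1, -1): 0.1}
Q = {(1, +1): 0.5, (1, -1): 0.5}
print(round(D_s(P, Q), 3))  # 0.531 bits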


Symbolic channels and capacity

Given input X and output Y through a symbolic channel, define

C_s = max_{P(X)} I_s(X;Y)

Example: symbolic binary symmetric channel (SBSC).
Alphabet {(1,+1),(1,-1)}; with probability p the channel flips alignment (a -> -a).
Capacity:

C_s = 1 - H_2(p)   (bits)

Collapse: identical to classical BSC capacity since the effective alphabet is the alignment bit.
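
A short sketch of the capacity formula across flip probabilities:

from math import log2

def H2(p):
    # Binary entropy in bits, with the 0*log(0) = 0 convention.
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

for p in (0.0, 0.1, 0.5):
    print(p, round(1 - H2(p), 3))  # 1.0, 0.531, 0.0 bits per use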

Operational note
If a physical link preserves magnitude m but sometimes corrupts alignment a (e.g., phase jitter, timing skew), the capacity loss is fully captured by the alignment flip probability p. Classical amplitude metrics may report “perfect” transmission even while information is lost to alignment corruption.

Fano-style bound for alignment decoding (binary; p_e is the probability of a decoding error).

H( a | observation ) <= H_2(p_e)
⇒ I( a ; observation ) >= 1 - H_2(p_e)   when H(a)=1 bit
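
A numeric sketch of the bound for an assumed decoding error rate p_e = 0.05:

from math import log2

def H2(p):
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

p_e = 0.05
print(round(1 - H2(p_e), 3))  # 0.714 bits of alignment guaranteed recoverable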


Data-processing inequality (symbolic)

If X -> Y -> Z forms a Markov chain through symbolic channels, then

I_s(X;Z) <= I_s(X;Y)

Post-processing cannot increase information about either magnitude or alignment.
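
A sketch illustrating the inequality on an assumed toy chain of two alignment-flip channels with uniform input; the end-to-end flip probability is p(1-q) + (1-p)q:

from math import log2

def H2(p):
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

p, q = 0.1, 0.2
I_xy = 1 - H2(p)                          # I_s(X;Y), uniform alignment input
I_xz = 1 - H2(p * (1 - q) + (1 - p) * q)  # I_s(X;Z), composed flips
print(round(I_xy, 3), round(I_xz, 3))     # 0.531 >= 0.173, consistent with the DPI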


Rate–distortion (alignment-aware sketch)

For lossy coding of X = (m,a), pick a distortion such as

d( (m,a), (m_hat, a_hat) ) =
  (m - m_hat)^2 + lambda * ( atanh(a) - atanh(a_hat) )^2,  lambda >= 0

The rapidity embedding turns the alignment term into a Euclidean quadratic in u = atanh(a). Classical rate–distortion machinery then applies to the joint space; lambda tunes how costly alignment errors are.
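
A sketch of this distortion, with the clamp taken from the document defaults (lam stands in for lambda):

from math import atanh

CLAMP_EPS = 1e-6  # document default clamp_eps

def rapidity(a):
    # Clamp keeps atanh finite as |a| -> 1.
    a = max(-1 + CLAMP_EPS, min(1 - CLAMP_EPS, a))
    return atanh(a)

def distortion(m, a, m_hat, a_hat, lam=1.0):
    # Squared magnitude error plus lam-weighted squared rapidity gap.
    return (m - m_hat) ** 2 + lam * (rapidity(a) - rapidity(a_hat)) ** 2

print(round(distortion(1.0, 0.9, 1.0, 0.5, lam=2.0), 3))  # 1.704, pure alignment cost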


Collapse consistency

If all alignments equal +1, then:
• H_s, I_s, and D_s reduce to their classical counterparts on magnitudes.
• Channel capacity and rate–distortion reduce exactly to the classical formulas.
Communication and inference become centre-aware when alignment varies, while remaining collapse-safe.


Navigation
Previous → Symbolic Probability and Measure (2.23)
Next → Symbolic Fourier and Spectral Analysis


Disclaimer
Observation only. Results reproduce mathematically; domain claims require independent peer review. Defaults: mult_mode = M2, clamp_eps = 1e-6, |a| < 1 with rapidity u = atanh(a) where needed. All formulas are presented in plain text. Collapse uses phi(m,a) = m.