Abstract
Information theory measures uncertainty and communication efficiency. In Shunyaya Symbolic Mathematics (SSM), every datum carries both magnitude m and alignment a, so we extend entropy, mutual information, and divergence to account for stability drift. Under collapse phi(m,a) = m, all results reduce to the classical theory.
Symbolic entropy (discrete)
For a symbolic random variable X with law P over pairs (m, a):
H_s(X) = - sum_{(m,a)} P(m,a) * log P(m,a)
Decompose by P(m,a) = P_m(m) * P_{a|m}(a|m):
H_s(X) = H(P_m) + E_m[ H(P_{a|m}(.|m)) ]
Interpretation
• H(P_m): uncertainty in magnitudes.
• E_m[ H(P_{a|m}) ]: uncertainty in alignment drift, conditioned on m.
Collapse: if all a = +1, then H_s(X) = H(P_m) (classical Shannon entropy).
Numeric example (logs base 2). X takes (1,+1), (1,0), (2,+1) with probabilities 0.4, 0.2, 0.4. P_m(1) = 0.6, P_m(2) = 0.4.
For m = 1: P_{a|1}(+1) = 2/3, P_{a|1}(0) = 1/3; for m = 2: P_{a|2}(+1) = 1.
H(P_m) ≈ 0.971; H(P_{a|1}) ≈ 0.918; H(P_{a|2}) = 0. E_m[ H(P_{a|m}) ] = 0.6*0.918 + 0.4*0 = 0.551.
So H_s(X) ≈ 1.522 bits → about 0.551 bits come purely from alignment variability.
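Worked check (Python sketch, logs base 2). The snippet below simply re-derives the numbers above from the joint law; variable names are illustrative only.

from collections import defaultdict
from math import log2

P = {(1, +1): 0.4, (1, 0): 0.2, (2, +1): 0.4}   # joint law P(m, a)

def entropy(dist):
    return -sum(p * log2(p) for p in dist.values() if p > 0)

H_s = entropy(P)                       # joint symbolic entropy

P_m = defaultdict(float)               # magnitude marginal P_m
for (m, a), p in P.items():
    P_m[m] += p
H_mag = entropy(P_m)                   # H(P_m)

H_align = 0.0                          # E_m[ H(P_{a|m}) ]
for m, pm in P_m.items():
    cond = {a: p / pm for (mm, a), p in P.items() if mm == m}
    H_align += pm * entropy(cond)

print(H_s, H_mag, H_align)             # about 1.522, 0.971, 0.551 bits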
Continuous note (rapidity-aware differential entropy).
Let u = atanh(a), applying the standard clamp on |a| in applications. Densities relate by
p_{m,a}(m,a) = p_{m,u}(m,u) / (1 - a^2)
Hence the differential entropies satisfy (nats):
h_s(m,a) = h(m,u) + E[ log(1 - a^2) ]
This separates geometry in u from the alignment Jacobian.
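Numerical check (Python sketch, nats). Assuming, purely for illustration, an alignment whose rapidity u is Gaussian with sigma = 0.5 and no magnitude dependence, the relation reduces to h(a) = h(u) + E[ log(1 - a^2) ]; the quadrature below compares the two sides.

import numpy as np
from scipy.integrate import quad

sigma = 0.5  # illustrative rapidity spread (assumption)
p_u = lambda u: np.exp(-u**2 / (2 * sigma**2)) / (np.sqrt(2 * np.pi) * sigma)

# Left side: h(a) integrated directly in a-coordinates, using p_a(a) = p_u(atanh(a)) / (1 - a^2).
def neg_p_log_p(a):
    pa = p_u(np.arctanh(a)) / (1.0 - a**2)
    return -pa * np.log(pa)

A = np.tanh(10.0)  # tails beyond |u| = 10 are negligible for this sigma
h_a, _ = quad(neg_p_log_p, -A, A)

# Right side: Gaussian differential entropy of u plus the alignment Jacobian term.
h_u = 0.5 * np.log(2 * np.pi * np.e * sigma**2)
jac, _ = quad(lambda u: p_u(u) * np.log(1.0 - np.tanh(u)**2), -10.0, 10.0)

print(h_a, h_u + jac)  # the two printed values should agree to quadrature accuracy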
Symbolic mutual information
For symbolic X and Y:
I_s(X;Y) = H_s(X) + H_s(Y) - H_s(X,Y)
Properties
• Non-negative and symmetric.
• Collapse: if all a = +1, I_s reduces to classical mutual information.
Numeric example (alignment-only dependence).
Let X be uniform on {(1,+1), (1,-1)}; let Y copy the alignment bit only.
H_s(X) = 1, H_s(Y) = 1, H_s(X,Y) = 1 ⇒ I_s(X;Y) = 1 bit.
Two signals with identical magnitudes can share one full bit entirely through alignment.
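Worked check (Python sketch, bits). A minimal reproduction of the alignment-only example; Y is represented here by the copied alignment value alone.

from collections import defaultdict
from math import log2

# Joint law over (X, Y): X = (m, a), Y = the copied alignment bit.
joint = {((1, +1), +1): 0.5, ((1, -1), -1): 0.5}

def entropy(dist):
    return -sum(p * log2(p) for p in dist.values() if p > 0)

def marginal(dist, pick):
    out = defaultdict(float)
    for key, p in dist.items():
        out[pick(key)] += p
    return out

H_X = entropy(marginal(joint, lambda xy: xy[0]))
H_Y = entropy(marginal(joint, lambda xy: xy[1]))
H_XY = entropy(joint)
print(H_X + H_Y - H_XY)   # 1.0 bit, shared entirely through alignment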
Decomposition (discrete).
I_s(X;Y) =
I(m_X ; m_Y)
+ E_{m_X,m_Y}[ I( a_X ; a_Y | m_X, m_Y ) ]
+ E_{m_Y}[ I( m_X ; a_Y | m_Y ) ]
+ E_{m_X}[ I( a_X ; m_Y | m_X ) ]
This makes explicit pure-alignment and mixed magnitude–alignment contributions.
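Worked check (Python sketch, bits). The snippet below verifies the four-term chain-rule split numerically on a small toy joint law; the probabilities are hypothetical and chosen only for illustration.

from collections import defaultdict
from math import log2

# Toy joint law over ((m_x, a_x), (m_y, a_y)); probabilities are hypothetical.
joint = {
    ((1, +1), (1, +1)): 0.30,
    ((1, -1), (1, -1)): 0.20,
    ((2, +1), (2, +1)): 0.25,
    ((2, +1), (2, -1)): 0.10,
    ((2, -1), (1, +1)): 0.15,
}

def marginal(dist, f):
    out = defaultdict(float)
    for key, p in dist.items():
        out[f(key)] += p
    return out

def mi(dist, fx, fy):
    # I(fx ; fy) under dist, in bits.
    pxy = marginal(dist, lambda k: (fx(k), fy(k)))
    px, py = marginal(dist, fx), marginal(dist, fy)
    return sum(p * log2(p / (px[x] * py[y])) for (x, y), p in pxy.items())

def cond_mi(dist, fx, fy, fz):
    # I(fx ; fy | fz) = sum_z P(z) * I(fx ; fy) under dist conditioned on z.
    total = 0.0
    for z, pz in marginal(dist, fz).items():
        cond = {k: p / pz for k, p in dist.items() if fz(k) == z}
        total += pz * mi(cond, fx, fy)
    return total

mx = lambda k: k[0][0]
ax = lambda k: k[0][1]
my = lambda k: k[1][0]
ay = lambda k: k[1][1]

lhs = mi(joint, lambda k: k[0], lambda k: k[1])              # I_s(X;Y)
rhs = (mi(joint, mx, my)                                     # magnitude-magnitude
       + cond_mi(joint, ax, ay, lambda k: (mx(k), my(k)))    # pure alignment
       + cond_mi(joint, mx, ay, my)                          # mixed terms
       + cond_mi(joint, ax, my, mx))
print(lhs, rhs)   # the two values agree up to floating-point error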
Symbolic Kullback–Leibler divergence
For distributions P and Q over symbolic space:
D_s(P || Q) = sum_{(m,a)} P(m,a) * log( P(m,a) / Q(m,a) )
Factorization (discrete):
D_s(P||Q) = D( P_m || Q_m ) + E_{m~P_m}[ D( P_{a|m} || Q_{a|m} ) ]
So discrepancy is the sum of a magnitude part and a conditional alignment part.
Numeric example (same magnitudes, different alignments; bits).
Magnitudes identical: P_m = Q_m concentrated at m = 1.
P_{a|1}(+1)=0.9, P_{a|1}(-1)=0.1; Q_{a|1}(+1)=0.5, Q_{a|1}(-1)=0.5.
D_s(P||Q) = 0.9*log2(0.9/0.5) + 0.1*log2(0.1/0.5) ≈ 0.531 bits
A pure alignment shift yields positive divergence even with identical magnitudes.
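Worked check (Python sketch, bits). Reproducing the divergence above directly from the definition; since P_m = Q_m, the value equals the conditional alignment term alone.

from math import log2

# P and Q over (m, a); both put all magnitude mass at m = 1.
P = {(1, +1): 0.9, (1, -1): 0.1}
Q = {(1, +1): 0.5, (1, -1): 0.5}

D_s = sum(p * log2(p / Q[x]) for x, p in P.items() if p > 0)
print(D_s)   # about 0.531 bits, a pure alignment discrepancy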
Symbolic channels and capacity
Given input X and output Y through a symbolic channel, define
C_s = max_{P(X)} I_s(X;Y)
Example: symbolic binary symmetric channel (SBSC).
Alphabet {(1,+1),(1,-1)}; with probability p the channel flips alignment (a -> -a).
Capacity:
C_s = 1 - H_2(p) (bits)
Collapse: identical to classical BSC capacity since the effective alphabet is the alignment bit.
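Worked check (Python sketch, bits). A minimal evaluation of C_s = 1 - H_2(p) for a few illustrative flip probabilities.

from math import log2

def H2(p):
    # Binary entropy in bits.
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

for p in (0.0, 0.05, 0.11, 0.5):
    print(p, 1 - H2(p))   # e.g. p = 0.11 gives roughly 0.5 bits per use; p = 0.5 gives 0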
Operational note
If a physical link preserves magnitude m but sometimes corrupts alignment a (e.g., phase jitter, timing skew), the capacity loss is fully captured by the alignment flip probability p. Classical amplitude metrics may report “perfect” transmission even though information is lost through alignment corruption.
Fano-style bound for alignment decoding (binary).
H( a | observation ) <= H_2(p_e)
⇒ I( a ; observation ) >= 1 - H_2(p_e) when H(a)=1 bit
Data-processing inequality (symbolic)
If X -> Y -> Z is a Markov chain through any symbolic channel, then
I_s(X;Z) <= I_s(X;Y)
Post-processing cannot increase information about either magnitude or alignment.
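Worked check (Python sketch, bits). Cascading two independent alignment-flip channels (illustrative flip probabilities p1, p2) and confirming that the second hop cannot add information about X.

from math import log2

def H2(p):
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

def flip_channel_mi(p):
    # I_s for a uniform alignment bit through a flip channel with flip probability p.
    return 1 - H2(p)

p1, p2 = 0.1, 0.2                                # illustrative flip probabilities
p_end_to_end = p1 * (1 - p2) + (1 - p1) * p2     # effective X -> Z flip probability
print(flip_channel_mi(p1), flip_channel_mi(p_end_to_end))   # I_s(X;Y) >= I_s(X;Z)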
Rate–distortion (alignment-aware sketch)
For lossy coding of X = (m,a), pick a distortion such as
d( (m,a), (m_hat, a_hat) ) =
(m - m_hat)^2 + lambda * ( atanh(a) - atanh(a_hat) )^2, lambda >= 0
The rapidity embedding turns the alignment term into a Euclidean quadratic in u = atanh(a). Classical rate–distortion machinery then applies to the joint space; lambda tunes how costly alignment errors are.
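Implementation note (Python sketch). A minimal version of the distortion above; the default lam = 1.0 is purely illustrative, and the clamp follows the defaults listed in the Disclaimer so that atanh stays finite.

from math import atanh

def symbolic_distortion(m, a, m_hat, a_hat, lam=1.0, clamp_eps=1e-6):
    # Quadratic magnitude error plus lambda-weighted quadratic rapidity error.
    clamp = lambda x: max(-1 + clamp_eps, min(1 - clamp_eps, x))
    u, u_hat = atanh(clamp(a)), atanh(clamp(a_hat))
    return (m - m_hat) ** 2 + lam * (u - u_hat) ** 2

print(symbolic_distortion(1.0, 0.9, 1.0, 0.5))   # magnitude exact; only alignment error remains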
Collapse consistency
If all alignments equal +1, then
• H_s, I_s, and D_s reduce to their classical counterparts on magnitudes.
• Channel capacity and rate–distortion reduce exactly to the classical formulas.
Communication and inference become centre-aware when alignment varies, while remaining collapse-safe.
Navigation
Previous → Symbolic Probability and Measure (2.23)
Next → Symbolic Fourier and Spectral Analysis
Disclaimer
Observation only. Results reproduce mathematically; domain claims require independent peer review. Defaults: mult_mode = M2, clamp_eps = 1e-6, |a| < 1 with rapidity u = atanh(a) where needed. All formulas are presented in plain text. Collapse uses phi(m,a) = m.