SSM-JTK – Data & Calibration — OLS target, BIC, event-aware loss, selection (2.7)

Time anchor (reminder)
Use the midpoint of the training window as the time anchor.
t0 = midpoint(train_start, train_stop) ; t = days_since(date, t0)
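
As a rough illustration of the anchor above, the following Python sketch computes t0 and t for daily data. The function names mirror the pseudocode; rounding the midpoint down to a whole day and the example dates are assumptions, not part of the specification.

```python
from datetime import date, timedelta

def midpoint(train_start: date, train_stop: date) -> date:
    """Midpoint of the training window (assumption: rounded down to a whole day)."""
    half_span_days = (train_stop - train_start).days // 2
    return train_start + timedelta(days=half_span_days)

def days_since(d: date, t0: date) -> int:
    """Signed day offset from the anchor; negative before t0."""
    return (d - t0).days

# Illustrative window only.
t0 = midpoint(date(2000, 1, 1), date(2000, 1, 11))
t_first = days_since(date(2000, 1, 1), t0)
t_last = days_since(date(2000, 1, 11), t0)
```

With this anchor, t is roughly symmetric about zero over the training window.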

Unwrapped OLS targets
Fit on the continuous series (unwrap first; wrap only for display).

fixed-n: y(t) = L_actual_unwrapped(t) − n*t
free-n : y(t) = L_actual_unwrapped(t)
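
A minimal sketch of the two targets, assuming longitudes in degrees and a rate n in degrees per day. The helper names unwrap_deg and ols_target are hypothetical, and the sample values are illustrative only.

```python
import numpy as np

def unwrap_deg(lon_deg):
    """Remove 360-degree wrap jumps so the longitude series is continuous."""
    lon = np.asarray(lon_deg, dtype=float)
    return np.degrees(np.unwrap(np.radians(lon)))

def ols_target(L_unwrapped, t, n=None):
    """fixed-n: subtract the known drift n*t; free-n (n=None): use the series as is."""
    L_unwrapped = np.asarray(L_unwrapped, dtype=float)
    if n is None:
        return L_unwrapped
    return L_unwrapped - n * np.asarray(t, dtype=float)

# Illustrative data: a series that wraps from 355 to 0 degrees.
wrapped = [350.0, 355.0, 0.0, 5.0, 10.0]
t = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
L = unwrap_deg(wrapped)
y_fixed = ols_target(L, t, n=5.0)  # assumed rate of 5 deg/day for the example
y_free = ols_target(L, t)
```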

Regressor design for a candidate Ω = {w_k}

fixed-n: X(t) = [ 1 , sin(w1*t), cos(w1*t), ... , sin(wm*t), cos(wm*t) ]
free-n : X(t) = [ 1 , t , sin(w1*t), cos(w1*t), ... , sin(wm*t), cos(wm*t) ]
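
The two designs differ only in the extra t column. A possible NumPy construction is sketched below; the function name design_matrix and the free_n flag are hypothetical, and the frequencies in the example are placeholders.

```python
import numpy as np

def design_matrix(t, omegas, free_n=False):
    """Columns: 1, (t if free_n), then sin(w*t), cos(w*t) for each w in omegas."""
    t = np.asarray(t, dtype=float)
    cols = [np.ones_like(t)]
    if free_n:
        cols.append(t)  # linear drift term, estimated instead of fixed
    for w in omegas:
        cols.append(np.sin(w * t))
        cols.append(np.cos(w * t))
    return np.column_stack(cols)

# Illustrative grid and frequencies only.
t = np.arange(-3.0, 4.0)
X_fixed = design_matrix(t, [0.1, 0.2])
X_free = design_matrix(t, [0.1, 0.2], free_n=True)
```

For m frequencies this gives k = 2m + 1 columns in the fixed-n case and k = 2m + 2 in the free-n case, which is the k used in the BIC.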

Estimation (OLS + BIC with clamp)
beta_hat = argmin_beta || y − X*beta ||_2^2
RSS = || y − X*beta_hat ||_2^2 ; k = ncols(X) ; N = nrows(X)
BIC = k*log(N) + N*log( max(RSS/N, 1e-16) ) (lower is better; the clamp avoids log(0) on a near-exact fit)
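
The estimation step can be sketched with NumPy's least-squares solver. The function name fit_ols_bic is hypothetical, and the data in the example are chosen only so the result is easy to check by hand.

```python
import numpy as np

def fit_ols_bic(X, y, rss_floor=1e-16):
    """OLS fit plus the clamped BIC defined above."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta_hat
    rss = float(resid @ resid)
    n_rows, k = X.shape
    bic = k * np.log(n_rows) + n_rows * np.log(max(rss / n_rows, rss_floor))
    return beta_hat, rss, float(bic)

# Intercept-only example: the best constant is 0, so RSS = 4 and RSS/N = 1.
X = np.ones((4, 1))
y = np.array([1.0, -1.0, 1.0, -1.0])
beta_hat, rss, bic = fit_ols_bic(X, y)
```

Here the second BIC term vanishes because log(1) = 0, leaving BIC = 1*log(4).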

Event-aware training loss (degree still dominant)
Let L_hat_unwrapped be the model on the unwrapped series.

Speed (central differences): v_hat = | d/dt L_hat_unwrapped | ; v_act = | d/dt L_actual_unwrapped |
Speed MAE: mae_v = mean( | v_hat − v_act | )
Degree MAE on train: MAE_deg_train
Cusp MAE on train (ASCII): define cusp_dist_deg(x) = min( (x % 30) , 30 − (x % 30) ), then cusp_MAE_train = mean( | cusp_dist_deg(L_hat) − cusp_dist_deg(L_actual) | ).
Combined loss:
loss = 1.0*MAE_deg_train + 0.3*cusp_MAE_train + 0.4*mae_v
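
A sketch of the combined loss under stated assumptions: both series are unwrapped, in degrees, on a uniform grid; the degree MAE is taken as the mean absolute difference of the unwrapped series (the text does not spell this out); and np.gradient stands in for the central-difference derivative, since it uses central differences in the interior and one-sided differences at the ends. Function names other than cusp_dist_deg are hypothetical.

```python
import numpy as np

def cusp_dist_deg(x):
    """Distance in degrees to the nearest multiple of 30."""
    r = np.mod(np.asarray(x, dtype=float), 30.0)
    return np.minimum(r, 30.0 - r)

def speed(L_unwrapped, dt=1.0):
    """Absolute rate of change via central differences (one-sided at the ends)."""
    return np.abs(np.gradient(np.asarray(L_unwrapped, dtype=float), dt))

def event_aware_loss(L_hat, L_act, dt=1.0):
    L_hat = np.asarray(L_hat, dtype=float)
    L_act = np.asarray(L_act, dtype=float)
    mae_deg = np.mean(np.abs(L_hat - L_act))  # assumed definition of MAE_deg_train
    cusp_mae = np.mean(np.abs(cusp_dist_deg(L_hat) - cusp_dist_deg(L_act)))
    mae_v = np.mean(np.abs(speed(L_hat, dt) - speed(L_act, dt)))
    return 1.0 * mae_deg + 0.3 * cusp_mae + 0.4 * mae_v

# Illustrative check: a constant 1-degree offset leaves the speed term at zero.
L_act = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
loss = event_aware_loss(L_act + 1.0, L_act)
```

In this example the degree and cusp terms are both 1 and the speed term is 0, so the loss is 1.0 + 0.3 = 1.3.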

Admissibility gate (strict parsimony)
ADMISSIBLE iff (BIC_EXTRA ≤ BIC_BASE − 6.0) and (loss_EXTRA < loss_BASE).

Selection rule
Among the admissible models, pick the one with the smallest loss, breaking ties by the smaller BIC; if no model is admissible, fall back to BASE. Keep t0 and the time convention fixed across comparisons.
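
The gate and the selection rule can be sketched together as follows. The Candidate container, the function names, and the loss/BIC values are hypothetical and only illustrate the ordering.

```python
from typing import NamedTuple, Sequence

class Candidate(NamedTuple):
    name: str
    loss: float
    bic: float

def is_admissible(extra: Candidate, base: Candidate, bic_margin: float = 6.0) -> bool:
    """EXTRA must beat BASE by at least bic_margin in BIC and strictly in loss."""
    return extra.bic <= base.bic - bic_margin and extra.loss < base.loss

def select(base: Candidate, extras: Sequence[Candidate]) -> Candidate:
    """Smallest (loss, then BIC) among admissible extras; otherwise BASE."""
    admissible = [c for c in extras if is_admissible(c, base)]
    if not admissible:
        return base
    return min(admissible, key=lambda c: (c.loss, c.bic))

# Made-up numbers for illustration.
base = Candidate("BASE", loss=1.00, bic=100.0)
extras = [
    Candidate("A", loss=0.80, bic=95.0),  # BIC gain of 5 is below the margin
    Candidate("B", loss=0.90, bic=90.0),
    Candidate("C", loss=0.85, bic=93.0),
]
chosen = select(base, extras)
```

Candidate A has the lowest loss but is rejected by the BIC margin, so the choice falls to C, the admissible model with the smallest loss.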

Notes & tips

Use the same train/test windows and sampling mode as declared in §2.5–2.6; no leakage.
Compute derivatives via central differences (Δt=1 on daily grids; one-sided at ends).
Treat ΔBIC = BIC_BASE − BIC_EXTRA ≥ 6 as a decisive improvement; otherwise reject extras.
Maintain the midpoint anchor to reduce phase coupling and edge bias.

Navigation
Back: SSM-JTK – Data & Calibration — Kernel families & carriers (2.6)
Next: SSM-JTK – Data & Calibration — Calibration pseudocode (end-to-end) (2.7A)