Train/Test windows (disjoint spans)
Declare two non-overlapping spans and keep the same daily timestamp and sampling mode across both.
train_window = [train_start, train_stop] (e.g., 2020-01-01..2024-12-31)
test_window = [test_start, test_stop] (e.g., 2015-01-01..2019-12-31)
- No leakage. Do not mix rows across windows.
- Same clock. Use the same daily timestamp (e.g., 05:30 IST or 00:00 UTC) for all data.
- Same schedule. If you trained on daily, evaluate on daily for fair comparisons.
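The disjointness rule can be checked mechanically before any fitting. The following is a minimal sketch, assuming both windows are closed date ranges; the helper names `windows_are_disjoint` and `assign_window` are hypothetical and not part of any SSM-JTK tooling.

```python
from datetime import date

def windows_are_disjoint(train_window, test_window):
    """Return True if two closed [start, stop] date spans share no day."""
    train_start, train_stop = train_window
    test_start, test_stop = test_window
    return train_stop < test_start or test_stop < train_start

def assign_window(day, train_window, test_window):
    """Label a day as 'train', 'test', or None (outside both spans)."""
    if train_window[0] <= day <= train_window[1]:
        return "train"
    if test_window[0] <= day <= test_window[1]:
        return "test"
    return None

# Example spans taken from the text above.
train_window = (date(2020, 1, 1), date(2024, 12, 31))
test_window = (date(2015, 1, 1), date(2019, 12, 31))

if not windows_are_disjoint(train_window, test_window):
    raise ValueError("train and test windows overlap: leakage risk")
```

Because the windows are verified as disjoint first, `assign_window` can never place a row in both sets.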
Midpoint anchor (numerical stability)
Anchor time at the midpoint of the training span and index all days as offsets from that anchor.
t0 = midpoint(train_start, train_stop)
t = days_since(date, t0)
A simple midpoint is: midpoint(a,b) = a + 0.5*(b − a) (in days).
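As an illustration, the two helpers can be sketched with the standard library. This assumes timestamps are timezone-aware datetimes at 00:00 UTC (one of the example clocks mentioned earlier); the function bodies are one possible reading of `midpoint` and `days_since`, not a reference implementation.

```python
from datetime import datetime, timezone

def midpoint(a, b):
    """Midpoint of two datetimes: a + 0.5*(b - a)."""
    return a + 0.5 * (b - a)

def days_since(moment, t0):
    """Signed offset of `moment` from the anchor t0, in days."""
    return (moment - t0).total_seconds() / 86400.0

# Example training span from the text, stamped at 00:00 UTC.
train_start = datetime(2020, 1, 1, tzinfo=timezone.utc)
train_stop = datetime(2024, 12, 31, tzinfo=timezone.utc)

t0 = midpoint(train_start, train_stop)
t_first = days_since(train_start, t0)  # negative: before the anchor
t_last = days_since(train_stop, t0)    # positive: after the anchor
```

For this span the anchor falls on 2022-07-02 and the offsets run from -913 to +913 days, so t is balanced around zero.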
Why midpoint?
- Reduces phase coupling between small harmonics and the linear trend.
- Mitigates edge bias in OLS when the span is long.
- Improves conditioning (lower correlation between a0 and n in free-n fits).
- Keeps t values balanced around zero, which helps both fit and BIC selection.
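The conditioning claim can be illustrated on the linear part of the model alone. The sketch below assumes a plain OLS fit of y = a0 + n*t with independent, equal-variance noise and ignores the harmonic terms; under that assumption the correlation between the estimated a0 and n is -sum(t) / sqrt(N * sum(t^2)), which vanishes when t is centered. The function name is hypothetical.

```python
import math

def intercept_slope_correlation(t_values):
    """Correlation between OLS estimates of a0 and n in y = a0 + n*t.

    Derived from the off-diagonal of (X^T X)^-1 for X = [1, t],
    assuming independent, equal-variance noise.
    """
    n_pts = len(t_values)
    sum_t = sum(t_values)
    sum_t2 = sum(t * t for t in t_values)
    return -sum_t / math.sqrt(n_pts * sum_t2)

# Daily samples over an illustrative span of 1827 days.
n_days = 1827
t_from_start = [float(i) for i in range(n_days)]   # anchor at the first day
half_span = (n_days - 1) / 2.0
t_from_mid = [t - half_span for t in t_from_start]  # anchor at the midpoint

corr_start = intercept_slope_correlation(t_from_start)
corr_mid = intercept_slope_correlation(t_from_mid)
```

With the anchor at the first day the two estimates are strongly anti-correlated (close to -0.87 for long spans), while the midpoint anchor drives the correlation to zero.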
Practical tips
- Slow outers. Prefer longer benches for training; keep the test window disjoint but representative.
- Inners & retro loops. Ensure the train window includes at least one clean retro cycle.
- Freeze t0. When you adjust the train span or sampling, keep the same t0 if you want apples-to-apples comparisons; otherwise re-derive t0 and document it.
- Reproducibility. Record train_range, test_range, sampling mode, and t0 in your report so anyone can regenerate your metrics.
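One way to keep that record is a small JSON block stored next to the metrics. This is only a sketch: the key names follow the fields listed above, while `build_run_record`, the `sampling_mode` key, and the ISO date layout are assumptions.

```python
import json

def build_run_record(train_range, test_range, sampling_mode, t0):
    """Serialize the settings needed to regenerate a run's metrics."""
    record = {
        "train_range": list(train_range),
        "test_range": list(test_range),
        "sampling_mode": sampling_mode,
        "t0": t0,
    }
    return json.dumps(record, indent=2, sort_keys=True)

# Values below reuse the example windows from this section.
report_block = build_run_record(
    train_range=("2020-01-01", "2024-12-31"),
    test_range=("2015-01-01", "2019-12-31"),
    sampling_mode="daily",
    t0="2022-07-02T00:00:00+00:00",
)
```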
Evaluator reminder (for either family)
L_hat_deg(date) = wrap360( a0_deg + n_deg_per_day*t + sum_k[ c_k*sin(w_k*t) + d_k*cos(w_k*t) ] )
with t = days_since(date, t0).
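The evaluator formula translates almost line for line into code. In this sketch the units are assumptions: w_k is taken in radians per day, and c_k and d_k in degrees. The coefficient values in the example are purely illustrative, not fitted results.

```python
import math

def wrap360(angle_deg):
    """Wrap an angle in degrees into [0, 360)."""
    wrapped = angle_deg % 360.0
    # Guard against float rounding that can yield exactly 360.0
    # for tiny negative inputs.
    return 0.0 if wrapped == 360.0 else wrapped

def l_hat_deg(t, a0_deg, n_deg_per_day, harmonics=()):
    """Evaluate the model at offset t (days since t0).

    harmonics: iterable of (w_k, c_k, d_k) tuples, with w_k assumed
    to be in radians per day and c_k, d_k in degrees.
    """
    total = a0_deg + n_deg_per_day * t
    for w_k, c_k, d_k in harmonics:
        total += c_k * math.sin(w_k * t) + d_k * math.cos(w_k * t)
    return wrap360(total)

# Illustrative coefficients only.
example_harmonics = [(2.0 * math.pi / 365.25, 2.0, 0.5)]
value_at_anchor = l_hat_deg(0.0, 100.0, 1.0, example_harmonics)
value_linear_only = l_hat_deg(300.0, 100.0, 1.0)
```

At t = 0 the sine terms drop out, so the result is a0_deg plus the sum of the d_k terms, which makes the anchor day a convenient sanity check.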