
cross_validation

Sampling, Validation and ML Diagnostics

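Labels built over overlapping time windows violate the IID assumption behind standard k-fold cross-validation. Purging training observations whose label windows overlap the test set, and embargoing a short period around it, reduces leakage bias.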

$$\mathcal{T}_{train}=\mathcal{T}\setminus(\mathcal{T}_{test}\oplus e)$$

$$e=\lfloor p\cdot T\rfloor$$
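As a rough illustration of the two formulas above, the sketch below computes the embargo length and the resulting training indices for a single contiguous test block. It assumes that p is the embargo fraction, that T is the total number of observations, and that the embargo is applied only to the observations immediately after the test block. The function names `embargo_len` and `train_indices` are hypothetical and are not part of openquant.

```rust
/// Embargo length e = floor(p * T).
/// Assumption: `p` is the embargo fraction, `t` the number of observations.
fn embargo_len(p: f64, t: usize) -> usize {
    (p * t as f64).floor() as usize
}

/// Training indices for a contiguous test block [test_start, test_end).
/// Every index in 0..t is kept except the test block itself and the `e`
/// observations that immediately follow it (assumed embargo placement).
fn train_indices(t: usize, test_start: usize, test_end: usize, e: usize) -> Vec<usize> {
    // Clamp the embargo so it never runs past the end of the sample.
    let embargo_end = (test_end + e).min(t);
    (0..t)
        .filter(|&i| i < test_start || i >= embargo_end)
        .collect()
}
```

For example, with T = 10 and p = 0.25 the embargo is 2 observations, so a test block covering indices 4 and 5 also removes indices 6 and 7 from the training set.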

use openquant::cross_validation::PurgedKFold;
let cv = PurgedKFold::new(5, 0.01);
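To illustrate what purging removes, here is a standalone sketch that does not call the openquant API. It assumes each observation carries the start and end time of the window over which its label was formed, and it drops every observation whose window overlaps the test window. The `Event` type and the `purge` function are hypothetical names introduced only for this example; the library's internal logic may differ.

```rust
/// One labelled observation: the label is formed over [start, end].
/// Hypothetical type, used only for this illustration.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Event {
    start: i64,
    end: i64,
}

/// Return the indices of events whose label window does not overlap
/// the closed test window [test_start, test_end].
fn purge(events: &[Event], test_start: i64, test_end: i64) -> Vec<usize> {
    events
        .iter()
        .enumerate()
        // Keep an event only if it ends before the test window starts
        // or starts after the test window ends.
        .filter(|(_, ev)| ev.end < test_start || ev.start > test_end)
        .map(|(i, _)| i)
        .collect()
}
```

An event that starts before the test window but ends inside it is removed even though its start time looks safe, which is why the end-times matter.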
  • ml_cross_val_score
  • ml_get_train_times
  • PurgedKFold
  • Scoring
  • Always align event end-times with the sample index when purging, so that overlapping labels are detected.
  • Report the variance of the score across folds, not only its mean.
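The last note can be sketched as a small helper that summarizes per-fold scores. It assumes the scores are already available as a slice of `f64` and uses the sample variance (n - 1 denominator). `fold_stats` is a hypothetical name, not an openquant function.

```rust
/// Mean and sample variance (n - 1 denominator) of per-fold scores.
/// Returns None when fewer than two scores are given, since the
/// sample variance is undefined in that case.
fn fold_stats(scores: &[f64]) -> Option<(f64, f64)> {
    let n = scores.len();
    if n < 2 {
        return None;
    }
    let mean = scores.iter().sum::<f64>() / n as f64;
    let var = scores
        .iter()
        .map(|s| (s - mean).powi(2))
        .sum::<f64>()
        / (n as f64 - 1.0);
    Some((mean, var))
}
```

Reporting both numbers makes it visible when a good mean score is driven by one or two favorable folds.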