By AFML Chapter
Chapter-to-Module Mapping
Chapter 2: Financial Data Structures
Information-driven bars replace fixed-time sampling with activity-based aggregation, producing returns that are closer to IID normal. CUSUM and z-score filters extract structurally meaningful events from the bar stream.
- data_structures: bar construction (dollar, volume, tick, imbalance, run bars). Entry point for raw trade data.
- filters: CUSUM and z-score event filters
- etf_trick: synthetic ETF and futures roll utilities
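To make the event-filter idea concrete, here is a minimal sketch of a symmetric CUSUM filter in plain Python. It does not use the filters module; the function name `cusum_filter`, its signature, and the sample prices are hypothetical and chosen only for illustration.

```python
def cusum_filter(values, threshold):
    """Return indices where a symmetric CUSUM of first differences crosses the threshold.

    Hypothetical helper for illustration; not the filters module's API.
    """
    events = []
    s_pos = 0.0  # running sum of upward drift, floored at zero
    s_neg = 0.0  # running sum of downward drift, capped at zero
    for i in range(1, len(values)):
        diff = values[i] - values[i - 1]
        s_pos = max(0.0, s_pos + diff)
        s_neg = min(0.0, s_neg + diff)
        if s_pos > threshold:
            s_pos = 0.0  # reset after emitting an upward event
            events.append(i)
        elif s_neg < -threshold:
            s_neg = 0.0  # reset after emitting a downward event
            events.append(i)
    return events


prices = [100.0, 100.5, 101.2, 101.0, 99.8, 99.0, 99.1, 100.9]
events = cusum_filter(prices, threshold=1.0)
print(events)  # [2, 4, 7]
```

An event fires only when cumulative drift since the last reset exceeds the threshold, so quiet stretches produce no samples.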
Chapter 3: Labeling
Triple-barrier labeling converts filtered events into ML labels with controlled profit-taking, stop-loss, and holding-period barriers. Meta-labeling separates direction from sizing.
- labeling: triple-barrier and meta-labeling. Entry point for label generation.
- bet_sizing: probability-to-position-size conversion
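The triple-barrier rule can be sketched for a single long-side event as follows. This is an illustrative sketch, not the labeling module's interface: `triple_barrier_label`, its parameters, and the fixed-return barriers are assumptions (the book scales barriers by estimated volatility).

```python
def triple_barrier_label(prices, start, profit_take, stop_loss, max_holding):
    """Label one event by whichever barrier is touched first.

    Returns (label, index): +1 for the upper barrier, -1 for the lower
    barrier, 0 if the vertical (time) barrier is reached first.
    Hypothetical helper; barriers are plain fractional returns.
    """
    entry = prices[start]
    end = min(start + max_holding, len(prices) - 1)
    for t in range(start + 1, end + 1):
        ret = prices[t] / entry - 1.0
        if ret >= profit_take:
            return 1, t
        if ret <= -stop_loss:
            return -1, t
    return 0, end


prices = [100.0, 101.0, 103.0, 99.0, 96.0, 100.0]
label_up = triple_barrier_label(prices, 0, 0.02, 0.02, 3)
label_down = triple_barrier_label(prices, 2, 0.02, 0.02, 3)
label_flat = triple_barrier_label([100.0, 100.5, 100.2, 100.1, 100.3], 0, 0.02, 0.02, 3)
print(label_up, label_down, label_flat)  # (1, 2) (-1, 3) (0, 3)
```

The returned index marks where the label's information ends, which is what the overlap-aware sampling and purging steps later rely on.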
Chapter 4: Sample Weights and Uniqueness
Overlapping labels violate IID assumptions. Sequential bootstrap and uniqueness weighting correct for this by measuring and accounting for label overlap.
- sampling: indicator matrix and sequential bootstrap. Entry point for overlap-aware sampling.
- sample_weights: uniqueness and time-decay weighting
- sb_bagging: sequentially bootstrapped bagging ensembles
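As a rough illustration of how label overlap is measured, the sketch below computes each label's average uniqueness from its bar span. The function name and the `(start, end)` inclusive-span representation are assumptions for this example, not the sampling module's data model.

```python
def average_uniqueness(spans, n_bars):
    """Average uniqueness per label, given inclusive (start, end) bar spans.

    Hypothetical helper: at each bar, uniqueness is 1 / (number of labels
    active at that bar); a label's score is the mean over its own span.
    """
    concurrency = [0] * n_bars
    for start, end in spans:
        for t in range(start, end + 1):
            concurrency[t] += 1
    scores = []
    for start, end in spans:
        inverse = [1.0 / concurrency[t] for t in range(start, end + 1)]
        scores.append(sum(inverse) / len(inverse))
    return scores


spans = [(0, 2), (2, 3), (4, 5)]  # labels 0 and 1 share bar 2
uniqueness = average_uniqueness(spans, n_bars=6)
print(uniqueness)  # approximately [0.833, 0.75, 1.0]
```

A score of 1.0 means the label shares no bars with any other label; lower scores indicate redundancy that sample weighting or sequential bootstrap would down-weight.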
Chapter 5: Fractional Differentiation
Fractional differencing finds the minimum transformation order that achieves stationarity while preserving predictive long-memory in price series.
- fracdiff: FFD and standard fractional differencing. Entry point.
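A minimal sketch of fixed-width-window fractional differencing (FFD) is shown below, assuming the standard recursive binomial weights. The function names, the default threshold, and the `max_len` cap are illustrative choices, not the fracdiff module's API.

```python
def ffd_weights(d, threshold=1e-4, max_len=10_000):
    """Binomial weights for order d, truncated once |w_k| drops below threshold.

    Uses the recursion w_0 = 1, w_k = -w_{k-1} * (d - k + 1) / k.
    Hypothetical helper for illustration.
    """
    weights = [1.0]
    k = 1
    while k < max_len:
        w = -weights[-1] * (d - k + 1) / k
        if abs(w) < threshold:
            break
        weights.append(w)
        k += 1
    return weights


def frac_diff_ffd(series, d, threshold=1e-4):
    """Apply the truncated weights as a fixed-width rolling dot product."""
    w = ffd_weights(d, threshold)
    width = len(w)
    return [
        sum(w[k] * series[t - k] for k in range(width))
        for t in range(width - 1, len(series))
    ]


w_half = ffd_weights(0.5)
first_diff = frac_diff_ffd([1.0, 3.0, 6.0, 10.0], d=1.0)
print(w_half[:3])   # [1.0, -0.5, -0.125]
print(first_diff)   # [2.0, 3.0, 4.0]
```

With d = 1 the weights collapse to [1, -1], which recovers ordinary first differences; fractional orders keep a slowly decaying tail of past values, which is the retained memory.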
Chapter 6: Ensemble Methods
Bias-variance decomposition and bagging diagnostics determine whether bagging or boosting improves ensemble quality under financial label structure.
- ensemble_methods: bias/variance diagnostics, aggregation, bagging-vs-boosting recommendation
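One diagnostic behind the bagging decision is the variance of an averaged ensemble, which depends on the average correlation between estimators. The sketch below evaluates the textbook expression; the function name and arguments are hypothetical and unrelated to the ensemble_methods module's interface.

```python
def bagged_variance(avg_variance, avg_correlation, n_estimators):
    """Variance of the mean of n equally weighted estimators.

    Assumes each estimator has variance avg_variance and every pair has
    correlation avg_correlation. Hypothetical helper for illustration.
    """
    return avg_variance * (avg_correlation + (1.0 - avg_correlation) / n_estimators)


independent = bagged_variance(1.0, 0.0, 10)   # uncorrelated estimators
redundant = bagged_variance(1.0, 1.0, 100)    # perfectly correlated estimators
print(independent, redundant)  # 0.1 1.0
```

When estimators are highly correlated, as tends to happen with overlapping financial labels, adding more of them reduces variance very little.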
Chapter 7: Cross Validation in Finance
Standard k-fold CV leaks information through overlapping labels. Purged k-fold with embargo removes this leakage source.
- cross_validation: purged k-fold CV with embargo. Entry point.
- backtesting_engine: walk-forward and CPCV validation
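The purge-and-embargo logic can be sketched as a small generator. This is an illustration under stated assumptions, not the cross_validation module's API: `purged_kfold` is a hypothetical name, samples are assumed sorted by start time, each label is an inclusive `(t0, t1)` interval, and the embargo is expressed in the same time units rather than as a fraction of observations.

```python
def purged_kfold(spans, n_splits, embargo=0):
    """Yield (train_idx, test_idx) with purging and embargo applied.

    spans[i] = (t0, t1) is the inclusive label interval of sample i.
    Hypothetical helper; folds are contiguous blocks of samples.
    """
    n = len(spans)
    sizes = [n // n_splits + (1 if i < n % n_splits else 0) for i in range(n_splits)]
    start = 0
    for size in sizes:
        stop = start + size
        test_idx = list(range(start, stop))
        test_t0 = min(spans[i][0] for i in test_idx)
        test_t1 = max(spans[i][1] for i in test_idx)
        train_idx = []
        for i in range(n):
            if start <= i < stop:
                continue
            t0, t1 = spans[i]
            # Purge: label interval overlaps the test window.
            overlaps = t0 <= test_t1 and t1 >= test_t0
            # Embargo: label starts shortly after the test window ends.
            embargoed = test_t1 < t0 <= test_t1 + embargo
            if not overlaps and not embargoed:
                train_idx.append(i)
        yield train_idx, test_idx
        start = stop


spans = [(i, i + 2) for i in range(10)]  # each label covers three bars
folds = list(purged_kfold(spans, n_splits=5, embargo=1))
print(folds[0])  # ([5, 6, 7, 8, 9], [0, 1])
print(folds[2])  # ([0, 1, 9], [4, 5])
```

In the third fold, samples 2, 3, 6, and 7 are purged for overlapping the test window, and sample 8 is dropped by the embargo.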
Chapter 8: Feature Importance
Multiple importance methods (MDI, MDA, SFI) are needed to detect substitution effects and unstable features before deploying models.
- feature_importance: MDI, MDA, single-feature importance (Rust)
- fingerprint: model fingerprinting for partial and pairwise effects
Chapter 9: Hyper-parameter Tuning
Tuning must use purged CV to avoid leakage-inflated scores. Randomized search is preferred for large parameter spaces.
- hyperparameter_tuning: grid/randomized search with purged CV scoring
Chapters 10-12: Position Sizing and Robust Backtesting
Backtesting is a scenario sanity check, not a performance estimator. CPCV provides path distributions instead of point estimates.
- bet_sizing: dynamic and reserve sizing for execution
- backtesting_engine: walk-forward, purged CV, and CPCV. Entry point for validation.
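To see why CPCV yields a distribution of paths, it helps to count the splits. The sketch below enumerates test-group combinations and computes the number of complete backtest paths; the function names are hypothetical and the purging step is omitted for brevity.

```python
from itertools import combinations
from math import comb


def cpcv_splits(n_groups, n_test_groups):
    """All ways to pick which groups serve as test sets in one split."""
    return list(combinations(range(n_groups), n_test_groups))


def cpcv_path_count(n_groups, n_test_groups):
    """Number of full backtest paths: (k / N) * C(N, k)."""
    return comb(n_groups, n_test_groups) * n_test_groups // n_groups


splits = cpcv_splits(6, 2)
paths = cpcv_path_count(6, 2)
print(len(splits), paths)  # 15 5
```

With 6 groups and 2 test groups per split, every group is tested 5 times, and those test results recombine into 5 paths, each covering the full history.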
Chapter 13: Synthetic Backtesting
Selecting trading rules on a single historical path overfits. Synthetic path ensembles from calibrated O-U processes test rule robustness.
- synthetic_backtesting: O-U calibration, PT/SL mesh search, stability diagnostics
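A discrete O-U process can be simulated and calibrated in a few lines. The sketch below assumes the form P_t = (1 - phi) * mean + phi * P_{t-1} + sigma * eps and estimates all three parameters by ordinary least squares on lagged prices; the function names and this estimation choice are assumptions, not the synthetic_backtesting module's method.

```python
import random


def simulate_ou(p0, mean, phi, sigma, n_steps, rng):
    """Simulate P_t = (1 - phi) * mean + phi * P_{t-1} + sigma * eps_t."""
    path = [p0]
    for _ in range(n_steps):
        shock = sigma * rng.gauss(0.0, 1.0)
        path.append((1.0 - phi) * mean + phi * path[-1] + shock)
    return path


def calibrate_ou(path):
    """Estimate (mean, phi, sigma) by regressing P_t on P_{t-1}.

    Hypothetical helper; requires at least three lagged pairs and phi != 1.
    """
    x, y = path[:-1], path[1:]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    phi = sxy / sxx
    intercept = my - phi * mx
    mean = intercept / (1.0 - phi)
    residuals = [b - intercept - phi * a for a, b in zip(x, y)]
    sigma = (sum(r * r for r in residuals) / (n - 2)) ** 0.5
    return mean, phi, sigma


# Noise-free path: calibration should recover the inputs almost exactly.
clean_path = simulate_ou(110.0, 100.0, 0.5, 0.0, 10, random.Random(0))
est_mean, est_phi, est_sigma = calibrate_ou(clean_path)

# Noisy ensemble for testing a trading rule across many synthetic paths.
rng = random.Random(42)
ensemble = [simulate_ou(100.0, 100.0, 0.9, 1.0, 50, rng) for _ in range(100)]
```

Once parameters are calibrated, a profit-taking/stop-loss rule can be evaluated over the whole ensemble rather than on the single realized history.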
Chapters 14-15: Diagnostics and Strategy Risk
Strategy risk (probability of failing a Sharpe target) is distinct from portfolio risk (VaR, ES, drawdown).
- backtest_statistics: Sharpe, deflated Sharpe, drawdown, holding period
- risk_metrics: VaR, Expected Shortfall, Conditional Drawdown Risk
- strategy_risk: strategy failure probability diagnostics
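For the symmetric-payout case, the link between precision, bet frequency, and Sharpe ratio has a closed form, which the sketch below evaluates. The function names are hypothetical and the symmetric-payout assumption is a simplification; this is not the strategy_risk module's API.

```python
from math import sqrt


def sharpe_symmetric(precision, bets_per_year):
    """Annualized Sharpe for bets that win or lose the same amount."""
    p = precision
    return (2.0 * p - 1.0) / (2.0 * sqrt(p * (1.0 - p))) * sqrt(bets_per_year)


def implied_precision(target_sharpe, bets_per_year):
    """Precision needed to reach target_sharpe under symmetric payouts."""
    ratio = bets_per_year / (target_sharpe ** 2 + bets_per_year)
    return 0.5 * (1.0 + sqrt(1.0 - ratio))


needed = implied_precision(2.0, 52)          # weekly bets, Sharpe target of 2
achieved = sharpe_symmetric(needed, 52)      # round trip back to the target
print(round(needed, 4), round(achieved, 4))  # 0.6336 2.0
```

Strategy risk then asks how likely the realized precision is to fall below this required level, which is a property of the strategy rather than of the portfolio's holdings.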
Chapter 16: Portfolio Construction
Hierarchical methods (HRP, HCAA) avoid covariance inversion fragility. CLA solves constrained mean-variance problems exactly.
- hrp: Hierarchical Risk Parity
- hcaa: Hierarchical Clustering Asset Allocation
- onc: Optimal Number of Clusters
- portfolio_optimization: mean-variance allocators (min-vol, max-Sharpe, efficient risk)
- cla: Critical Line Algorithm
Chapters 17-19: Microstructure, Dependence, and Regime Detection
Microstructure features capture liquidity and order-flow dynamics invisible in OHLC bars. Structural break detection flags regime changes that invalidate model assumptions.
- structural_breaks: Chow, CUSUM variants, SADF bubble detection
- microstructural_features: Kyle/Amihud impact, VPIN, entropy
- codependence: distance correlation, mutual information, variation of information
Chapters 20-22: HPC and Advanced
Atom/molecule parallelism scales independent computations. Streaming analytics maintain bounded-memory indicators for real-time early warning.
- hpc_parallel: partition and dispatch utilities
- combinatorial_optimization: integer optimization with exact baselines
- streaming_hpc: VPIN/HHI streaming with bounded-memory state
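The atom/molecule pattern amounts to splitting a list of independent tasks (atoms) into contiguous chunks (molecules) and dispatching each chunk to a worker. The sketch below shows one way to do this with the standard library; all names are hypothetical, and a thread pool is used only to keep the example self-contained (CPU-bound work would normally use processes).

```python
from concurrent.futures import ThreadPoolExecutor


def linear_partitions(n_atoms, n_molecules):
    """Split range(n_atoms) into near-equal contiguous [start, stop) chunks."""
    n_molecules = max(1, min(n_molecules, n_atoms))
    bounds = [(i * n_atoms) // n_molecules for i in range(n_molecules + 1)]
    return [(bounds[i], bounds[i + 1]) for i in range(n_molecules)]


def process_molecule(atoms):
    """Stand-in for per-chunk work; here it just squares each atom."""
    return [a * a for a in atoms]


def run_in_molecules(atoms, n_molecules, worker, max_workers=4):
    """Dispatch each molecule to a worker and concatenate results in order."""
    parts = linear_partitions(len(atoms), n_molecules)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        chunks = pool.map(lambda p: worker(atoms[p[0]:p[1]]), parts)
        return [x for chunk in chunks for x in chunk]


atoms = list(range(10))
parts = linear_partitions(len(atoms), 3)
squares = run_in_molecules(atoms, 3, process_molecule)
print(parts)  # [(0, 3), (3, 6), (6, 10)]
```

Because `pool.map` returns results in submission order, the concatenated output matches what a sequential loop would produce.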
See the full per-module detail in the Module Reference Index.