data_structures
Constructs standard/time/run/imbalance bars from trade streams.
Each module has a dedicated page covering its mathematical definitions, API surface, and implementation-oriented code examples.
| Section | AFML focus |
|---|---|
| Event-Driven Data and Labeling | Event bars, filters, labeling, and sample weighting. |
| Sampling, Validation and ML Diagnostics | Leakage-aware validation, bootstrap, and model diagnostics. |
| Position Sizing and Trade Construction | Sizing transforms, execution-aware construction, synthetic ETF and futures roll utilities. |
| Portfolio Construction and Risk | Optimization, clustering allocation, and risk overlays. |
| Market Microstructure, Dependence and Regime Detection | Microstructure estimators, dependence metrics, structural breaks. |
CUSUM and z-score event filters for event-driven sampling.
Triple-barrier event labeling and metadata generation.
Sample weighting utilities for overlapping event structure.
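As an illustration of the CUSUM event filter listed above, here is a minimal sketch of a symmetric CUSUM filter in plain Python. The function name `cusum_filter`, its arguments, and the use of raw price differences are assumptions made for this example; they are not taken from the library's API.

```python
def cusum_filter(prices, threshold):
    """Return the indices at which cumulative drift exceeds `threshold`.

    Hypothetical helper for illustration only. Two running sums track
    upward and downward drift; each is reset after it triggers an event.
    """
    events = []
    s_pos = 0.0  # cumulative upward drift, floored at zero
    s_neg = 0.0  # cumulative downward drift, capped at zero
    for i in range(1, len(prices)):
        diff = prices[i] - prices[i - 1]
        s_pos = max(0.0, s_pos + diff)
        s_neg = min(0.0, s_neg + diff)
        if s_pos > threshold:
            s_pos = 0.0
            events.append(i)
        elif s_neg < -threshold:
            s_neg = 0.0
            events.append(i)
    return events


prices = [100.0, 100.5, 101.2, 101.0, 99.0, 98.5, 100.9]
events = cusum_filter(prices, threshold=1.0)
print(events)  # [2, 4, 6]
```

Sampling only at these indices concentrates observations where the price has actually moved, which is the motivation for event-driven sampling.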
Purged cross-validation utilities designed for label overlap and leakage control.
Moment-based fitting utilities for a mixture of two normal distributions.
Feature ranking methods: MDI, MDA, and single-feature importance with PCA diagnostics.
Model fingerprinting for linear, non-linear, and pairwise feature effects.
Indicator matrix and sequential bootstrap tooling.
Sequentially bootstrapped bagging classifiers/regressors.
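The idea behind purged cross-validation, listed above, can be sketched as follows. This is a simplified illustration under stated assumptions: each sample carries a label span `(start, end)`, the test fold is treated as one contiguous window, and no embargo period is applied. The name `purged_train_indices` is hypothetical and does not come from the library.

```python
def purged_train_indices(label_spans, test_idx):
    """Return training indices whose label spans do not overlap the test window.

    Hypothetical helper: `label_spans[i]` is the (start, end) interval over
    which sample i's label is determined. Embargo is deliberately omitted.
    """
    test_set = set(test_idx)
    test_start = min(label_spans[i][0] for i in test_idx)
    test_end = max(label_spans[i][1] for i in test_idx)
    train = []
    for i, (start, end) in enumerate(label_spans):
        if i in test_set:
            continue
        # Closed intervals overlap when each starts before the other ends.
        overlaps = start <= test_end and end >= test_start
        if not overlaps:
            train.append(i)
    return train


spans = [(0, 2), (1, 3), (3, 5), (4, 6), (6, 8), (9, 10)]
train = purged_train_indices(spans, test_idx=[2, 3])
print(train)  # [0, 5]
```

Samples 1 and 4 are dropped because their label spans touch the test window `[3, 6]`, so keeping them would leak test information into training.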
Transforms model confidence and constraints into executable position sizes.
Synthetic ETF and futures roll utilities for realistic PnL path construction.
Performance diagnostics for strategy returns and position trajectories.
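To make the position-sizing entry above concrete, here is a minimal sketch of one common way to map a classifier's predicted probability to a signed bet size for a binary outcome: convert the probability to a z-statistic and pass it through the standard normal CDF. The function name `bet_size_from_probability` is an assumption for this example, and the library's actual transform and constraint handling may differ.

```python
import math


def bet_size_from_probability(p):
    """Map a predicted probability in (0, 1) to a bet size in (-1, 1).

    Hypothetical helper: z = (p - 0.5) / sqrt(p * (1 - p)), then
    size = 2 * Phi(z) - 1, where Phi is the standard normal CDF.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    z = (p - 0.5) / math.sqrt(p * (1.0 - p))
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return 2.0 * cdf - 1.0


for p in (0.5, 0.6, 0.9):
    print(p, round(bet_size_from_probability(p), 4))
```

A probability of 0.5 yields a zero position, and the size grows toward 1 as confidence increases; probabilities below 0.5 give a negative (short) size.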
Critical Line Algorithm implementation for constrained mean-variance optimization.
Hierarchical Clustering Asset Allocation variant with cluster-level constraints.
Hierarchical Risk Parity allocation with recursive bisection.
Optimal Number of Clusters utilities for clustering stability and allocation workflows.
Mean-variance and constrained allocation methods with ergonomic APIs.
Portfolio and return-distribution risk measures for downside control.
Dependence metrics beyond linear correlation for feature and asset relationships.
Fractional differentiation to improve stationarity while retaining memory.
Price-impact, spread, entropy, and flow toxicity estimators.
Regime change and bubble diagnostics (Chow, CUSUM variants, SADF).
Fast EWMA primitive shared across feature and volatility routines.
Volatility estimators used across labeling and risk workflows.
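For orientation, the EWMA primitive mentioned above can be sketched with the usual recursive definition. This sketch assumes a span-based smoothing factor `alpha = 2 / (span + 1)` and seeds the average with the first observation; the library's own parameterization and initialization are not specified here, and the name `ewma` is hypothetical.

```python
def ewma(values, span):
    """Recursive exponentially weighted moving average.

    Hypothetical helper: y[0] = x[0], then
    y[t] = alpha * x[t] + (1 - alpha) * y[t - 1], with alpha = 2 / (span + 1).
    """
    if span < 1:
        raise ValueError("span must be at least 1")
    alpha = 2.0 / (span + 1.0)
    out = []
    prev = None
    for x in values:
        prev = x if prev is None else alpha * x + (1.0 - alpha) * prev
        out.append(prev)
    return out


smoothed = ewma([1.0, 2.0, 3.0], span=3)
print(smoothed)  # [1.0, 1.5, 2.25]
```

Applying the same recursion to squared returns and taking the square root gives a simple EWMA volatility estimate, which is one way such a primitive can be shared between feature and volatility routines.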