Walk-Forward Analysis
Walk-forward analysis is the closest thing systematic trading has to a held-out test set. You train on a window of historical data, test on the period immediately after, then roll the window forward and repeat. The distribution of out-of-sample performance tells you whether your strategy generalises.
The core idea
A backtest that fits perfectly to its full history tells you almost nothing about whether the strategy will work in the future. The parameters may have been chosen, even unconsciously, to exploit patterns that no longer exist. Walk-forward testing separates the optimisation period (in-sample, IS) from the evaluation period (out-of-sample, OOS) in a systematic way.
The simplest version: divide your data into a training window and a testing window. Optimise parameters on the training data, then freeze them and evaluate on the test data. The out-of-sample Sharpe is a less biased estimate of live performance than the in-sample Sharpe, because the test data played no part in choosing the parameters.
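As a minimal sketch of that single split (not Kestrel Signal's implementation), the following assumes a hypothetical `strategy_returns(data, param)` callable that returns per-bar strategy returns, a zero risk-free rate, and 252 bars per year. The names `sharpe` and `optimise_then_evaluate` are illustrative only.

```python
import math
import statistics


def sharpe(returns, periods_per_year=252):
    """Annualised Sharpe ratio of per-bar returns (risk-free rate assumed zero)."""
    if len(returns) < 2:
        raise ValueError("need at least two returns")
    sd = statistics.stdev(returns)
    if sd == 0:
        return 0.0  # assumption: treat a flat return stream as Sharpe 0
    return statistics.mean(returns) / sd * math.sqrt(periods_per_year)


def optimise_then_evaluate(data, candidates, is_bars, strategy_returns):
    """Pick the best parameter in-sample, then score it unchanged out-of-sample."""
    train, test = data[:is_bars], data[is_bars:]
    # Optimisation sees only the training slice.
    best = max(candidates, key=lambda p: sharpe(strategy_returns(train, p)))
    is_sharpe = sharpe(strategy_returns(train, best))
    # The chosen parameter is frozen before the test slice is touched.
    oos_sharpe = sharpe(strategy_returns(test, best))
    return best, is_sharpe, oos_sharpe
```

Because `best` is the maximum over all candidates, the in-sample Sharpe is inflated by selection. The out-of-sample figure does not share that inflation, since the test slice had no influence on the choice.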
Rolling vs anchor modes
Kestrel Signal supports two window modes:
Rolling: Both the in-sample start and end advance together. Each window covers the same number of bars. This is appropriate when you believe the market regime changes over time and older data is less relevant.
Anchor: The in-sample window always starts from the beginning of the data and grows with each step, while the out-of-sample window advances. Useful when the strategy uses long-term levels or when you want to maximise the amount of training data. The drawback: later training windows are larger than earlier ones, so the windows are not fitted on equal footing and the oldest data keeps influencing every optimisation.
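The two modes differ only in where each in-sample window starts. A sketch of the index arithmetic, assuming half-open `[start, end)` bar ranges and a step equal to the OOS length (Kestrel Signal's actual step size may differ). `walk_forward_windows` is a hypothetical helper, not part of the product's API.

```python
def walk_forward_windows(n_bars, is_bars, oos_bars, mode="rolling"):
    """Return a list of (is_start, is_end, oos_start, oos_end) half-open ranges."""
    if mode not in ("rolling", "anchor"):
        raise ValueError("mode must be 'rolling' or 'anchor'")
    if is_bars <= 0 or oos_bars <= 0:
        raise ValueError("window lengths must be positive")

    windows = []
    is_end = is_bars
    # Keep stepping while a full OOS window still fits in the data.
    while is_end + oos_bars <= n_bars:
        # Anchor mode pins the start at bar 0; rolling keeps the length fixed.
        is_start = 0 if mode == "anchor" else is_end - is_bars
        windows.append((is_start, is_end, is_end, is_end + oos_bars))
        is_end += oos_bars  # assumption: advance by one OOS length
    return windows
```

With 10 bars, a 4-bar IS window and a 2-bar OOS window, rolling mode gives IS ranges `(0, 4)`, `(2, 6)`, `(4, 8)`, while anchor mode gives `(0, 4)`, `(0, 6)`, `(0, 8)`. The OOS ranges are identical in both modes.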
Walk-Forward Efficiency
Walk-Forward Efficiency (WFE) is the ratio of out-of-sample to in-sample Sharpe ratio, clamped to [−1, 2] to avoid outlier distortion from near-zero IS denominators.
A WFE close to 1 means the strategy performs about as well out-of-sample as in-sample: the parameters generalise. A WFE below 0.5 suggests meaningful overfitting, with the strategy capturing in-sample patterns that don't persist. A negative WFE means the OOS Sharpe has the opposite sign to the IS Sharpe; for a strategy that was profitable in-sample, its average out-of-sample return was negative.
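The definition above can be sketched in a few lines. The clamp bounds come from the text; returning `None` when the IS Sharpe is effectively zero is an assumption of this sketch, as is the `eps` threshold, and neither is confirmed behaviour of Kestrel Signal.

```python
def walk_forward_efficiency(is_sharpe, oos_sharpe, lo=-1.0, hi=2.0, eps=1e-9):
    """Ratio of OOS to IS Sharpe, clamped to [lo, hi]."""
    if abs(is_sharpe) < eps:
        return None  # assumption: the ratio is undefined for a zero IS Sharpe
    ratio = oos_sharpe / is_sharpe
    # Clamp so that a tiny IS denominator cannot produce an extreme value.
    return max(lo, min(hi, ratio))


print(walk_forward_efficiency(2.0, 1.0))    # 0.5
print(walk_forward_efficiency(0.01, 1.0))   # 2.0 (raw ratio 100, clamped)
print(walk_forward_efficiency(1.0, -3.0))   # -1.0 (raw ratio -3, clamped)
```

The ratio is only interpretable when the IS Sharpe is positive: two negative Sharpe ratios also produce a positive WFE, so it is worth checking the sign of the IS figure alongside the ratio.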
How many windows?
Walk-forward analysis is most informative when you have enough windows to form a distribution. Three windows give you almost no statistical power. Ten or more windows begin to be meaningful. With 5 years of daily data (about 1,260 bars) and 252-bar IS / 63-bar OOS windows, you get about 16 windows if each step advances by one OOS length, since (1,260 − 252) / 63 = 16. That is a reasonable sample.
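The window count in that example can be checked with a little arithmetic. The sketch below assumes 252 trading days per year and a step equal to the OOS length; both are assumptions, and `count_windows` is an illustrative name. Under them the count comes out at 16, consistent with the figure quoted above.

```python
def count_windows(n_bars, is_bars, oos_bars):
    """Number of walk-forward windows when stepping by one OOS length."""
    if n_bars < is_bars + oos_bars:
        return 0  # not even one full IS + OOS pair fits
    # One window for the first IS + OOS pair, plus one per further full step.
    return (n_bars - is_bars - oos_bars) // oos_bars + 1


bars = 5 * 252  # assumption: 252 trading days per year
print(count_windows(bars, 252, 63))  # 16
```

The same function shows how quickly the sample shrinks: two years of daily data with the same window sizes leaves only four windows.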
OOS windows that are too short produce noisy performance estimates; OOS windows that are too long waste data that could have been used for fitting. A conventional IS:OOS ratio for daily strategies is 3:1 to 5:1; the 252/63 example above is 4:1.
What walk-forward cannot do
In its classical form, walk-forward analysis re-optimises strategy parameters within each IS window before applying them to the following OOS window. Kestrel Signal's current implementation instead holds parameters fixed across all windows, so it tests parameter robustness, not adaptive optimisation. This is a stricter test in some ways: parameters that survive unchanged across changing regimes are more trustworthy than those that are repeatedly refitted.
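One consequence of fixed parameters is that nothing is refitted per window, so the strategy's return stream can be computed once and simply sliced. The following is a sketch of that idea under stated assumptions (zero risk-free rate, 252 bars per year, rolling windows stepping by one OOS length), not a description of Kestrel Signal's internals. `fixed_parameter_walk_forward` is a hypothetical name.

```python
import math
import statistics


def sharpe(returns, periods_per_year=252):
    """Annualised Sharpe ratio of per-bar returns (risk-free rate assumed zero)."""
    sd = statistics.stdev(returns)
    if sd == 0:
        return 0.0  # assumption: treat a flat return stream as Sharpe 0
    return statistics.mean(returns) / sd * math.sqrt(periods_per_year)


def fixed_parameter_walk_forward(returns, is_bars, oos_bars):
    """Per-window (IS Sharpe, OOS Sharpe) pairs for one fixed parameter set."""
    if is_bars < 2 or oos_bars < 2:
        raise ValueError("each window needs at least two bars for a Sharpe ratio")
    results = []
    is_end = is_bars
    while is_end + oos_bars <= len(returns):
        is_slice = returns[is_end - is_bars:is_end]
        oos_slice = returns[is_end:is_end + oos_bars]
        # No optimisation step here: both slices come from the same parameters.
        results.append((sharpe(is_slice), sharpe(oos_slice)))
        is_end += oos_bars
    return results
```

The spread of the OOS values across windows is what indicates robustness here; a parameter set whose OOS Sharpe swings between strongly positive and strongly negative is regime-dependent even if its average looks acceptable.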
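Walk-forward also doesn't solve the fundamental problem of multiple testing across strategies: if you try many strategies and keep the one with the best walk-forward result, that result is itself selected and therefore optimistic. For that, use Combinatorial Purged Cross-Validation (CPCV) with the Probability of Backtest Overfitting.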