Cointegration and Pairs Trading Statistics
Two or more non-stationary price series are cointegrated when some linear combination of them is stationary, which yields a mean-reverting spread suitable for relative-value trading. Unlike correlation, which measures co-movement of returns, cointegration describes whether prices share a long-run equilibrium. That property is the statistical foundation of classical pairs trading.
The Engle-Granger Two-Step Procedure
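The standard test regresses one price series on another, then tests the residuals for stationarity using an Augmented Dickey-Fuller (ADF) test. The first step is the OLS regression

$$y_t = \alpha + \beta x_t + \varepsilon_t$$

whose residual series ε_t is the candidate spread.
The hedge ratio β is the OLS slope. The ADF test then evaluates whether the spread reverts to its mean by fitting:

$$\Delta\varepsilon_t = \gamma\,\varepsilon_{t-1} + \sum_{i=1}^{p} \delta_i\,\Delta\varepsilon_{t-i} + u_t$$

where p is the number of lagged differences included to absorb serial correlation in the spread.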
Under the null hypothesis, γ = 0: the spread has a unit root and is non-stationary. The alternative is γ < 0, meaning deviations decay over time. Rejecting the null at a chosen significance level provides evidence of cointegration. The Johansen test generalizes this to multivariate systems and avoids the asymmetry of choosing which asset to regress on which.
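The two steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not a production test: the function names (`ols`, `df_t_stat`, `engle_granger`) and the synthetic data are hypothetical, and the second step uses the simple Dickey-Fuller regression with zero lagged differences rather than the full augmented form. The resulting statistic still has to be compared against Engle-Granger critical values, which the sketch does not compute.

```python
import math
import random


def ols(x, y):
    """Return (intercept, slope) of the least-squares fit y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def df_t_stat(spread):
    """t-statistic on gamma in: diff[t] = gamma * spread[t-1] + u[t] (no lags)."""
    lagged = spread[:-1]
    diffs = [b - a for a, b in zip(spread[:-1], spread[1:])]
    s_ll = sum(v * v for v in lagged)
    gamma = sum(lag * d for lag, d in zip(lagged, diffs)) / s_ll
    resid = [d - gamma * lag for lag, d in zip(lagged, diffs)]
    sigma2 = sum(r * r for r in resid) / (len(diffs) - 1)
    return gamma / math.sqrt(sigma2 / s_ll)


def engle_granger(y, x):
    """Step 1: regress y on x. Step 2: unit-root statistic on the residuals."""
    alpha, beta = ols(x, y)
    spread = [yi - alpha - beta * xi for xi, yi in zip(x, y)]
    return beta, df_t_stat(spread)


# Synthetic cointegrated pair (assumed for illustration): x is a random walk,
# y = 2 + 1.5 * x plus a stationary AR(1) disturbance.
rng = random.Random(0)
level, noise = 100.0, 0.0
x, y = [], []
for _ in range(500):
    level += rng.gauss(0, 1)
    noise = 0.5 * noise + rng.gauss(0, 1)
    x.append(level)
    y.append(2.0 + 1.5 * level + noise)

beta_hat, t_stat = engle_granger(y, x)
print(f"beta = {beta_hat:.3f}, DF t-statistic = {t_stat:.2f}")
```

Swapping the arguments, `engle_granger(x, y)`, generally gives a different hedge ratio and statistic, which is the asymmetry the Johansen test avoids.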
Interpreting the Test Statistics
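The ADF test produces a t-statistic that is compared against critical values, which depend on sample size and on whether a constant or trend is included. For a directly observed series with a constant and no trend, the standard ADF critical values are approximately −3.43 (1%), −2.86 (5%), and −2.57 (10%). These are too lenient for the Engle-Granger test: because the spread is built from an estimated β, the residual-based test requires more negative critical values (tabulated by MacKinnon). For two series and a typical pairs trading window of 252 daily observations, they are roughly −3.9 (1%), −3.4 (5%), and −3.1 (10%). More negative statistics indicate stronger evidence of stationarity.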
A p-value below 0.05 is the conventional threshold for declaring a pair cointegrated. Practitioners often demand p < 0.01 because pairs trading involves multiple-testing bias: scanning hundreds of pairs makes spurious rejections at the 5% level all but certain. The half-life of mean reversion, computed from the AR(1) coefficient φ of the spread as −ln 2 / ln φ, should typically fall between 1 and 30 trading days for the relationship to be tradable after costs.
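The half-life calculation can be made concrete. If the spread follows an AR(1) process with coefficient φ, a deviation shrinks by a factor of φ each period in expectation, so it halves after h periods where φ^h = 1/2, giving h = −ln 2 / ln φ. The sketch below estimates φ by OLS and applies that formula; the function names and the simulated spread are assumptions made for illustration.

```python
import math
import random


def ar1_coefficient(spread):
    """OLS slope phi in: spread[t] = c + phi * spread[t-1] + u[t]."""
    prev, curr = spread[:-1], spread[1:]
    n = len(prev)
    mean_prev = sum(prev) / n
    mean_curr = sum(curr) / n
    cov = sum((p - mean_prev) * (c - mean_curr) for p, c in zip(prev, curr))
    var = sum((p - mean_prev) ** 2 for p in prev)
    return cov / var


def half_life(phi):
    """Periods for an AR(1) deviation to decay by half; defined for 0 < phi < 1."""
    if not 0.0 < phi < 1.0:
        raise ValueError("half-life requires 0 < phi < 1")
    return -math.log(2) / math.log(phi)


# Simulated spread with a true coefficient of 0.9 (true half-life about 6.6).
rng = random.Random(42)
spread = [0.0]
for _ in range(5000):
    spread.append(0.9 * spread[-1] + rng.gauss(0, 1))

phi_hat = ar1_coefficient(spread)
hl = half_life(phi_hat)
print(f"phi = {phi_hat:.3f}, half-life = {hl:.1f} periods")
```

With daily data the result is in trading days, so it can be compared directly with the 1 to 30 day range above. A φ estimate at or above 1 signals no measurable reversion, which is why the function rejects it rather than returning a number.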
What Cointegration Does Not Capture
Cointegration is a statistical property of historical prices, not an economic guarantee. Two series can test as cointegrated purely by chance, especially when scanning large universes. This is the multiple comparisons problem, and it is severe. A universe of 500 assets produces 124,750 unordered pairs; testing each at the 5% level yields roughly 6,200 false positives in expectation even if no pair is genuinely cointegrated.
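The arithmetic behind that estimate is short enough to check directly. The sketch below counts unordered pairs and the expected number of false rejections under the assumption that every null hypothesis is true; the function name is hypothetical.

```python
import math


def expected_false_positives(n_assets, alpha):
    """Return (number of unordered pairs, expected false rejections at level alpha)."""
    n_pairs = math.comb(n_assets, 2)  # n * (n - 1) / 2
    return n_pairs, n_pairs * alpha


pairs, false_pos = expected_false_positives(500, 0.05)
print(f"{pairs} pairs, about {false_pos:.0f} expected false positives")
```

Because Engle-Granger depends on which asset is the regressor, testing both orderings doubles the number of tests and, with it, the expected count of spurious hits.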
The hedge ratio β is estimated, not known. It drifts over time as fundamentals change, corporate actions occur, and market regimes shift. A pair cointegrated in-sample frequently fails out-of-sample because the equilibrium relationship was either spurious or has decayed.
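One way to see hedge ratio drift is to re-estimate β over a rolling window and look at how much the estimates move. The sketch below is illustrative only: the function names, the 60-observation window, and the synthetic pair whose true hedge ratio shifts from 1.0 to 2.0 halfway through the sample are all assumptions.

```python
import random
import statistics


def ols_slope(x, y):
    """Least-squares slope of y on x (with intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def rolling_betas(y, x, window):
    """Hedge ratio re-estimated on each trailing window of the given length."""
    return [
        ols_slope(x[i - window:i], y[i - window:i])
        for i in range(window, len(x) + 1)
    ]


rng = random.Random(1)
x = [100.0]
for _ in range(599):
    x.append(x[-1] + rng.gauss(0, 1))

# Stable pair: constant hedge ratio of 1.5.
stable = [1.5 * xi + rng.gauss(0, 0.5) for xi in x]

# Drifting pair: hedge ratio changes from 1.0 to 2.0 at the midpoint,
# with the level kept continuous at the break.
pivot = x[300]
drifting = [
    (1.0 if i < 300 else 2.0) * (xi - pivot) + 100.0 + rng.gauss(0, 0.5)
    for i, xi in enumerate(x)
]

sd_stable = statistics.stdev(rolling_betas(stable, x, 60))
sd_drifting = statistics.stdev(rolling_betas(drifting, x, 60))
print(f"rolling beta std: stable = {sd_stable:.3f}, drifting = {sd_drifting:.3f}")
```

A full-sample fit on the drifting pair would return a single blended β that matches neither regime, which is how an in-sample result can look acceptable and still fail out-of-sample.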
The test also assumes linearity and constant parameters. Real spreads exhibit regime changes, volatility clustering, and asymmetric reversion — none of which are captured by Engle-Granger or Johansen. Cointegration is necessary for classical pairs trading; it is not sufficient.
Use in Kestrel Signal
Kestrel Signal reports cointegration diagnostics for every candidate pair in the research workspace: ADF statistic, p-value, estimated β, spread half-life, and Hurst exponent of the residual series. Pairs are flagged when the half-life exceeds the configured backtest horizon or when β instability — measured by rolling-window standard deviation of the hedge ratio — crosses a user-defined threshold.
Backtests on cointegrated pairs include walk-forward re-estimation of β by default, so reported equity curves reflect the realistic cost of hedge ratio drift rather than a single in-sample fit. Multiple-testing correction via Bonferroni or Benjamini-Hochberg is available when scanning universes, with the corrected p-value displayed alongside the raw statistic.
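For readers who want to see what the two corrections do to a set of raw p-values, here is a minimal sketch of the standard Bonferroni and Benjamini-Hochberg adjustments. It is not Kestrel Signal's implementation; the function names and sample p-values are assumptions chosen for illustration.

```python
def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


raw = [0.01, 0.04, 0.03, 0.005]
print("Bonferroni:", [round(p, 4) for p in bonferroni(raw)])
print("Benjamini-Hochberg:", [round(p, 4) for p in benjamini_hochberg(raw)])
```

Bonferroni controls the chance of any false positive and becomes very strict as the number of pairs grows. Benjamini-Hochberg controls the expected share of false positives among the pairs that pass, which usually leaves more candidates when scanning a large universe.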