Statistics · 17 May 2026 · 5 min read

Why Mean-Reversion Sharpe Ratios Are Almost Always Overstated

Negative autocorrelation and bid-ask bounce systematically inflate the naive Sharpe ratio of mean-reversion strategies, often by a factor of two.

The Sharpe ratio of a mean-reversion strategy, computed naively from daily returns, almost always overstates the true risk-adjusted performance — often by a factor of two or more. The mechanism is structural, not a bug in any particular backtest: mean-reversion strategies generate return streams with strong negative autocorrelation, and the standard Sharpe estimator assumes returns are i.i.d. When you violate that assumption in the specific direction mean-reversion produces, the denominator collapses faster than the numerator. The result is a number that looks like skill but mostly reflects the geometry of the return process itself.

The i.i.d. assumption is doing more work than you think

The annualized Sharpe ratio you see quoted everywhere is computed as the daily Sharpe scaled by sqrt(252). That scaling factor is exact only when daily returns are independent and identically distributed. Under serial correlation, the correct annualization uses a different multiplier — one that depends on the autocorrelation structure of the return series.

For a return series with first-order autocorrelation rho, the variance of the k-period sum grows faster than k when rho is positive and slower than k when rho is negative. Mean-reversion strategies, by construction, produce negative rho at short lags: a winning day tends to be followed by a flat or losing day as the position unwinds. Long-horizon variance is then smaller than k times the daily variance, so the sqrt-time rule misstates annualized volatility and, with it, the annualized Sharpe.
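The variance behaviour described above can be checked numerically. The following is a minimal sketch, assuming an AR(1) process as a stand-in for a serially correlated return stream; the process, the function names, and the parameter values are illustrative assumptions, not a model of any particular strategy. It compares the variance of non-overlapping k-period sums with k times the one-period variance.

```python
import random

def simulate_ar1(n, phi, seed=0):
    """Draw n values from x_t = phi * x_{t-1} + e_t with standard normal shocks."""
    rng = random.Random(seed)
    x, out = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out

def sample_variance(xs):
    """Unbiased sample variance."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)

def k_period_variance_ratio(returns, k):
    """Variance of non-overlapping k-period sums over k times the one-period variance."""
    sums = [sum(returns[i:i + k]) for i in range(0, len(returns) - k + 1, k)]
    return sample_variance(sums) / (k * sample_variance(returns))

# Hypothetical parameters: phi = -0.2 and +0.2, five-period sums.
neg_ratio = k_period_variance_ratio(simulate_ar1(50_000, phi=-0.2), k=5)
pos_ratio = k_period_variance_ratio(simulate_ar1(50_000, phi=0.2), k=5)
print(f"phi=-0.2: {neg_ratio:.2f}   phi=+0.2: {pos_ratio:.2f}")
```

A ratio below 1 means the k-period variance grows slower than k; a ratio above 1 means it grows faster. For independent returns the ratio sits near 1, which is the case the sqrt-time rule assumes.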

The Lo correction

Andrew Lo's 2002 paper, "The Statistics of Sharpe Ratios", formalized the correction. For an arbitrary autocorrelation structure, the q-period Sharpe ratio SR(q) follows from the one-period Sharpe ratio SR(1) as:

SR(q) = SR(1) · q / sqrt(q + 2 · Σ (q − k) · rho_k)  for k = 1 to q−1

When all rho_k are zero, this collapses to the familiar sqrt(q). Nonzero autocorrelations move the term under the square root, and the sign matters: a positive weighted sum enlarges that term and pulls the factor below sqrt(q), while a negative one shrinks it and pushes the factor above sqrt(q). With rho_1 = −0.2 and higher lags at zero, which is unremarkable for a mean-reversion book, the term shrinks by about 40% at q = 252, which is enough to move an annualized Sharpe of 2.0 by more than half a point.
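The scaling factor in the formula above is straightforward to evaluate directly. The sketch below is an illustration under stated assumptions: the function names `lo_scaling_factor` and `annualized_sharpe` are hypothetical, the autocorrelations are passed as a plain list with lags beyond the list treated as zero, and 252 trading days per year is assumed as in the text.

```python
import math

def lo_scaling_factor(q, rhos):
    """Factor that converts a one-period Sharpe into a q-period Sharpe.

    rhos[k - 1] is the lag-k autocorrelation of returns; lags beyond
    len(rhos) are treated as zero (an assumption of this sketch).
    """
    weighted = sum((q - k) * rho for k, rho in enumerate(rhos, start=1) if k < q)
    under_root = q + 2.0 * weighted
    if under_root <= 0:
        raise ValueError("autocorrelations imply a non-positive q-period variance")
    return q / math.sqrt(under_root)

def annualized_sharpe(daily_sharpe, rhos, periods_per_year=252):
    """Annualize a daily Sharpe with the autocorrelation-aware factor."""
    return daily_sharpe * lo_scaling_factor(periods_per_year, rhos)

# Daily Sharpe that the naive sqrt(252) rule would annualize to 2.0.
daily = 2.0 / math.sqrt(252)
for rho_1 in (-0.2, 0.0, 0.2):   # hypothetical first-order autocorrelations
    print(rho_1, round(annualized_sharpe(daily, [rho_1]), 2))
```

With every rho_k at zero the factor reduces to sqrt(q) and the naive figure is recovered. With nonzero inputs the output shows both the size and the direction of the adjustment, which is worth checking on your own return series rather than assuming.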

If your backtest reports an annualized Sharpe above 2 on a short-holding-period mean-reversion strategy and you have not applied the Lo correction, assume the real number is 30–50% lower before doing any capital allocation math.

Bid-ask bounce makes it worse

The autocorrelation problem is compounded by execution assumptions. Mean-reversion entries typically buy on weakness and sell on strength — by construction, they trade against the prevailing micro-move. If your backtest fills at the mid or, worse, at the close print, you are systematically capturing the bid-ask bounce as alpha.

The bounce is a microstructure artifact: closing prints alternate between bid and ask trades, generating negative serial correlation in close-to-close returns that has nothing to do with any tradeable inefficiency. A naive simulator will buy at the bid print and mark the position at the next ask print, recording a return that cannot be realized by any executable order. Across thousands of trades, this produces a smooth, high-Sharpe equity curve that disintegrates the moment real fills are introduced.
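The bounce effect is easy to reproduce in isolation. The following is a minimal sketch in the spirit of Roll's spread model, assuming a mid price that never moves and close prints that land on the bid or the ask with equal probability; the mid, spread, seed, and function names are illustrative assumptions.

```python
import random

def bounce_price_changes(n, mid=100.0, spread=0.10, seed=0):
    """Close-to-close price changes when the mid price never moves.

    Each close prints at the bid or the ask with equal probability.
    """
    rng = random.Random(seed)
    half = spread / 2.0
    prints = [mid + rng.choice((-half, half)) for _ in range(n + 1)]
    return [later - earlier for earlier, later in zip(prints, prints[1:])]

def lag1_autocorrelation(xs):
    """Sample lag-1 autocorrelation of a sequence."""
    mean = sum(xs) / len(xs)
    num = sum((a - mean) * (b - mean) for a, b in zip(xs, xs[1:]))
    den = sum((x - mean) ** 2 for x in xs)
    return num / den

changes = bounce_price_changes(20_000)
print(f"lag-1 autocorrelation: {lag1_autocorrelation(changes):.2f}")
```

Under these assumptions the theoretical lag-1 autocorrelation of price changes is -0.5 even though the mid price is constant, so the reversal pattern in the prints carries no tradeable information.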

Why the standard error of Sharpe is also wrong

Even setting aside the point estimate, the confidence interval around a Sharpe ratio is misleading under autocorrelation. The asymptotic standard error of the Sharpe estimator under i.i.d. returns is approximately:

SE(SR) ≈ sqrt((1 + 0.5 · SR²) / T)
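As a worked instance of the i.i.d. formula above, here is a minimal sketch. It assumes the Sharpe ratio and T refer to the same sampling frequency (a daily Sharpe with T counted in days, for example); the function names and the sample values are hypothetical.

```python
import math

def sharpe_standard_error(sharpe, n_obs):
    """Asymptotic standard error of a per-period Sharpe ratio under i.i.d. returns."""
    return math.sqrt((1.0 + 0.5 * sharpe ** 2) / n_obs)

def sharpe_t_stat(sharpe, n_obs):
    """Ratio of the Sharpe estimate to its i.i.d. standard error."""
    return sharpe / sharpe_standard_error(sharpe, n_obs)

daily_sharpe = 0.10   # hypothetical per-day Sharpe
n_days = 1000         # hypothetical sample length
print(f"SE = {sharpe_standard_error(daily_sharpe, n_days):.4f}")
print(f"t  = {sharpe_t_stat(daily_sharpe, n_days):.2f}")
```

Both numbers inherit the i.i.d. assumption, so they are a baseline to compare against rather than a final answer for a serially correlated return stream.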

Under autocorrelation, the effective sample size is no longer T, because consecutive observations are not independent draws. The i.i.d. formula then misstates the uncertainty: your t-statistic on "is this Sharpe different from zero" is miscalibrated, and your confidence bands have the wrong width. A strategy that appears statistically significant at the 1% level may sit comfortably inside a 20% band once the dependence structure is modeled honestly.

The intuition: when returns are serially dependent, an observation no longer carries exactly one independent observation's worth of information. The effective sample size departs from T, and every statistic that depends on T silently degrades.

What to do instead

First, always report the autocorrelation of your strategy's daily return series alongside the Sharpe. If rho_1 is more negative than −0.05, you owe yourself a Lo-corrected number. Second, run the backtest at multiple holding-period horizons — weekly and monthly Sharpe ratios are less affected by short-lag autocorrelation and bid-ask bounce, and the gap between daily and monthly Sharpe is itself a diagnostic.
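The multi-horizon comparison can be sketched in a few lines. This is an illustration under stated assumptions: returns are small enough that k-day returns can be approximated by summing daily returns, horizons of 1, 5, and 21 days stand in for daily, weekly, and monthly, and the function names and synthetic data are hypothetical.

```python
import math
import random

def annualized_sharpe_at_horizon(daily_returns, k, days_per_year=252):
    """Naive annualized Sharpe computed from non-overlapping k-day summed returns."""
    sums = [sum(daily_returns[i:i + k]) for i in range(0, len(daily_returns) - k + 1, k)]
    mean = sum(sums) / len(sums)
    var = sum((s - mean) ** 2 for s in sums) / (len(sums) - 1)
    if var == 0:
        raise ValueError("k-day returns have zero variance")
    return mean / math.sqrt(var) * math.sqrt(days_per_year / k)

def horizon_report(daily_returns, horizons=(1, 5, 21)):
    """Annualized Sharpe at each horizon, keyed by horizon length in days."""
    return {k: annualized_sharpe_at_horizon(daily_returns, k) for k in horizons}

# Synthetic i.i.d. daily returns, used only to exercise the functions.
rng = random.Random(1)
daily = [rng.gauss(0.0005, 0.01) for _ in range(2520)]
for k, sharpe in horizon_report(daily).items():
    print(f"{k:>2}-day horizon: {sharpe:.2f}")
```

On i.i.d. data the three figures should agree up to sampling noise. A persistent gap on real strategy returns points at serial dependence or at fill artifacts, which is what makes the comparison useful as a diagnostic.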

Third, model fills explicitly. If you fill at the close, assume you are paying half the spread at minimum; if you fill on the open following a signal, you give up the overnight gap, which often accounts for most of the edge. In Kestrel Signal, fill models are first-class objects precisely because the difference between mid-fill and conservative-fill Sharpe for a mean-reversion book is rarely cosmetic: it is often the entire result.
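The half-spread assumption can be applied to a mid-fill return series with very little code. The sketch below is a hypothetical helper, not Kestrel Signal's API: it assumes you already have per-period gross returns computed at mid prices and a per-period turnover figure, and it charges half the spread on every unit traded.

```python
def net_returns(gross_returns, turnover, half_spread):
    """Subtract a half-spread cost on every unit of turnover.

    gross_returns[i]: period return assuming mid-price fills.
    turnover[i]: fraction of capital traded in that period (buys plus sells).
    half_spread: half the quoted spread, as a fraction of price.
    """
    if len(gross_returns) != len(turnover):
        raise ValueError("gross_returns and turnover must have the same length")
    return [r - t * half_spread for r, t in zip(gross_returns, turnover)]

# Hypothetical inputs: three days, with a full round trip on the second day.
gross = [0.004, -0.001, 0.003]
traded = [1.0, 2.0, 1.0]
print(net_returns(gross, traded, half_spread=0.0005))
```

Computing the Sharpe ratio on both the gross and the net series shows how much of the result depends on the fill assumption.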

The deeper point is methodological. Sharpe is a summary statistic built on assumptions that mean-reversion strategies violate by design. Treat any uncorrected Sharpe on such a strategy as an upper bound, not an estimate.