
Volatility: what it measures and what it does not

Volatility quantifies the dispersion of returns around their mean — it is a measure of variability, not risk, not loss, not drawdown. It is the most-used and most-abused number in quantitative finance because it compresses an entire distribution into a single scalar, discarding asymmetry, tail weight, and path dependence in the process.

The standard estimator is the sample standard deviation of periodic returns, annualized by multiplying by the square root of the number of periods per year. For daily returns on equities, the convention is 252 trading days; for crypto and other 24/7 markets, 365.

σ_annual = sqrt(1/(N-1) · Σ(r_i - r_mean)²) · sqrt(P)

Here r_i is the periodic return, r_mean is the sample mean of those returns, N is the number of observations, and P is the number of periods per year. The sqrt(P) scaling follows from the fact that variances of uncorrelated returns add, so the variance over P periods is P times the one-period variance. It therefore assumes returns are independent and identically distributed, an assumption that fails empirically but persists by convention.
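
As a minimal sketch of the estimator above, the following uses only the Python standard library. The function name annualized_volatility and the sample returns are illustrative assumptions, not part of any particular library.

```python
import math

def annualized_volatility(returns, periods_per_year=252):
    """Sample standard deviation of periodic returns, scaled by sqrt(P)."""
    n = len(returns)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(returns) / n
    # Sample variance: divide by N - 1, matching the formula in the text.
    variance = sum((r - mean) ** 2 for r in returns) / (n - 1)
    return math.sqrt(variance) * math.sqrt(periods_per_year)

# Illustrative daily returns (made-up numbers, not market data).
daily = [0.01, -0.02, 0.015, -0.005, 0.0]

vol_equity = annualized_volatility(daily, periods_per_year=252)  # equity convention
vol_crypto = annualized_volatility(daily, periods_per_year=365)  # 24/7 convention
```

The same return series yields a higher annualized figure under the 365-day convention, purely because of the scaling factor; the two numbers are not comparable unless the convention is stated.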

Interpreting the number

Annualized volatility translates to a rough one-sigma band around expected return. A strategy with 20% annualized volatility and 10% expected return implies roughly a 68% probability that next year's return falls between -10% and +30%, assuming normality. The assumption rarely holds, but the band remains a useful first-order intuition.
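
The band arithmetic can be checked in a few lines. This is a sketch under the same normality assumption as the text; sigma_band is a hypothetical helper name, and the coverage figure comes from the normal distribution via math.erf.

```python
import math

def sigma_band(expected_return, volatility, k=1.0):
    """Return (lower, upper, coverage) for a k-sigma band under normality."""
    lower = expected_return - k * volatility
    upper = expected_return + k * volatility
    # Probability that a normal variable lies within k standard deviations.
    coverage = math.erf(k / math.sqrt(2))
    return lower, upper, coverage

# The example from the text: 10% expected return, 20% annualized volatility.
lower, upper, coverage = sigma_band(0.10, 0.20)
print(f"{lower:.0%} to {upper:.0%}, coverage {coverage:.1%}")
```

Passing k=2.0 widens the band to two standard deviations, which covers about 95% of outcomes under the same assumption.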

Typical ranges by asset class: investment-grade bonds run 3-6%, broad equity indices 12-20%, single equities 25-60%, commodities 15-40%, major FX pairs 6-12%, and crypto 60-120%. Strategy-level volatility depends on leverage and concentration; a market-neutral equity book might target 6-10%, a trend-following CTA 12-20%, a levered intraday strategy 30%+.

Lower is not automatically better. A strategy with 4% volatility and 3% return (0.75 units of return per unit of volatility) is worse on a risk-adjusted basis than one with 20% volatility and 18% return (0.90), and the latter is easier to lever down than the former is to lever up. The relevant question is return per unit of volatility, conditioned on the return being real.
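
A rough sketch of that comparison follows. It assumes a zero risk-free rate and no financing or transaction costs, which is a simplification; the helper names are hypothetical.

```python
def return_per_unit_vol(ret, vol):
    """Return earned per unit of volatility (risk-free rate assumed zero)."""
    return ret / vol

def scale_to_target_vol(ret, vol, target_vol):
    """Leverage needed to hit target_vol, and the resulting return.

    Assumes return and volatility both scale linearly with leverage
    and ignores financing costs.
    """
    leverage = target_vol / vol
    return leverage, ret * leverage

ratio_low = return_per_unit_vol(0.03, 0.04)   # 4% vol, 3% return
ratio_high = return_per_unit_vol(0.18, 0.20)  # 20% vol, 18% return

# Lever the high-volatility strategy down to the 4% volatility of the other.
leverage, scaled_return = scale_to_target_vol(0.18, 0.20, target_vol=0.04)
```

At one-fifth exposure the second strategy runs at the same 4% volatility but earns 3.6% rather than 3%, under the stated assumptions.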

What volatility does not capture

Volatility is symmetric. It treats a +5% day and a -5% day identically, which means a strategy that grinds out small gains and occasionally loses 30% in a session can have the same volatility as one with stable, balanced moves. Skewness and kurtosis are required to distinguish them.
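
To illustrate, the sketch below builds two synthetic return series with matching sample standard deviation but very different skewness. The series are constructed for the example and are not market data; skewness here is the simple moment-based estimate.

```python
import math
import statistics

def skewness(xs):
    """Moment-based skewness: third central moment over variance^1.5."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5

# Grinds out small gains, then loses 30% in one session.
grinder = [0.01] * 29 + [-0.30]

# Balanced series: alternating +c / -c, with c chosen so the sample
# standard deviation matches the grinder's.
n = len(grinder)
c = statistics.stdev(grinder) * math.sqrt((n - 1) / n)
balanced = [c, -c] * (n // 2)

print(statistics.stdev(grinder), statistics.stdev(balanced))  # same dispersion
print(skewness(grinder), skewness(balanced))  # strongly negative vs. zero
```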

The estimator also treats volatility as constant: it produces a single number for the whole sample. Realized volatility, by contrast, clusters: calm regimes tend to be followed by calm regimes, shocks by shocks. A full-sample sigma of 15% can mask periods of 8% and 40%, which matters enormously for position sizing and margin.
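
A rolling window makes the masking effect concrete. The sketch below uses a deliberately artificial two-regime series (alternating fixed-size moves) so the numbers are reproducible; the function name and the 60-day window are assumptions for illustration.

```python
import math
import statistics

def rolling_annualized_vol(returns, window=60, periods_per_year=252):
    """Annualized sample volatility over each trailing window."""
    scale = math.sqrt(periods_per_year)
    return [
        statistics.stdev(returns[i - window:i]) * scale
        for i in range(window, len(returns) + 1)
    ]

# Synthetic data: 510 calm days (about 8% annualized) followed by
# 60 stressed days (about 40% annualized).
returns = [0.005, -0.005] * 255 + [0.025, -0.025] * 30

full_sample = statistics.stdev(returns) * math.sqrt(252)
rolling = rolling_annualized_vol(returns, window=60)

print(f"full sample: {full_sample:.1%}")
print(f"rolling min: {min(rolling):.1%}, rolling max: {max(rolling):.1%}")
```

The full-sample figure lands near 15% even though no 60-day window that sits entirely inside one regime shows anything like it.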

Volatility says nothing about drawdown. Two strategies with identical sigma can have maximum drawdowns differing by an order of magnitude, because drawdown depends on how losses are sequenced and clustered, not only on their dispersion. Never size a position from volatility alone.
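
One way to see this is to take the same set of returns in two different orders: the standard deviation cannot change, but the drawdown can. The series below are synthetic, and max_drawdown is a hypothetical helper written for this sketch.

```python
import statistics

def max_drawdown(returns):
    """Largest peak-to-trough decline of the compounded equity curve."""
    equity = peak = 1.0
    worst = 0.0
    for r in returns:
        equity *= 1 + r
        peak = max(peak, equity)
        worst = max(worst, 1 - equity / peak)
    return worst

# Identical returns, different ordering.
interleaved = [0.02, -0.02] * 20         # losses alternate with gains
clustered = [0.02] * 20 + [-0.02] * 20   # losses arrive back to back

print(statistics.stdev(interleaved), statistics.stdev(clustered))
print(max_drawdown(interleaved), max_drawdown(clustered))
```

The clustered ordering produces a drawdown more than ten times deeper than the interleaved one, with exactly the same volatility.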

It does not capture tail risk. Empirical return distributions have kurtosis far above the normal value of 3, meaning extreme moves occur orders of magnitude more often than a Gaussian model predicts. Volatility-based VaR systematically underestimates the probability of catastrophic days, which is why 2008, March 2020, and every flash crash were "25-sigma events" by the lights of the people sizing them.
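
For a sense of scale, the sketch below computes what a Gaussian model implies for a daily loss of k standard deviations or worse. It assumes independent, normally distributed daily returns and 252 trading days per year, which is exactly the model the paragraph above argues against; the function names are illustrative.

```python
import math

def gaussian_tail_probability(k):
    """One-sided probability of a move k or more sigma below the mean."""
    return 0.5 * math.erfc(k / math.sqrt(2))

def years_between(k, periods_per_year=252):
    """Expected years between k-sigma down days under the Gaussian model."""
    return 1 / (gaussian_tail_probability(k) * periods_per_year)

for k in (3, 5, 10, 25):
    p = gaussian_tail_probability(k)
    print(f"{k:>2}-sigma: p = {p:.3e}, once every {years_between(k):.3e} years")
```

Under this model a 5-sigma down day should occur roughly once in more than ten thousand years of trading, and a 25-sigma day has a probability below 1e-130. Observing several such days in a working career says more about the model than about luck.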

Finally, volatility says nothing about the source of variation. A strategy whose volatility comes from a persistent factor exposure (e.g., short vol, carry, momentum) carries qualitatively different risk than one whose volatility comes from idiosyncratic noise, even if the scalar is identical.

In Kestrel Signal

Backtest reports in Kestrel Signal surface annualized volatility alongside downside deviation, skew, excess kurtosis, and rolling 60-day realized volatility. The rolling series is included specifically to make regime shifts visible rather than averaged away. Treat the scalar as a summary, not a sufficient statistic — the supporting distributional metrics exist because volatility on its own is not enough to size or evaluate a strategy.