
Parameter Sensitivity Analysis

Parameter sensitivity analysis measures how a strategy's performance metrics respond to perturbations in its tunable inputs — lookback windows, thresholds, stop distances, and similar values. A strategy whose Sharpe collapses when a moving average length shifts from 20 to 22 is not robust; it is curve-fit. Sensitivity analysis is the diagnostic that separates the two.

The computation

For a single parameter, sensitivity is the local derivative of a performance metric M with respect to the parameter θ. On a tested grid it is estimated by a central difference, where Δθ is the grid spacing:

S(θ) = (M(θ + Δθ) - M(θ - Δθ)) / (2 · Δθ)
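
As a minimal sketch of the central difference above, assuming the backtest results are already available as a mapping from parameter value to metric: the function name `central_sensitivity` and the Sharpe values are hypothetical and purely illustrative.

```python
def central_sensitivity(metric_by_param, theta, step):
    """Central-difference estimate of dM/dtheta at theta.

    metric_by_param maps a parameter value to a metric (e.g. Sharpe).
    Returns None when either neighbor is missing from the grid,
    which happens at the grid edges.
    """
    upper = metric_by_param.get(theta + step)
    lower = metric_by_param.get(theta - step)
    if upper is None or lower is None:
        return None
    return (upper - lower) / (2 * step)


# Hypothetical Sharpe ratios for moving-average lengths 18, 20 and 22.
sharpe_by_length = {18: 1.10, 20: 1.20, 22: 0.40}
slope_at_20 = central_sensitivity(sharpe_by_length, theta=20, step=2)
```

Note that the estimate uses only the two neighbors and ignores M(θ) itself, so a sharp peak at the operating point can hide behind a small slope. That is one reason to prefer a distributional summary.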

In practice, you compute M across the full parameter grid and summarize the result distributionally rather than as a single derivative. The standard summary is the coefficient of variation of the metric across the neighborhood of the chosen operating point:

CV_local = std(M over neighborhood) / |mean(M over neighborhood)|
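
The ratio above can be sketched as follows. Two choices here are assumptions rather than part of the definition: the neighborhood is taken as every grid point within `radius` of the operating point, and the population standard deviation is used. The ratio is unstable when the neighborhood mean is close to zero, so the sketch returns infinity in the exact-zero case.

```python
import math
import statistics


def cv_local(metric_by_param, center, radius):
    """Coefficient of variation of the metric around `center`.

    The neighborhood is every grid point within `radius` of `center`
    (an assumption; other neighborhood shapes are equally valid).
    """
    values = [m for p, m in metric_by_param.items() if abs(p - center) <= radius]
    if len(values) < 2:
        raise ValueError("need at least two grid points in the neighborhood")
    mean = statistics.fmean(values)
    if mean == 0:
        return math.inf  # ratio is undefined; treat as maximally unstable
    # Population standard deviation (assumption: the source does not specify).
    return statistics.pstdev(values) / abs(mean)


# Hypothetical Sharpe ratios for moving-average lengths 18, 20 and 22.
sharpe_by_length = {18: 1.10, 20: 1.20, 22: 0.40}
cv_at_20 = cv_local(sharpe_by_length, center=20, radius=2)
```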

For multi-parameter strategies, the same logic extends to a grid in N dimensions. The neighborhood becomes a hypercube around the operating point, and you can compute partial sensitivities per axis or a single aggregate stability score across the full local region.
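
A rough sketch of the N-dimensional case, assuming the grid is stored as a dictionary keyed by tuples of integer grid indices. All function names are hypothetical, and the sensitivities are expressed per grid step rather than per parameter unit.

```python
import itertools
import math
import statistics


def hypercube_neighborhood(grid, center, radius=1):
    """Metric values in the hypercube of half-width `radius` (in grid
    steps) around `center`. Cells missing from the grid are skipped."""
    offsets = itertools.product(range(-radius, radius + 1), repeat=len(center))
    cells = (tuple(c + o for c, o in zip(center, off)) for off in offsets)
    return [grid[cell] for cell in cells if cell in grid]


def partial_sensitivities(grid, center):
    """Central difference along each axis, in metric units per grid step.
    An axis gets None when a neighbor on that axis is missing."""
    slopes = []
    for axis in range(len(center)):
        up = tuple(c + (i == axis) for i, c in enumerate(center))
        down = tuple(c - (i == axis) for i, c in enumerate(center))
        if up in grid and down in grid:
            slopes.append((grid[up] - grid[down]) / 2)
        else:
            slopes.append(None)
    return slopes


def aggregate_cv(grid, center, radius=1):
    """Single stability score: CV of the metric over the local hypercube."""
    values = hypercube_neighborhood(grid, center, radius)
    mean = statistics.fmean(values)
    return math.inf if mean == 0 else statistics.pstdev(values) / abs(mean)


# Hypothetical 3x3 surface that slopes along axis 0 and is flat along axis 1.
grid = {(i, j): 1.0 + 0.1 * i for i in range(3) for j in range(3)}
slopes = partial_sensitivities(grid, center=(1, 1))
score = aggregate_cv(grid, center=(1, 1))
```

Per-axis slopes show which parameter drives the instability, while the aggregate score answers the simpler question of whether the local region is flat overall.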

Interpretation

Lower CV_local means a flatter performance surface around the chosen parameters — small specification errors translate to small metric changes. As a rough guideline for Sharpe ratio sensitivity on daily strategies: CV_local below 0.15 indicates a robust plateau, 0.15 to 0.35 indicates moderate sensitivity worth investigating, and above 0.35 indicates the result is likely a local spike rather than a genuine edge.
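
The guideline bands above could be encoded as a small helper. The function name and labels are hypothetical, and the cutoffs are only the rough guideline for Sharpe on daily strategies, not universal constants.

```python
def classify_cv_local(cv):
    """Map a CV_local value to the rough guideline bands for Sharpe
    sensitivity on daily strategies."""
    if cv < 0.15:
        return "robust plateau"
    if cv <= 0.35:
        return "moderate sensitivity"
    return "likely local spike"


labels = [classify_cv_local(cv) for cv in (0.08, 0.25, 0.40)]
```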

The geometry of the surface matters as much as the scalar. A strategy sitting on a wide, gently sloped plateau is structurally different from one sitting on a narrow ridge, even if both have identical CV_local at the chosen point. Visual inspection of the heatmap — looking for connected high-performance regions versus isolated islands — is non-negotiable.

A useful heuristic: if the best parameter set is surrounded by sharply worse neighbors, treat the result as overfit until proven otherwise. Real edges produce broad regions of acceptable performance, not single bright pixels.
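
One way to automate this heuristic on a two-parameter surface is sketched below. The function name and the `max_drop` cutoff of 50% are assumptions chosen for illustration, and the check complements visual inspection rather than replacing it.

```python
def is_isolated_peak(surface, max_drop=0.5):
    """Flag the best cell of a 2D surface as an isolated peak when the
    mean of its adjacent cells sits more than `max_drop` (as a fraction
    of the peak value) below the peak."""
    rows, cols = len(surface), len(surface[0])
    best_r, best_c = max(
        ((r, c) for r in range(rows) for c in range(cols)),
        key=lambda rc: surface[rc[0]][rc[1]],
    )
    neighbors = [
        surface[r][c]
        for r in range(best_r - 1, best_r + 2)
        for c in range(best_c - 1, best_c + 2)
        if 0 <= r < rows and 0 <= c < cols and (r, c) != (best_r, best_c)
    ]
    if not neighbors:
        return False  # a single-cell grid carries no shape information
    peak = surface[best_r][best_c]
    neighbor_mean = sum(neighbors) / len(neighbors)
    return (peak - neighbor_mean) > max_drop * abs(peak)


# Hypothetical Sharpe surfaces: one bright pixel versus a gentle plateau.
spike = [[0.2, 0.2, 0.2], [0.2, 2.0, 0.2], [0.2, 0.2, 0.2]]
plateau = [[1.0, 1.1, 1.0], [1.1, 1.2, 1.1], [1.0, 1.1, 1.0]]
spike_flagged = is_isolated_peak(spike)
plateau_flagged = is_isolated_peak(plateau)
```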

What it does not capture

Sensitivity analysis is silent on regime stability. A strategy can have a beautifully flat parameter surface within the in-sample window and still fail catastrophically out-of-sample if the underlying market regime shifts. Flatness across θ is not the same as stability across time.

It also does not address selection bias from the grid itself. If you tested 400 parameter combinations and reported the most stable region, you have still performed a search over 400 hypotheses. Sensitivity analysis describes the local shape of the surface you found; it does not correct for the multiple-comparisons problem that produced it.
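
A small simulation can make the multiple-comparisons point concrete. The sketch below assumes 400 strategies with no edge at all (independent zero-mean Gaussian daily returns over one year, a deliberately simplified null model) and compares the best annualized Sharpe with the average. All names and the 1% daily volatility are illustrative assumptions.

```python
import random
import statistics


def annualized_sharpe(daily_returns, periods_per_year=252):
    """Annualized Sharpe ratio with a zero risk-free rate."""
    mean = statistics.fmean(daily_returns)
    std = statistics.stdev(daily_returns)
    return (mean / std) * periods_per_year ** 0.5


def best_sharpe_under_null(n_combinations=400, n_days=252, seed=0):
    """Simulate strategies with zero true edge and return the best and
    the average Sharpe across all of them."""
    rng = random.Random(seed)
    sharpes = []
    for _ in range(n_combinations):
        returns = [rng.gauss(0.0, 0.01) for _ in range(n_days)]
        sharpes.append(annualized_sharpe(returns))
    return max(sharpes), statistics.fmean(sharpes)


best, average = best_sharpe_under_null()
```

The average stays near zero, as it should for strategies with no edge, while the best of the 400 looks attractive purely because it was selected. A flat neighborhood around that winner does nothing to undo the selection.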

A flat region in a grid search can itself be a curve-fit artifact when the parameters along different axes are all fitting the same noise feature in the price series. Confirm robustness with walk-forward analysis on disjoint windows, not parameter perturbation alone.
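
As an illustration of the windowing, the sketch below generates walk-forward splits in which the out-of-sample windows never overlap. The rolling fixed-length training window and the function name are assumptions; anchored (expanding) training windows are an equally common choice.

```python
def walk_forward_windows(n_obs, train_size, test_size):
    """Yield (train_range, test_range) index pairs.

    Each test range immediately follows its training range, and the
    test ranges are disjoint because the start advances by test_size.
    """
    start = 0
    while start + train_size + test_size <= n_obs:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += test_size


# Hypothetical series of 10 observations: fit on 4, evaluate on the next 2.
windows = list(walk_forward_windows(n_obs=10, train_size=4, test_size=2))
test_indices = [list(test) for _, test in windows]
```

Re-running the sensitivity analysis inside each training window and checking whether the same parameter region stays flat across windows is one way to combine the two diagnostics.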

Finally, sensitivity does not capture transaction-cost cliffs. A strategy may be flat in Sharpe across the threshold parameter at zero costs and steeply sloped once realistic slippage is applied, because the trade count itself depends sharply on the threshold. Always run sensitivity at the cost assumptions you intend to deploy under.
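
The effect can be seen with simple arithmetic. The sketch below uses net return rather than Sharpe to keep the example short, and every number in it (gross returns, trade counts, a cost of 2 basis points per trade) is hypothetical.

```python
def net_returns(gross_by_threshold, trades_by_threshold, cost_per_trade):
    """Net return per threshold after a flat per-trade cost."""
    return {
        t: gross_by_threshold[t] - trades_by_threshold[t] * cost_per_trade
        for t in gross_by_threshold
    }


def spread(values_by_threshold):
    """Range of the metric across the tested thresholds."""
    values = list(values_by_threshold.values())
    return max(values) - min(values)


# Gross return is nearly flat across thresholds, but the trade count is not.
gross = {0.5: 0.12, 1.0: 0.12, 1.5: 0.11}
trades = {0.5: 400, 1.0: 150, 1.5: 60}

at_zero_cost = net_returns(gross, trades, cost_per_trade=0.0)
at_realistic_cost = net_returns(gross, trades, cost_per_trade=0.0002)
```

At zero cost the surface varies by about one percentage point across thresholds. With costs applied, the lowest threshold loses most of its return because it trades far more often, and the same surface becomes steep.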

In Kestrel Signal

Kestrel Signal computes parameter sensitivity automatically on every grid backtest, reporting CV_local for each tracked metric and rendering the full performance surface as a heatmap with the operating point marked. Adjacent cells are summarized in a stability panel that flags isolated peaks and highlights connected plateaus. The same surface is recomputed on the out-of-sample segment so you can compare in-sample and out-of-sample geometry side by side — the most direct visual check on whether a parameter region survived the holdout.