Kolmogorov-Smirnov and Anderson-Darling Tests

Kolmogorov-Smirnov measures the largest gap between empirical and fitted CDFs. Anderson-Darling weights the gap by an inverse-variance factor that emphasizes the tails. The two tests can reach different conclusions on the same data, and for actuarial severity fits that disagreement is informative: it indicates whether the fit is failing in the body or in the tails.

Empirical CDF As The Common Ground

Sort the data x_(1) ≤ ... ≤ x_(n) and define the empirical CDF F_n(x) as the fraction of observations no larger than x. The empirical CDF is a step function with jumps of 1/n at each ordered observation.
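The definition above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the helper name ecdf is hypothetical, and the data are the five claim amounts used in the worked example later in this section.

```python
from bisect import bisect_right


def ecdf(data):
    """Return F_n, where F_n(x) is the fraction of observations <= x."""
    xs = sorted(data)
    n = len(xs)

    def F_n(x):
        # bisect_right counts how many sorted observations are <= x
        return bisect_right(xs, x) / n

    return F_n


claims = [250, 600, 900, 1800, 4500]
F_n = ecdf(claims)
print(F_n(249), F_n(250), F_n(1000), F_n(4500))  # 0.0 0.2 0.6 1.0
```

The step from 0.0 to 0.2 between x = 249 and x = 250 is the jump of 1/n at the first ordered observation.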

All three tests below (Kolmogorov-Smirnov, Anderson-Darling, and Cramér-von Mises) measure the discrepancy between F_n and a fitted CDF F. They differ in how that discrepancy is summarized into a single number.

Kolmogorov-Smirnov Statistic

K-S takes the maximum vertical gap between empirical and fitted CDF. Because the empirical CDF jumps at observations, the supremum is attained either just before or just after a jump, so K-S in practice is the maximum over the 2n gap values.

K-S is intuitive and easy to compute. It is also relatively insensitive to the tails: near the extremes both F_n and F are close to 0 or 1, so the vertical gap between them is necessarily small there and the maximum is usually attained in the body of the distribution. For actuarial severity fits where tail behavior dominates the answer, this insensitivity is a real limitation.

Kolmogorov-Smirnov statistic

D_n=\sup_{x}\bigl|F_n(x)-F(x)\bigr|
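A rough sketch of how the supremum reduces to a maximum over 2n gaps: at each ordered observation, compare the fitted CDF with the empirical CDF just before the jump, (i − 1)/n, and just after it, i/n. The function names and the lognormal parameters (mu = 6.7, sigma = 1.0) are assumptions chosen for illustration only; they are not fitted values from this section.

```python
import math
from statistics import NormalDist


def ks_statistic(data, cdf):
    """Largest vertical gap between the empirical CDF and cdf,
    checked just before and just after each jump (2n gaps in total)."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        d = max(d, abs((i - 1) / n - f), abs(i / n - f))
    return d


def lognormal_cdf(mu, sigma):
    """CDF of a lognormal with log-scale parameters mu and sigma."""
    z = NormalDist(mu, sigma)
    return lambda x: z.cdf(math.log(x))


# Hypothetical parameters, for illustration only.
claims = [250, 600, 900, 1800, 4500]
d_n = ks_statistic(claims, lognormal_cdf(6.7, 1.0))
print(round(d_n, 4))
```

Checking only the just-after values i/n would miss gaps where the fitted CDF sits above the step that is about to be left, which is why both sides of each jump are examined.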

Anderson-Darling Statistic

Anderson-Darling integrates the squared CDF gap and weights it by an inverse-variance factor that grows large near the tails of F. The weighting is what makes A-D more powerful than K-S against tail misspecification.

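Klugman Loss Models computes A-D from the order statistics with the closed-form expression below. The index pairing matters: ln F is evaluated at the i-th smallest observation x_(i), while ln(1 − F) is evaluated at the i-th largest observation x_(n+1−i). Applying the weight (2i − 1) to both log terms at the same order statistic gives a different, incorrect value.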

Anderson-Darling statistic

A^{2}=-n-\frac{1}{n}\sum_{i=1}^{n}(2i-1)\Bigl[\ln F(x_{(i)})+\ln\bigl(1-F(x_{(n+1-i)})\bigr)\Bigr]
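The closed form translates almost line for line into code. The sketch below assumes the fitted CDF has already been evaluated at each observation, with every value strictly between 0 and 1; the function name anderson_darling is hypothetical. The input values are the fitted lognormal CDF values from the worked example later in this section.

```python
import math


def anderson_darling(fitted_cdf_values):
    """A^2 computed from F(x_(1)), ..., F(x_(n)).

    Values are sorted first so that index i is the i-th smallest
    and index n + 1 - i is the i-th largest.
    """
    u = sorted(fitted_cdf_values)
    n = len(u)
    total = 0.0
    for i in range(1, n + 1):
        lower = math.log(u[i - 1])        # ln F(x_(i))
        upper = math.log(1.0 - u[n - i])  # ln(1 - F(x_(n+1-i)))
        total += (2 * i - 1) * (lower + upper)
    return -n - total / n


fitted = [0.18, 0.42, 0.55, 0.79, 0.96]
a2 = anderson_darling(fitted)
print(round(a2, 3))
```

Because of the logarithms, a fitted value very close to 0 or 1 at an extreme observation contributes a large term, which is the tail emphasis described above showing up in the arithmetic.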

Cramér-von Mises As Middle Ground

Cramér-von Mises integrates the squared CDF gap without the inverse-variance weighting, which gives it body sensitivity comparable to A-D but tail sensitivity comparable to K-S. CvM is less commonly reported on actuarial exams but appears in Klugman as a benchmark.
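For comparison, CvM also has a simple computational form. The sketch below uses the standard textbook expression W^2 = 1/(12n) + sum over i of (F(x_(i)) − (2i − 1)/(2n))^2, which is not stated in this section and is included here as an assumption about the convention in use; the function name is hypothetical.

```python
def cramer_von_mises(fitted_cdf_values):
    """W^2 from the fitted CDF evaluated at each observation.

    Each fitted value is compared with the midpoint of the
    corresponding empirical CDF step, (2i - 1) / (2n).
    """
    u = sorted(fitted_cdf_values)
    n = len(u)
    squared_gaps = sum(
        (u[i - 1] - (2 * i - 1) / (2 * n)) ** 2 for i in range(1, n + 1)
    )
    return 1.0 / (12 * n) + squared_gaps


w2 = cramer_von_mises([0.18, 0.42, 0.55, 0.79, 0.96])
print(round(w2, 4))
```

Every observation enters with equal weight here, in contrast to the logarithmic terms in A-D, which is why CvM does not share A-D's tail emphasis.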

ASTAM treats CvM as a minor sibling of A-D; understand A-D first.

Parameter-Estimation Caveat

All three tests have tabulated critical values that assume F is fully specified. When parameters of F are estimated from the same data being tested, the test statistic shrinks (the fit is closer than chance would predict). The published critical values are then too large, so the test is conservative: it rejects a wrong model less often than the nominal α suggests.

Practitioners use bootstrap or simulation to obtain corrected critical values for fitted distributions. Some tables (Stephens 1974, reproduced in Klugman) provide adjusted critical values for specific distribution-parameter combinations. ASTAM problems usually state which adjustment to apply.
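One way such a simulation might look is a parametric bootstrap: fit the model, simulate samples of the same size from the fitted model, refit on every simulated sample, and take an upper percentile of the resulting statistics. The sketch below does this for K-S with a lognormal fitted by maximum likelihood. All function names, the number of simulations, and the seed are assumptions for illustration; this is not a procedure prescribed by the syllabus.

```python
import math
import random
from statistics import NormalDist


def fit_lognormal(data):
    """Maximum-likelihood fit: mean and (divide-by-n) std of log-claims."""
    logs = [math.log(x) for x in data]
    mu = sum(logs) / len(logs)
    sigma = math.sqrt(sum((v - mu) ** 2 for v in logs) / len(logs))
    return mu, sigma


def ks_lognormal(data, mu, sigma):
    """K-S statistic of data against a lognormal(mu, sigma) CDF."""
    z = NormalDist(mu, sigma)
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = z.cdf(math.log(x))
        d = max(d, abs((i - 1) / n - f), abs(i / n - f))
    return d


def bootstrap_critical_value(data, alpha=0.05, n_sims=2000, seed=0):
    """Simulated (1 - alpha) percentile of D_n when parameters are re-estimated."""
    rng = random.Random(seed)
    mu, sigma = fit_lognormal(data)
    n = len(data)
    sims = []
    for _ in range(n_sims):
        sample = [rng.lognormvariate(mu, sigma) for _ in range(n)]
        m, s = fit_lognormal(sample)  # refit on every simulated sample
        sims.append(ks_lognormal(sample, m, s))
    sims.sort()
    index = min(n_sims - 1, math.ceil((1 - alpha) * n_sims) - 1)
    return sims[index]


claims = [250, 600, 900, 1800, 4500]
crit = bootstrap_critical_value(claims)
observed = ks_lognormal(claims, *fit_lognormal(claims))
print(round(observed, 3), round(crit, 3))
```

The refit inside the loop is the essential step. Without it, the simulation reproduces the fully specified critical value rather than the corrected one.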

Worked Example: K-S On A Lognormal Fit

Five claim amounts: 250, 600, 900, 1,800, 4,500. Sorted F_n values at the five points are 0.2, 0.4, 0.6, 0.8, 1.0; the value just before each jump is 0.0, 0.2, 0.4, 0.6, 0.8.

A fitted lognormal gives F(x_(i)) values of 0.18, 0.42, 0.55, 0.79, 0.96. Vertical gaps at each observation: just-before values give |0 − 0.18| = 0.18, |0.2 − 0.42| = 0.22, |0.4 − 0.55| = 0.15, |0.6 − 0.79| = 0.19, |0.8 − 0.96| = 0.16. Just-after values give |0.2 − 0.18| = 0.02, |0.4 − 0.42| = 0.02, |0.6 − 0.55| = 0.05, |0.8 − 0.79| = 0.01, |1.0 − 0.96| = 0.04. The maximum across all ten gaps is 0.22, attained just before the second observation. So D_5 = 0.22.
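The arithmetic in this example can be checked with a few lines of Python. The sketch takes the fitted CDF values as given in the text and does not refit the lognormal; the variable names are illustrative.

```python
fitted = [0.18, 0.42, 0.55, 0.79, 0.96]  # F(x_(i)) from the text
n = len(fitted)

# Empirical CDF just before the i-th jump is (i - 1)/n; just after, i/n.
before = [abs((i - 1) / n - f) for i, f in enumerate(fitted, start=1)]
after = [abs(i / n - f) for i, f in enumerate(fitted, start=1)]
d = max(before + after)

print([round(g, 2) for g in before])  # [0.18, 0.22, 0.15, 0.19, 0.16]
print([round(g, 2) for g in after])   # [0.02, 0.02, 0.05, 0.01, 0.04]
print(round(d, 2))                    # 0.22
```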

The 95th percentile of D_5 for fully specified F (no parameter estimation) is about 0.563 from standard K-S tables, so the lognormal fit is not rejected at α = 0.05. With two parameters estimated from the same five claims, however, the effective critical value is smaller, and a simulation-based check should be run before concluding that the fit is adequate.

When To Prefer Which Test

Chi-squared: data are grouped or naturally categorical; expected cell counts are reasonable.

Kolmogorov-Smirnov: continuous un-binned data; primary interest is body fit; small samples where tail differences are unstable.

Anderson-Darling: continuous un-binned data; primary interest is tail fit; loss-modeling settings where capital and reinsurance pricing depend on the upper tail. ASTAM-style severity fits should default to A-D when the syllabus permits.