Law of Large Numbers

The law of large numbers says the sample mean of many iid observations converges to the population mean as the number of observations grows. It is the formal reason credibility theory works, the reason empirical claim-rate estimates make sense, and the reason CLT statements about averages have something to converge to in the first place.

Page Contract
Role: Concept
Level: Core
Time: Reference
Freshness: Stable
Search Intent: Law of Large Numbers

Plain-English Definition

If you observe many independent observations from the same distribution, their average gets close to the true mean as the sample size grows. That single statement is the law of large numbers.

There are two precise versions. The weak law says the probability of being far from the mean drops to zero as n grows. The strong law says the sample mean actually converges to the mean along almost every sample path, not just in probability. For independent identically distributed observations with a finite mean, both versions hold; the strong law is what justifies long-run frequency statements.

Sample mean
\bar X_n = \frac{1}{n}\sum_{i=1}^{n} X_i
Weak law of large numbers
\Pr\!\left[\,|\bar X_n - \mu| > \varepsilon\,\right] \to 0 \text{ as } n \to \infty
Strong law of large numbers
\Pr\!\left[\,\lim_{n\to\infty} \bar X_n = \mu\,\right] = 1
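The convergence can be sketched in a few lines of simulation. This is an illustrative example, not part of the source: the Bernoulli setup, the seed, and the sample sizes are all assumptions chosen to make the tightening visible.

```python
import random

random.seed(42)  # fixed seed so the run is reproducible

def sample_mean_of_bernoulli(p, n):
    """Average of n iid Bernoulli(p) draws; the LLN says this approaches p."""
    return sum(1 if random.random() < p else 0 for _ in range(n)) / n

# The running average tightens around the true mean p = 0.5 as n grows.
for n in (100, 10_000, 100_000):
    print(n, sample_mean_of_bernoulli(0.5, n))
```

At n = 100 the average can easily miss p by several hundredths; by n = 100,000 the typical error is on the order of 0.0016, which is the sigma-over-root-n scaling discussed below.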

Worked Example

Suppose a portfolio of identical policies generates iid claim counts with mean lambda = 0.12 per policy-year. With 25 policies we expect 3 claims per year on average, but a single year might show 1 claim or 7. With 2500 policy-years the law of large numbers says the observed average claim count per policy-year will be very close to 0.12.

Numerically, the standard deviation of the sample mean is sigma over square-root of n. If individual policy variance is 0.12, the sample-mean standard deviation at n = 2500 is square-root of (0.12 / 2500), which is about 0.0069. The sample mean is almost always within 0.014 of 0.12, roughly a two-standard-deviation band. This is the precision a credibility analysis relies on.

Standard deviation of the sample mean
\operatorname{SD}(\bar X_n) = \sigma/\sqrt{n}
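The arithmetic in the worked example is simple enough to check directly. A minimal sketch, using the same numbers as above (per-policy variance 0.12, n = 2500):

```python
import math

def sd_of_sample_mean(variance, n):
    """Standard deviation of the sample mean: sqrt(variance / n) = sigma / sqrt(n)."""
    return math.sqrt(variance / n)

sd = sd_of_sample_mean(0.12, 2500)
print(round(sd, 4))      # about 0.0069
two_sd_band = 2 * sd     # the "almost always within 0.014" band
print(round(two_sd_band, 3))
```

Quadrupling the policy-year count halves this standard deviation, which is why credibility standards are often phrased in terms of minimum exposure counts.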

Why Actuaries Use It

Credibility theory rests on the law of large numbers. The reason a portfolio's observed loss ratio carries weight in setting next-year rates is that with enough exposure, the sample loss ratio is close to the true loss ratio.

Reserving, experience rating, and pricing all assume that a sample average of past outcomes is a defensible estimate of the underlying expectation. That assumption is exactly the law of large numbers in action.

When It Can Fail

The law of large numbers requires the mean to be finite. Heavy-tailed loss distributions can have infinite mean (Pareto with shape parameter alpha less than or equal to 1), and in that case the sample average does not converge to anything. Even with finite mean but infinite variance (Pareto with alpha between 1 and 2), convergence still happens but at a slower-than-CLT rate, and the standard square-root-n confidence interval understates the uncertainty.

This matters for catastrophe reinsurance and operational-risk capital, where Pareto-tailed loss models are routine. Standard sample-mean intuition silently breaks. In those regimes a median or a trimmed mean is the better location estimator.
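The Pareto finiteness conditions above can be encoded as a small helper. This is an illustrative sketch, not source material: it uses the standard Type I Pareto with scale x_m and shape alpha, whose mean is alpha * x_m / (alpha - 1) when alpha > 1 and infinite otherwise, and whose variance is finite only when alpha > 2.

```python
import math

def pareto_mean(alpha, x_m=1.0):
    """Mean of a Pareto (Type I) distribution with shape alpha and scale x_m.
    Infinite when alpha <= 1 -- the sample average has nothing to converge to."""
    if alpha <= 1:
        return math.inf
    return alpha * x_m / (alpha - 1)

def pareto_variance_is_finite(alpha):
    """Square-root-n CLT intuition requires a finite variance: alpha > 2."""
    return alpha > 2

print(pareto_mean(0.9))   # inf: the LLN regime discussed above fails
print(pareto_mean(1.5))   # 3.0: LLN holds, but CLT-rate intervals mislead
```

The alpha = 1.5 case is the middle regime from the text: the sample mean does converge to 3.0, but sigma-over-root-n error bars are not valid because the variance is infinite.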

Common Mistakes

The law of large numbers is about averages, not individual outcomes. Saying that 'after many policies the next claim will be near the mean' is wrong; the next claim is still drawn from the original distribution. What gets close to the mean is the running average across many policies.

Another mistake is confusing the law of large numbers with the central limit theorem. The LLN tells you the average converges. The CLT tells you the shape of the fluctuations around that limit. Both are needed for confidence intervals to make sense, but they answer different questions.

Statistics Connection

The law of large numbers is the formal reason consistent estimators work. Sample means, sample variances, empirical loss ratios, fitted Poisson rates, and maximum-likelihood estimators all converge to their population counterparts through an LLN-type argument applied to functions of the observations, usually combined with a continuity step.
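One concrete instance: the maximum-likelihood estimator of a Poisson rate is the sample mean, so its consistency is a direct LLN statement. The sketch below is illustrative (the sampler, seed, and rate 0.12 echoing the claim-count example are assumptions); it uses Knuth's multiplicative method to draw Poisson variates without external libraries.

```python
import math
import random

random.seed(7)  # fixed seed for reproducibility

def poisson_draw(lam):
    """Poisson sampler via Knuth's method: multiply uniforms until below exp(-lam)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= random.random()
        if p <= threshold:
            return k
        k += 1

# The Poisson MLE is just the sample mean, so the LLN makes it consistent.
n = 50_000
mle = sum(poisson_draw(0.12) for _ in range(n)) / n
print(mle)  # close to the true rate 0.12
```

At n = 50,000 the sampling standard deviation of the MLE is about sqrt(0.12 / 50000), roughly 0.0016, so the printed estimate sits tightly around 0.12.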

References And Official Sources