Concept

Lognormal Distribution

The lognormal distribution is the distribution of exp(Y) when Y is normal. It is the default heavy-tailed severity model on FAM and ASTAM and inherits closed-form MLEs from the normal distribution by taking logs of the data.

Page Contract

Role: Concept
Level: Core
Time: Reference
Freshness: Stable

Search Intent

lognormal distribution

Definition And Parameterization

A positive random variable X is lognormal with parameters μ and σ if ln X is normally distributed with mean μ and standard deviation σ. The parameters live on the log scale; they are not the mean and standard deviation of X.

This is the convention in the SOA Loss Models tables. Always state μ and σ on the log scale, then derive moments of X separately. Reporting the lognormal mean as μ is the single most common error on this distribution.

PDF

f(x)=\frac{1}{x\sigma\sqrt{2\pi}}\exp\!\left(-\frac{(\ln x-\mu)^{2}}{2\sigma^{2}}\right),\quad x>0

Mean and variance of X

E[X]=e^{\mu+\sigma^{2}/2},\qquad \operatorname{Var}(X)=e^{2\mu+\sigma^{2}}\bigl(e^{\sigma^{2}}-1\bigr)

Coefficient of variation

\mathrm{CV}(X)=\sqrt{e^{\sigma^{2}}-1}
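
The three formulas above translate directly into code. The following is a minimal sketch using only the Python standard library; the function name lognormal_moments and the parameter values μ = 7, σ = 1.2 are illustrative assumptions, not part of any exam table.

```python
import math

def lognormal_moments(mu, sigma):
    """Return (mean, variance, CV) of X given the log-scale parameters mu and sigma."""
    mean = math.exp(mu + sigma**2 / 2)
    variance = math.exp(2 * mu + sigma**2) * (math.exp(sigma**2) - 1)
    cv = math.sqrt(math.exp(sigma**2) - 1)  # depends on sigma only
    return mean, variance, cv

# Illustrative log-scale parameters (assumed values).
mean, variance, cv = lognormal_moments(7.0, 1.2)
print(f"mean={mean:.2f}  variance={variance:.1f}  CV={cv:.4f}")
```

Note that the mean on the original scale is far from μ itself, which is the error the parameterization warning is meant to prevent.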

Why Lognormal As Severity

Lognormal severities are right-skewed and heavier-tailed than gamma severities. The CV depends only on σ and grows without bound as σ increases, so the model can accommodate portfolios where a few claims dwarf the rest. The mean and variance are also closed-form, which matters for capital and pricing computations.

The lognormal has no moment generating function in the usual sense: every moment E[X^k] is finite, but the integral defining E[e^{tX}] diverges for every t > 0. This is a strong signal that the tail is heavier than that of any model with a finite MGF near zero, such as the gamma.
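
One way to see both facts at once: since ln X is normal, the k-th raw moment of X is the normal MGF of ln X evaluated at k. It is finite for every k, but it grows like exp(k²σ²/2), which is too fast for the power series Σ t^k E[X^k] / k! to converge for any t > 0.

Raw moments of X

E[X^{k}]=E\bigl[e^{k\ln X}\bigr]=e^{k\mu+k^{2}\sigma^{2}/2},\qquad k=1,2,\dots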

Limited Expected Value

For policy modifications with a per-loss limit u, the limited expected value E[X ∧ u] is closed-form for the lognormal and is one of the most heavily tested quantities on FAM and ASTAM. Klugman's Loss Models presents the formula in terms of the standard normal CDF Φ.

Limited expected value at limit u

E[X\wedge u]=e^{\mu+\sigma^{2}/2}\,\Phi\!\left(\frac{\ln u-\mu-\sigma^{2}}{\sigma}\right)+u\Bigl[1-\Phi\!\left(\tfrac{\ln u-\mu}{\sigma}\right)\Bigr]
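
The formula can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a reference implementation: the function name lognormal_lev is hypothetical, Φ is taken from the standard library's statistics.NormalDist, and the parameter values are chosen only for demonstration.

```python
import math
from statistics import NormalDist

def lognormal_lev(mu, sigma, u):
    """E[min(X, u)] for lognormal X with log-scale parameters mu, sigma and limit u > 0."""
    phi = NormalDist().cdf  # standard normal CDF
    log_u = math.log(u)
    unlimited_mean = math.exp(mu + sigma**2 / 2)
    below_limit = unlimited_mean * phi((log_u - mu - sigma**2) / sigma)
    at_limit = u * (1 - phi((log_u - mu) / sigma))
    return below_limit + at_limit

# Illustrative values (assumed): mu = 7, sigma = 1.2, per-loss limit 5,000.
print(round(lognormal_lev(7.0, 1.2, 5000.0), 2))
```

Two sanity checks are worth running on any implementation: the result should never exceed either u or the unlimited mean, and it should approach the unlimited mean as u grows.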

Maximum Likelihood Estimation

Because ln X is normal, the lognormal MLE reduces to the normal MLE applied to log-data. Let Y_i = ln X_i. Then μ̂ is the sample mean of the Y_i and σ̂^2 is the population (divide by n) variance of the Y_i.

Two cautions. First, this is the MLE for μ and σ on the log scale, not for E[X] on the original scale. If a problem asks for an estimate of E[X], compute exp(μ̂ + σ̂^2 / 2). Second, σ̂^2 with the MLE divisor n is biased; the unbiased estimator divides by n − 1 instead and is sometimes preferred outside MLE settings.

Lognormal MLE

\hat\mu=\tfrac{1}{n}\sum_{i=1}^{n}\ln x_i,\qquad \hat\sigma^{2}=\tfrac{1}{n}\sum_{i=1}^{n}(\ln x_i-\hat\mu)^{2}
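
A minimal sketch of the estimator, assuming complete (unmodified, uncensored) positive claim amounts. The function names and the sample data are hypothetical; the code also returns the n − 1 variant so the two divisors can be compared side by side.

```python
import math

def lognormal_mle(claims):
    """Return (mu_hat, sigma2_mle, sigma2_unbiased) computed from the logs of the claims."""
    logs = [math.log(x) for x in claims]
    n = len(logs)
    mu_hat = sum(logs) / n
    sum_sq_dev = sum((y - mu_hat) ** 2 for y in logs)
    sigma2_mle = sum_sq_dev / n             # MLE: divide by n
    sigma2_unbiased = sum_sq_dev / (n - 1)  # unbiased variant: divide by n - 1
    return mu_hat, sigma2_mle, sigma2_unbiased

def plug_in_mean(mu_hat, sigma2_hat):
    """Estimate E[X] by plugging the fitted log-scale parameters into the mean formula."""
    return math.exp(mu_hat + sigma2_hat / 2)

# Hypothetical claim amounts, for illustration only.
sample = [120.0, 450.0, 900.0, 2300.0]
mu_hat, s2_mle, s2_unb = lognormal_mle(sample)
print(mu_hat, s2_mle, s2_unb, plug_in_mean(mu_hat, s2_mle))
```

The plug-in mean uses the MLE variance, which keeps the whole estimate a maximum likelihood estimate by invariance.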

Worked Example: Probability And Quantile

Claim sizes are lognormal with μ = 7 and σ = 1.2 on the log scale. The probability a claim exceeds 5,000 is 1 − Φ((ln 5000 − 7) / 1.2) = 1 − Φ((8.517 − 7) / 1.2) = 1 − Φ(1.264) ≈ 0.103.

The 95th percentile is exp(μ + 1.645 σ) = exp(7 + 1.974) = exp(8.974) ≈ 7,895. The expected claim size is exp(7 + 1.44 / 2) = exp(7.72) ≈ 2,253. Because of the right skew, the mean is about twice the median exp(7) ≈ 1,097, and the 95th percentile is roughly 3.5 times the mean.
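
The arithmetic in this example can be checked with a short script. This is a sketch using the standard library's statistics.NormalDist for Φ and its inverse; because inv_cdf(0.95) returns 1.64485... rather than the table value 1.645, the percentile may differ from a hand calculation by a unit or two.

```python
import math
from statistics import NormalDist

mu, sigma = 7.0, 1.2
std_normal = NormalDist()  # mean 0, standard deviation 1

# P(X > 5000) = 1 - Phi((ln 5000 - mu) / sigma)
p_exceed = 1 - std_normal.cdf((math.log(5000) - mu) / sigma)

# 95th percentile: exponentiate the 95th percentile of ln X
q95 = math.exp(mu + std_normal.inv_cdf(0.95) * sigma)

# Mean and median on the original scale
mean = math.exp(mu + sigma**2 / 2)
median = math.exp(mu)

print(f"P(X > 5000) = {p_exceed:.4f}")
print(f"95th percentile = {q95:,.0f}")
print(f"mean = {mean:,.0f}, median = {median:,.0f}")
```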

Worked Example: MLE From Six Claims

Observed claims: 1,000, 2,500, 4,000, 800, 6,000, 1,800. Log-transform: 6.908, 7.824, 8.294, 6.685, 8.700, 7.496. Sample mean of logs is 7.651, so μ̂ = 7.651. Sum of squared deviations of the logs from the mean is 3.053; dividing by n = 6 gives σ̂² ≈ 0.509, so σ̂ ≈ 0.713.

Estimated mean claim is exp(7.651 + 0.509 / 2) = exp(7.9055) ≈ 2,712. Estimated CV is √(exp(0.509) − 1) ≈ 0.815. Both quantities are functions of the fitted log-scale parameters, not direct sample statistics of the original data.
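
The six-claim fit can be reproduced with the standard library's statistics module, as a check on the hand arithmetic. In this sketch, pvariance is the population variance (divisor n), which matches the MLE; the raw sample mean of the claims is printed alongside to show that the fitted mean is a different number.

```python
import math
from statistics import fmean, pvariance

claims = [1000, 2500, 4000, 800, 6000, 1800]
logs = [math.log(x) for x in claims]

mu_hat = fmean(logs)
sigma2_hat = pvariance(logs, mu_hat)  # divisor n, i.e. the MLE of sigma^2

fitted_mean = math.exp(mu_hat + sigma2_hat / 2)
fitted_cv = math.sqrt(math.exp(sigma2_hat) - 1)
sample_mean = fmean(claims)  # direct sample statistic, for comparison only

print(f"mu_hat={mu_hat:.3f}  sigma2_hat={sigma2_hat:.3f}")
print(f"fitted mean={fitted_mean:,.0f}  fitted CV={fitted_cv:.3f}")
print(f"raw sample mean={sample_mean:,.2f}")
```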

Lognormal Versus Gamma

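Lognormal has CV = √(exp(σ^2) − 1), which is unbounded above as σ grows. Gamma has CV = 1/√α, which becomes large only as α approaches zero, and in practice α is rarely so small that gamma can mimic the heaviest lognormal tails. Even at equal CV the gamma tail is lighter, because the gamma density decays exponentially in x while the lognormal density decays more slowly than any exponential.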
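
A small numerical comparison makes the tail difference concrete. The sketch below matches a lognormal and a gamma on both mean and CV, then compares survival probabilities at multiples of the mean. To keep the gamma survival function closed-form, it assumes α = 1 (the exponential case, CV = 1); the mean of 1,000 and all function names are hypothetical choices for illustration.

```python
import math
from statistics import NormalDist

def lognormal_from_mean_cv(mean, cv):
    """Solve for log-scale (mu, sigma) that reproduce a given mean and CV."""
    sigma2 = math.log(1 + cv**2)
    mu = math.log(mean) - sigma2 / 2
    return mu, math.sqrt(sigma2)

def lognormal_sf(x, mu, sigma):
    """Survival function P(X > x) of the lognormal."""
    return 1 - NormalDist().cdf((math.log(x) - mu) / sigma)

def exponential_sf(x, mean):
    """Survival function of gamma with alpha = 1 (exponential) and the given mean."""
    return math.exp(-x / mean)

mean = 1000.0  # assumed common mean
mu, sigma = lognormal_from_mean_cv(mean, cv=1.0)

for multiple in (3, 10, 30):
    x = multiple * mean
    print(multiple, lognormal_sf(x, mu, sigma), exponential_sf(x, mean))
```

Close to the body of the distribution the ordering can go either way, but far enough into the tail the lognormal survival probability exceeds the matched gamma's by orders of magnitude.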

If empirical claim data show a heavier upper tail than the lognormal would predict, Pareto is the next default; see /concepts/pareto-distribution/. Choosing among lognormal, gamma, Weibull, and Pareto by goodness of fit is a recurring exam pattern at the ASTAM level, and is the reason /concepts/kolmogorov-smirnov-anderson-darling/ and /concepts/model-selection-lrt-aic-bic/ exist.