Pareto Distribution

The SOA Pareto is a two-parameter heavy-tailed severity model with shape α and scale θ. Its mean exists only for α > 1 and its variance only for α > 2. That sensitivity to α is the tail-weight signal that makes the Pareto useful for excess-of-loss layers and reinsurance pricing.

Definition And Parameterization

The Pareto distribution on the SOA Loss Models tables has shape α > 0 and scale θ > 0. The support is x > 0, not x > θ. This is the Pareto Type II (Lomax) form, not the Pareto Type I that some textbooks introduce. Confusing the two changes both the support and the moment formulas.

The PDF is zero for x ≤ 0. On the support the density is strictly decreasing: it approaches its supremum α/θ as x → 0 and decays as a power of (x + θ). That power-law decay, slower than any exponential, is what makes the tail heavy.

PDF
$$f(x)=\frac{\alpha\,\theta^{\alpha}}{(x+\theta)^{\alpha+1}},\quad x>0$$
Survival function
$$S(x)=\left(\frac{\theta}{x+\theta}\right)^{\alpha}$$
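
The two formulas translate directly into code. The following is a minimal sketch in plain Python; the function names `pareto_pdf` and `pareto_sf` and the parameter values are illustrative assumptions, not part of the SOA tables.

```python
def pareto_pdf(x: float, alpha: float, theta: float) -> float:
    """Density of the Pareto Type II (Lomax) form; zero outside x > 0."""
    if x <= 0:
        return 0.0
    return alpha * theta**alpha / (x + theta) ** (alpha + 1)


def pareto_sf(x: float, alpha: float, theta: float) -> float:
    """Survival function S(x) = (theta / (x + theta))^alpha; equals 1 for x <= 0."""
    if x <= 0:
        return 1.0
    return (theta / (x + theta)) ** alpha


# Illustrative parameter values (assumed for this sketch).
alpha, theta = 3.0, 1000.0
print(pareto_sf(5000.0, alpha, theta))   # probability a loss exceeds 5,000
print(pareto_pdf(500.0, alpha, theta))   # density at 500
```

Because the support starts at 0, no location shift is needed; a Type I Pareto would instead require x > θ and a different formula.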

Moments: Tail-Weight Signals

The Pareto mean exists only when α > 1, the variance only when α > 2, and more generally the k-th moment only when α > k. For α just above 1 the mean is finite but very large relative to θ; for α between 1 and 2 the mean exists but the variance does not. This is the actuarial signature of heavy tails: large losses are not rare enough for higher moments to converge.

In excess-of-loss reinsurance pricing, fitted α values below 2 are common for catastrophe layers, which is why standard parametric variance-based capital formulas cannot be used in those layers without modification.

Mean (for α > 1)
$$E[X]=\frac{\theta}{\alpha-1}$$
Variance (for α > 2)
$$\operatorname{Var}(X)=\frac{\alpha\,\theta^{2}}{(\alpha-1)^{2}(\alpha-2)}$$
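
The existence conditions are easy to get wrong in code, so a small sketch that makes them explicit may help. The function names are hypothetical, and returning infinity whenever a moment fails to exist is a convention assumed for this sketch only.

```python
import math


def pareto_mean(alpha: float, theta: float) -> float:
    """E[X] = theta / (alpha - 1) when alpha > 1; reported as infinity otherwise."""
    return theta / (alpha - 1) if alpha > 1 else math.inf


def pareto_variance(alpha: float, theta: float) -> float:
    """Var(X) when alpha > 2; reported as infinity otherwise (sketch convention)."""
    if alpha <= 2:
        return math.inf
    return alpha * theta**2 / ((alpha - 1) ** 2 * (alpha - 2))


# Illustrative shapes on either side of the thresholds, with theta = 1,000.
for a in (0.8, 1.5, 3.0):
    print(a, pareto_mean(a, 1000.0), pareto_variance(a, 1000.0))
```

With α = 1.5 the mean is finite while the variance is not, which is the case described for catastrophe layers.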

Memoryless-Past-Threshold Property

Pareto has a useful conditional distribution. Given that X exceeds a threshold d, the excess X − d is again Pareto with the same shape α but with a shifted scale θ + d. This is why Pareto is the natural model for excess-of-loss reinsurance: data above any retention is still Pareto.

Conditional excess distribution
$$X-d\mid X>d\;\sim\;\mathrm{Pareto}(\alpha,\;\theta+d)$$
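
The property follows in one step from the survival function S(x). For any y > 0:

$$
\Pr(X-d>y\mid X>d)=\frac{S(d+y)}{S(d)}
=\frac{\bigl(\theta/(d+y+\theta)\bigr)^{\alpha}}{\bigl(\theta/(d+\theta)\bigr)^{\alpha}}
=\left(\frac{\theta+d}{y+(\theta+d)}\right)^{\alpha}
$$

The right-hand side is the Pareto survival function with shape α and scale θ + d. For α > 1 the mean excess is therefore (θ + d)/(α − 1), which increases linearly in d. That is the opposite of true memorylessness, where the mean excess would stay constant.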

Gamma-Mixed-Exponential Identity

If X given Λ is exponential with rate Λ, and Λ itself is gamma-distributed with shape α and rate θ (that is, scale 1/θ), then the unconditional X is Pareto with shape α and scale θ. This is the continuous analogue of the gamma-mixed-Poisson identity that produces the negative binomial.

The narrative is the same: heterogeneity in an underlying exponential rate, across policies or across loss types, produces an unconditional heavy-tailed distribution even though every conditional component is light-tailed. Klugman Loss Models presents this in the mixture-distributions chapter.

Exponential mixed by gamma
$$X\mid \Lambda\sim \mathrm{Exp}(\Lambda),\;\Lambda\sim \mathrm{Gamma}(\alpha,\;\text{rate}=\theta)\;\Rightarrow\; X\sim \mathrm{Pareto}(\alpha,\theta)$$
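
The identity can be checked by simulation. The sketch below uses only the Python standard library; the function name `simulate_mixture`, the seed, the sample size, and the parameter values are assumptions chosen for illustration. Note that `random.gammavariate` takes a scale parameter, so a gamma rate of θ is passed as scale 1/θ.

```python
import random


def simulate_mixture(alpha: float, theta: float, n: int, seed: int = 0) -> list[float]:
    """Draw n losses: Lambda ~ Gamma(shape alpha, rate theta), then X | Lambda ~ Exp(Lambda)."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        lam = rng.gammavariate(alpha, 1.0 / theta)  # second argument is scale = 1 / rate
        draws.append(rng.expovariate(lam))          # argument is the exponential rate
    return draws


def pareto_sf(x: float, alpha: float, theta: float) -> float:
    """Pareto (Lomax) survival function for x > 0."""
    return (theta / (x + theta)) ** alpha


alpha, theta = 3.0, 1000.0
sample = simulate_mixture(alpha, theta, n=100_000)

# Compare the empirical exceedance frequency with the Pareto survival function.
for x in (250.0, 1000.0, 3000.0):
    empirical = sum(v > x for v in sample) / len(sample)
    print(x, round(empirical, 4), round(pareto_sf(x, alpha, theta), 4))
```

The empirical and theoretical columns should agree up to sampling noise, even though every conditional draw is exponential.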

Maximum Likelihood Estimation

With scale θ known, the Pareto shape MLE has a clean closed form. Given observations x_1, ..., x_n, the MLE is the reciprocal of the average log-ratio ln((x_i + θ)/θ).

With both α and θ unknown, the likelihood equations are coupled and have no closed-form solution. Numerical methods are used in practice. ASTAM examples often hold θ fixed at a small value (or at a deductible) and ask for α̂.

Shape MLE with known scale
$$\hat\alpha=\frac{n}{\sum_{i=1}^{n}\ln\!\bigl((x_i+\theta)/\theta\bigr)}$$
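
The closed form comes from differentiating the log-likelihood in α with θ held fixed. A short derivation:

$$
\begin{aligned}
\ell(\alpha)&=\sum_{i=1}^{n}\ln f(x_i)=n\ln\alpha+n\alpha\ln\theta-(\alpha+1)\sum_{i=1}^{n}\ln(x_i+\theta)\\
\frac{d\ell}{d\alpha}&=\frac{n}{\alpha}+n\ln\theta-\sum_{i=1}^{n}\ln(x_i+\theta)
=\frac{n}{\alpha}-\sum_{i=1}^{n}\ln\!\left(\frac{x_i+\theta}{\theta}\right)=0
\end{aligned}
$$

Solving for α gives the estimator. The second derivative is −n/α² < 0, so the stationary point is a maximum.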

Worked Example: Tail Probability And Layer Pricing

Severities are Pareto with α = 3 and θ = 1,000. The probability a loss exceeds 5,000 is (1,000 / 6,000)^3 ≈ 0.00463. The mean loss is 1,000 / 2 = 500.

The expected excess over a 5,000 retention, given that the loss exceeds 5,000, is the mean of the conditional Pareto with shape 3 and scale 6,000, which is 6,000 / 2 = 3,000. The unconditional pure premium for the excess layer is 0.00463 × 3,000 ≈ 13.9.
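
The arithmetic can be reproduced in a few lines. This is a sketch of the calculation only; the variable names are illustrative.

```python
alpha, theta, d = 3.0, 1000.0, 5000.0

prob_exceed = (theta / (d + theta)) ** alpha   # S(d): probability a loss exceeds the retention
mean_loss = theta / (alpha - 1)                # E[X], valid because alpha > 1
mean_excess = (theta + d) / (alpha - 1)        # E[X - d | X > d], mean of Pareto(alpha, theta + d)
pure_premium = prob_exceed * mean_excess       # E[(X - d)+], unlimited excess over d

print(f"{prob_exceed:.5f} {mean_loss:.0f} {mean_excess:.0f} {pure_premium:.2f}")
# prints: 0.00463 500 3000 13.89
```

Carrying the unrounded exceedance probability 1/216 through the product gives 13.89, consistent with the rounded figure of 13.9.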

Worked Example: MLE From Five Losses

Observed losses with known scale θ = 500: 800, 1,500, 2,000, 4,000, 6,000. Log-ratios ln((x_i + 500)/500) are ln(2.6) = 0.956, ln(4) = 1.386, ln(5) = 1.609, ln(9) = 2.197, ln(13) = 2.565. Sum is 8.713.

The shape MLE is α̂ = 5 / 8.713 ≈ 0.574. With α̂ < 1, the fitted Pareto has an infinite mean. The fit is signaling that this small sample is consistent with a very heavy tail; a larger sample, a goodness-of-fit check, and an alternative model (lognormal, Weibull) should all be considered before any premium decision.
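
The same calculation as a short script, useful for checking the log-ratios by machine. The helper name `pareto_shape_mle` is hypothetical.

```python
import math


def pareto_shape_mle(losses: list[float], theta: float) -> float:
    """Shape MLE with known scale: n divided by the sum of ln((x + theta) / theta)."""
    log_ratios = [math.log((x + theta) / theta) for x in losses]
    return len(losses) / sum(log_ratios)


losses = [800.0, 1500.0, 2000.0, 4000.0, 6000.0]
theta = 500.0

alpha_hat = pareto_shape_mle(losses, theta)
print(round(alpha_hat, 3))

# The fitted mean is finite only if the estimated shape exceeds 1.
print("finite mean" if alpha_hat > 1 else "infinite mean")
```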

References And Official Sources