MAS-II Statistical Learning
Statistical learning is the largest MAS-II domain. It tests validation, model tuning, predictive performance, trees, clustering, PCA, regularization, and interpretation.
CAS Exam MAS-II
This page maps the current MAS-II exam page and the 2026 content outline: format, domain weights, assumed knowledge, the topic table, and the reading list that drives credibility, mixed models, statistical learning, and time series preparation.
What the official PDFs establish
- Appointment length: 4.5-hour appointment with a 4-hour exam duration.
- Scheduled break: The appointment includes a scheduled 15-minute break plus tutorial, confidentiality, and survey time.
- Assumed knowledge: Calculus, probability, linear algebra at the regression-prerequisite level, and mastery of MAS-I concepts.
Topic and domain coverage
| Topic | Weight |
|---|---|
| Introduction to Credibility | 15-25% |
| Linear Mixed Models | 10-20% |
| Statistical Learning | 40-50% |
| Time Series with Constant Variance | 15-25% |

| Cognitive level | Weight |
|---|---|
| Remember | 5-10% |
| Understand and Apply | 55-60% |
| Analyze and Evaluate | 35-40% |
| Create | 0-5% |
Chapter and reading intelligence
- Tse: Credibility work is assigned from Nonlife Actuarial Models, covering classical, Buhlmann, Buhlmann-Straub, and Bayesian credibility sections in chapters 6-9.
- West: Linear Mixed Models: A Practical Guide Using Statistical Software is assigned across all chapters, excluding coding examples, with shrinkage notes called out separately.
- James et al., Salis, and GLM Monograph: Statistical learning is anchored to ISLR sections 2.2 and 4.4.2 and chapters 8, 10, and 12, Salis chapters 3 and 10, and Chapter 7 of Generalized Linear Models for Insurance Rating.
- Cowpertwait and Metcalfe: Time series preparation uses Introductory Time Series with R chapters 1-5 excluding selected sections, plus chapter 6 and sections 7.1-7.3.
Official files used by the map
- CAS content outline: Primary source for domain weights, exam format, assumed knowledge, and official reading assignments.
Source: MAS-II Content Outline 2026
Quick Answer
The current MAS-II outline gives statistical learning a 40-50% weight, making it the largest domain. The reading map includes ISLR topics plus insurance-rating performance measures and GLM evaluation language.
This domain should be studied as a model-choice discipline, not as a list of algorithm names. The exam cares about why a model is chosen, how it is tuned, how it is validated, and what the output means for a property-casualty problem.
Validation And Tuning
Training error is not enough. MAS-II candidates should know the roles of the training, validation, and test sets, how cross-validation stands in for a fixed validation split, and why tuning parameters should be selected without touching the final test evaluation.
Overfitting is the common failure mode: the model captures sample noise and looks better in-sample than it will on new data.
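A minimal sketch of that workflow, assuming scikit-learn and simulated data (neither is required by the syllabus): the tuning parameter is chosen by cross-validation inside the training set, and the held-out test set is scored exactly once at the end.

```python
# Sketch: tune a penalty strength by cross-validation on the training
# data only, then touch the held-out test set exactly once.
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.linear_model import Ridge

X, y = make_regression(n_samples=500, n_features=20, noise=10.0, random_state=0)

# Hold out a test set that plays no role in tuning.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# 5-fold cross-validation inside the training data selects alpha.
search = GridSearchCV(
    Ridge(), param_grid={"alpha": [0.01, 0.1, 1.0, 10.0, 100.0]}, cv=5
)
search.fit(X_train, y_train)

print("selected alpha:", search.best_params_["alpha"])
print("test R^2 (evaluated once):", search.best_estimator_.score(X_test, y_test))
```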
Regularization
Regularization controls model complexity by penalizing coefficients. Ridge shrinks coefficients continuously. Lasso can set coefficients to zero and create a simpler selected-predictor model.
The exam interpretation is not only mathematical: be ready to explain the bias-variance tradeoff, in which regularization may increase bias while reducing variance and improving out-of-sample stability.
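A small illustration of the shrinkage contrast, using scikit-learn on simulated data with only three informative predictors; the penalty strengths are arbitrary and exist only to make the difference visible.

```python
# Sketch: ridge shrinks coefficients toward zero but keeps them nonzero;
# lasso can set some exactly to zero, giving a sparser selected model.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge, Lasso
from sklearn.preprocessing import StandardScaler

X, y = make_regression(n_samples=200, n_features=10, n_informative=3,
                       noise=5.0, random_state=1)
X = StandardScaler().fit_transform(X)  # penalties assume comparable predictor scales

ridge = Ridge(alpha=10.0).fit(X, y)
lasso = Lasso(alpha=5.0).fit(X, y)

print("ridge coefficients set to zero:", int(np.sum(ridge.coef_ == 0)))
print("lasso coefficients set to zero:", int(np.sum(lasso.coef_ == 0)))
```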
Trees And Ensembles
Tree methods split the predictor space into regions. Pruning and tuning control complexity. Ensemble methods improve predictive performance by combining trees, but the interpretation can become less direct.
For actuarial use, connect the method to the business task. A tree may expose segmentation logic. An ensemble may predict better. The right answer depends on validation performance, interpretability needs, and operational constraints.
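To make the tradeoff concrete, the sketch below (scikit-learn, simulated data, illustrative depth and forest size) scores a shallow, readable tree against a random forest by cross-validated RMSE; the forest usually predicts better, while the single tree is easier to explain.

```python
# Sketch: compare an interpretable single tree with a random forest
# by cross-validated root mean squared error.
from sklearn.datasets import make_regression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor

X, y = make_regression(n_samples=400, n_features=8, noise=15.0, random_state=2)

models = {
    "shallow tree (readable splits)": DecisionTreeRegressor(max_depth=3, random_state=2),
    "random forest (stronger prediction, less direct)": RandomForestRegressor(
        n_estimators=200, random_state=2
    ),
}

for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5,
                             scoring="neg_root_mean_squared_error")
    print(name, "CV RMSE:", round(-scores.mean(), 1))
```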
Unsupervised Learning
Clustering and principal components analysis do not use a response variable in the same way supervised models do. Clustering groups observations by similarity. PCA finds directions of variation in predictors.
Do not describe a cluster as a proven risk class without validation. A cluster is a data pattern first; actuarial meaning requires review against loss, exposure, business variables, and stability.
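The point is easy to demonstrate: in the sketch below (scikit-learn on purely random simulated data, with an arbitrary choice of four clusters), PCA and k-means still return components and cluster labels even though no real structure exists, which is exactly why the review step cannot be skipped.

```python
# Sketch: unsupervised output on noise. PCA reports roughly equal variance
# in every direction, and k-means still assigns every observation a cluster.
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

rng = np.random.default_rng(3)
X = rng.normal(size=(300, 6))        # stand-in for rating variables, no built-in clusters
X_std = StandardScaler().fit_transform(X)

pca = PCA(n_components=2).fit(X_std)
print("variance explained by first two components:", pca.explained_variance_ratio_)

labels = KMeans(n_clusters=4, n_init=10, random_state=3).fit_predict(X_std)
print("cluster sizes:", np.bincount(labels))
# The labels are only a data pattern; checking them against loss, exposure,
# business variables, and stability is a separate actuarial step.
```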
Predictive Performance
Model selection criteria and predictive-performance measures should be tied to the modeling target. Classification accuracy, ROC/AUC, lift, deviance, holdout error, and calibration answer different questions.
A strong MAS-II answer states what the metric measures, why it fits the task, and what limitation remains.
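As an illustration (scikit-learn, simulated binary target), the same fitted classifier can be scored several ways, each answering a different question about its holdout behavior.

```python
# Sketch: accuracy asks about classified labels, AUC about ranking quality,
# and log loss about the quality of the predicted probabilities themselves.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score, log_loss

X, y = make_classification(n_samples=1000, n_features=10, random_state=4)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=4)

model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
prob = model.predict_proba(X_te)[:, 1]   # predicted probabilities
pred = model.predict(X_te)               # hard 0/1 classifications

print("holdout accuracy:", round(accuracy_score(y_te, pred), 3))
print("holdout AUC:", round(roc_auc_score(y_te, prob), 3))
print("holdout log loss:", round(log_loss(y_te, prob), 3))
```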
Validation signal
A model has very low training error but much higher validation error. Which of the following is the most likely warning?
- Overfitting (correct): The gap says the model is fitting training-sample noise or overly specific structure.
- Perfect calibration: Calibration concerns predicted probabilities or means, not a training-validation error gap by itself.
- More exposure is always needed: More data may help, but the direct warning is model complexity relative to signal.
Original Practice Drill
A severity model with many interactions has training RMSE of 820 and validation RMSE of 1,410. A simpler regularized model has training RMSE of 930 and validation RMSE of 1,050. Which model is more defensible for production pricing and why?
A complete answer prefers the regularized model unless another business constraint changes the decision: its validation RMSE is lower (1,050 versus 1,410) and its training-validation gap is smaller (120 versus 590), so its out-of-sample performance is both better and more stable.
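Readers who want to reproduce the drill's logic can run the sketch below, which simulates severity-like data and compares a many-interaction model against a regularized one on training and validation RMSE; the data, model forms, and penalty are all illustrative assumptions, and the printed numbers will not match the drill's.

```python
# Sketch: a flexible model versus a penalized one, judged by validation RMSE
# and by the size of the training-validation gap.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(5)
X = rng.normal(size=(400, 6))
y = 1000 + 300 * X[:, 0] + 200 * X[:, 1] + rng.normal(scale=400, size=400)

X_tr, X_va, y_tr, y_va = train_test_split(X, y, test_size=0.3, random_state=5)

models = {
    "many-interaction": make_pipeline(PolynomialFeatures(degree=3), LinearRegression()),
    "regularized": make_pipeline(PolynomialFeatures(degree=3), StandardScaler(), Ridge(alpha=50.0)),
}

for name, model in models.items():
    model.fit(X_tr, y_tr)
    rmse_tr = mean_squared_error(y_tr, model.predict(X_tr)) ** 0.5
    rmse_va = mean_squared_error(y_va, model.predict(X_va)) ** 0.5
    print(f"{name}: train RMSE {rmse_tr:.0f}, validation RMSE {rmse_va:.0f}, gap {rmse_va - rmse_tr:.0f}")
```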