MAS-I Extended Linear Models

Extended linear models are the largest MAS-I domain. The work centers on GLM setup, model choice, output interpretation, offsets, interactions, diagnostics, and model-comparison criteria.

Credential side
CAS
Primary intent
MAS-I extended linear models
Best next page
Poisson Regression Model Selection Worked Example
Official Source Map

CAS Exam MAS-I

CAS exam page and content outline are mapped for domain weights, item types, cognitive levels, table conventions, and reading groups.

source map reviewed
Last verified 2026-05-14
1 official source file
No raw exam or textbook text published
Exam facts

What the official PDFs establish

Appointment length
4.5-hour appointment with a 4-hour exam duration.
Scheduled break
The appointment includes a scheduled 15-minute break plus tutorial/confidentiality/survey time.
Item types
Question formats include multiple choice, multiple selection, point and click, fill in the blank, and matching.
Weights

Topic and domain coverage

Topic | Weight
Probability Models | 20-30%
Statistics | 20-30%
Extended Linear Models | 45-55%
Cognitive level: Remember | 5-10%
Cognitive level: Understand and Apply | 55-60%
Cognitive level: Analyze and Evaluate | 35-40%
Readings

Chapter and reading intelligence

  • Official readings

    The outline lists readings from Daniel, Dobson and Barnett, Hogg/McKean/Craig, James et al., Larsen, Ross, Struppeck, and Tse.

  • Extended linear models

    This is the largest content domain and should drive the first MAS-I concept cluster.

Materials

Official files used by the map

Source note: some study materials are private references. ActuaryPath links official sources and uses original explanations instead of republishing paid or copyrighted materials.

Quick Answer

The official MAS-I outline gives extended linear models a 45-55% weight, making it the largest domain. It names model selection, link functions, response distributions, software output, parameter and ANOVA tables, predictor types, interactions, offsets, AIC, BIC, deviance, R-squared, diagnostic plots, and EDA plots.

This is the domain most likely to separate candidates who memorized definitions from candidates who can read model output and defend a modeling decision.

GLM Core

A generalized linear model has a random component, a systematic component, and a link connecting the conditional mean to the linear predictor. For actuarial pricing and frequency/severity work, the model choice should follow the behavior of the response, not just a familiar formula.

The link function does not transform the raw response in isolation. It connects the mean of the response, conditional on predictors, to the linear predictor.

GLM mean structure
$$g(\mu_i) = \eta_i = x_i^T \beta$$
Variance function shape
$$\operatorname{Var}(Y_i) = \phi V(\mu_i)$$
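
As a minimal sketch of the two formulas above, the following Python computes a mean from a linear predictor under a log link and evaluates the variance function for two common families. The function names, the coefficient values, and the dispersion value are illustrative assumptions, not taken from the MAS-I outline.

```python
import math

def linear_predictor(x, beta):
    """eta_i = x_i^T beta for a single observation."""
    return sum(xj * bj for xj, bj in zip(x, beta))

def inverse_log_link(eta):
    """Log link: g(mu) = log(mu), so mu = exp(eta)."""
    return math.exp(eta)

# Variance functions V(mu) for two common families.
VARIANCE_FUNCTIONS = {
    "poisson": lambda mu: mu,      # V(mu) = mu
    "gamma": lambda mu: mu ** 2,   # V(mu) = mu^2
}

def variance(family, mu, phi=1.0):
    """Var(Y_i) = phi * V(mu_i)."""
    return phi * VARIANCE_FUNCTIONS[family](mu)

# Hypothetical observation: intercept plus one predictor.
x = [1.0, 2.0]
beta = [-1.0, 0.25]

eta = linear_predictor(x, beta)             # -1.0 + 0.5 = -0.5
mu = inverse_log_link(eta)                  # exp(-0.5)
poisson_var = variance("poisson", mu)       # equals mu when phi = 1
gamma_var = variance("gamma", mu, phi=0.5)  # 0.5 * mu^2
```

The link is applied to the mean, never to individual responses, which is why the code only ever exponentiates the linear predictor.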

Choosing Model Structure

Model structure includes the response distribution, link function, predictors, interactions, control variables, and offsets. The exam can describe data behavior and ask what structure fits the situation.

For claim counts, a Poisson or negative binomial model may be natural, depending on whether the variance is close to the mean or clearly exceeds it. For positive, right-skewed costs, a gamma or Tweedie model is a natural candidate. For classification effects, categorical predictors and interactions need careful interpretation relative to their base levels.
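
One rough way to look at "variance behavior" for counts is to compare the sample variance with the sample mean. The sketch below does that on made-up claim counts. The data and the function name are hypothetical, and the raw ratio ignores predictors, so it is only a first screen and not a formal test.

```python
from statistics import mean, variance

def dispersion_ratio(counts):
    """Sample variance divided by sample mean of the counts."""
    return variance(counts) / mean(counts)

# Hypothetical claim counts per policy (illustrative only).
counts = [0, 0, 1, 0, 3, 0, 0, 5, 1, 0]

ratio = dispersion_ratio(counts)
# A Poisson response has variance equal to its mean, so a ratio near 1 is
# consistent with Poisson; a ratio well above 1 hints at overdispersion.
print(f"variance / mean = {ratio:.2f}")
```

A ratio well above one would point toward a negative binomial or a dispersion-adjusted model, subject to checking the same behavior after conditioning on predictors.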

Offsets And Controls

Offsets are common in actuarial GLMs because expected counts and costs usually scale with exposure. An offset enters the linear predictor with its coefficient fixed at one, so it rescales the mean without adding an estimated parameter.

A control variable is different. It has an estimated coefficient and is included to account for a known effect while judging other predictors.

Log-link count model with exposure offset
$$\log E[Y_i] = \log(e_i) + x_i^T \beta$$
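
The offset equation above can be sketched in a few lines, reading e_i as the exposure for observation i. The coefficient values and the function name are hypothetical. The point is that doubling exposure doubles the expected count, because the offset coefficient is fixed at one.

```python
import math

def expected_count(exposure, x, beta):
    """E[Y_i] under a log link with offset log(exposure).

    The offset is added to the linear predictor as-is; no coefficient
    is estimated for it.
    """
    eta = math.log(exposure) + sum(xj * bj for xj, bj in zip(x, beta))
    return math.exp(eta)

# Hypothetical fitted coefficients: intercept and one indicator.
x = [1.0, 1.0]
beta = [-2.0, 0.18]

half_year = expected_count(0.5, x, beta)
full_year = expected_count(1.0, x, beta)
# full_year / half_year is 2 up to floating-point rounding.
```

Equivalently, the model is fitting a rate: E[Y_i] / e_i = exp(x_i^T beta).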

Model Comparison

MAS-I candidates should know how AIC, BIC, deviance, and R-squared are used to evaluate competing models. The MAS-I tables give default AIC and BIC calculation conventions when an exam question does not give an alternate formula.

AIC and BIC balance fit and penalty differently. Lower values are usually preferred among comparable models, but the final answer still needs context: predictive performance, diagnostics, business reasonableness, and whether the compared models fit the same response on the same data.

CAS table convention
$$\mathrm{AIC} = -2\ell(\hat\pi; y) + 2p, \qquad \mathrm{BIC} = -2\ell(\hat\pi; y) + \log(n)\,p$$
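
A short sketch of the convention above, assuming that ℓ is the maximized log-likelihood, p is the number of estimated parameters, and n is the number of observations. The two models and their log-likelihoods are invented to show that AIC and BIC can rank models differently.

```python
import math

def aic(loglik, p):
    """AIC = -2 * loglik + 2 * p."""
    return -2.0 * loglik + 2.0 * p

def bic(loglik, p, n):
    """BIC = -2 * loglik + log(n) * p."""
    return -2.0 * loglik + math.log(n) * p

# Hypothetical models fit to the same response on the same n = 500 rows.
n = 500
models = {
    "A": (-520.0, 3),  # (maximized log-likelihood, parameter count)
    "B": (-516.5, 6),
}

scores = {name: (aic(ll, p), bic(ll, p, n)) for name, (ll, p) in models.items()}
best_by_aic = min(scores, key=lambda m: scores[m][0])
best_by_bic = min(scores, key=lambda m: scores[m][1])
```

With these numbers the larger model B has the lower AIC, while the smaller model A has the lower BIC, because log(500) is about 6.2 and so BIC charges more per parameter than AIC does.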

Diagnostics And EDA

Residual plots, marginal model plots, QQ plots, added-variable plots, boxplots, univariate plots, and histograms are not decoration. They show misspecification, outliers, nonlinearity, distribution mismatch, and variable relationships.

When a diagnostic plot is abnormal, name the problem and the modeling response. For example: nonconstant residual spread may suggest a different variance structure; curvature may suggest transformation or a new term; extreme points may need investigation rather than automatic deletion.
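
As a small illustration of tying a diagnostic to a modeling response, the sketch below computes Pearson residuals for a Poisson fit, flags large ones, and estimates dispersion. The observed counts, the fitted means, the parameter count, and the cutoff of 2 are all hypothetical choices made for the example.

```python
import math

def pearson_residuals(y, mu):
    """Poisson Pearson residuals: (y_i - mu_i) / sqrt(V(mu_i)), with V(mu) = mu."""
    return [(yi - mi) / math.sqrt(mi) for yi, mi in zip(y, mu)]

def pearson_dispersion(residuals, n_params):
    """Sum of squared Pearson residuals over residual degrees of freedom."""
    return sum(r * r for r in residuals) / (len(residuals) - n_params)

# Hypothetical observed counts and fitted means from a Poisson GLM.
y = [0, 1, 0, 2, 7, 1]
mu = [0.5, 1.0, 0.4, 1.6, 2.5, 1.0]

res = pearson_residuals(y, mu)
# Rule-of-thumb cutoff (an assumption): investigate |residual| > 2.
flagged = [i for i, r in enumerate(res) if abs(r) > 2.0]
phi_hat = pearson_dispersion(res, n_params=2)
```

Here the fifth observation is flagged for investigation, and a dispersion estimate well above one would suggest revisiting the variance structure rather than simply deleting the point.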

Output-reading mistakes

Mistake: Choosing the model with lower AIC while ignoring that the response or dataset changed.
Fix: Compare criteria only across models fit to the same response on the same data.

Mistake: Explaining a log-link coefficient as an additive dollar change.
Fix: For a log link, exponentiate the coefficient to interpret a multiplicative effect on the mean.

Mistake: Treating an offset as a predictor with an estimated slope.
Fix: Remember that an offset has coefficient fixed at one.

Original Practice Drill

A claim-count GLM uses a log link, Poisson response, territory factor, vehicle age, and log exposure offset. The coefficient for Territory B is 0.18 relative to Territory A. Interpret the coefficient, explain the offset, and name two diagnostics you would inspect before recommending the model.

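A complete answer says Territory B has an estimated multiplicative effect on the mean of exp(0.18), or about 1.20 times the Territory A expected count at the same exposure and vehicle age. It explains that the log exposure offset makes expected counts scale proportionally with exposure because its coefficient is fixed at one. It also names diagnostics tied to fit and outliers, with a stated purpose for each, rather than listing plots.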
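
The arithmetic behind the Territory B interpretation can be checked directly. Only the coefficient 0.18 comes from the drill; the variable names are illustrative.

```python
import math

coef_territory_b = 0.18  # log-scale coefficient relative to Territory A

# Exponentiate to move from the linear-predictor scale to the mean scale.
relativity = math.exp(coef_territory_b)
pct_change = (relativity - 1.0) * 100.0

print(f"Territory B relativity: {relativity:.4f}")
print(f"Expected count vs. Territory A: {pct_change:+.1f}%")
```

The effect is multiplicative, so it is read as a percentage change in expected claim count, not as a fixed number of additional claims.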
