MAS-I Extended Linear Models
Extended linear models are the largest MAS-I domain. The work centers on GLM setup, model choice, output interpretation, offsets, interactions, diagnostics, and model-comparison criteria.
CAS Exam MAS-I
The CAS exam page and content outline are mapped here for domain weights, item types, cognitive levels, table conventions, and reading groups.
What the official PDFs establish
- Appointment length: 4.5-hour appointment with a 4-hour exam duration.
- Scheduled break: The appointment includes a scheduled 15-minute break plus tutorial/confidentiality/survey time.
- Item types: Question formats include multiple choice, multiple selection, point and click, fill in the blank, and matching.
Topic and domain coverage
| Topic or cognitive level | Weight | Source |
|---|---|---|
| Probability Models | 20-30% | MAS-I Content Outline 2025, p. 2 |
| Statistics | 20-30% | MAS-I Content Outline 2025, p. 3 |
| Extended Linear Models | 45-55% | MAS-I Content Outline 2025, p. 5 |
| Cognitive level: Remember | 5-10% | MAS-I Content Outline 2025, p. 1 |
| Cognitive level: Understand and Apply | 55-60% | MAS-I Content Outline 2025, p. 1 |
| Cognitive level: Analyze and Evaluate | 35-40% | MAS-I Content Outline 2025, p. 1 |
Chapter and reading intelligence
- Official readings: The outline lists readings from Daniel, Dobson and Barnett, Hogg/McKean/Craig, James et al., Larsen, Ross, Struppeck, and Tse. (Source: MAS-I Content Outline 2025, p. 7)
- Extended linear models: This is the largest content domain and should drive the first MAS-I concept cluster. (Source: MAS-I Content Outline 2025, p. 5)
Official files used by the map
- CAS content outline: Primary source for domain weights, item types, and readings. (Source: MAS-I Content Outline 2025)
Quick Answer
The official MAS-I outline gives extended linear models a 45-55% weight, making it the largest domain. It names model selection, link functions, response distributions, software output, parameter and ANOVA tables, predictor types, interactions, offsets, AIC, BIC, deviance, R-squared, diagnostic plots, and EDA plots.
This is the domain most likely to separate candidates who memorized definitions from candidates who can read model output and defend a modeling decision.
GLM Core
A generalized linear model has a random component, a systematic component, and a link connecting the conditional mean to the linear predictor. For actuarial pricing and frequency/severity work, the model choice should follow the behavior of the response, not just a familiar formula.
The link function does not transform the raw response in isolation. It connects the mean of the response, conditional on predictors, to the linear predictor.
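The distinction between modeling the log of the mean and the mean of the logs can be sketched numerically. This is a minimal illustration that assumes a log link; the coefficients `b0` and `b1` and the response values `ys` are made up for the example and do not come from the outline.

```python
import math

# Hypothetical coefficients for a log-link GLM: log(E[Y | x]) = b0 + b1 * x
b0, b1 = -1.0, 0.5


def linear_predictor(x):
    """Systematic component: a linear function of the predictor."""
    return b0 + b1 * x


def mean_response(x):
    """Inverse link applied to the linear predictor gives the conditional mean."""
    return math.exp(linear_predictor(x))


# Illustrative responses observed at a single predictor setting
ys = [1.0, 4.0]

# What a log-link GLM targets: the log of the conditional mean
log_of_mean = math.log(sum(ys) / len(ys))

# What regressing log(y) would target instead: the mean of the logs
mean_of_logs = sum(math.log(y) for y in ys) / len(ys)

print(log_of_mean, mean_of_logs)
```

The two printed quantities differ, which is why a log-link GLM is not the same model as ordinary regression on a log-transformed response.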
Choosing Model Structure
Model structure includes the response distribution, link function, predictors, interactions, control variables, and offsets. The exam can describe data behavior and ask what structure fits the situation.
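For claim counts, a Poisson model is natural when the variance is close to the mean, and a negative binomial model is a common alternative when the variance clearly exceeds the mean. For positive, right-skewed costs, a gamma model is a common candidate, and a Tweedie model may suit aggregate losses that mix exact zeros with positive amounts. For classification effects, categorical predictors and interactions need careful interpretation.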
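One rough way to think about "variance behavior" for counts is to compare the sample variance with the sample mean. The sketch below is a crude marginal screen only, since it ignores predictors; the function name `dispersion_ratio` and the claim counts are hypothetical.

```python
from statistics import mean, variance


def dispersion_ratio(counts):
    """Sample variance divided by sample mean.

    A value near 1 is consistent with a Poisson assumption; a value well
    above 1 hints at overdispersion, where a negative binomial model may
    be worth considering.
    """
    return variance(counts) / mean(counts)


# Made-up claim counts, for illustration only
counts = [0, 0, 1, 0, 2, 0, 5, 0, 1, 7]
ratio = dispersion_ratio(counts)
print(round(ratio, 2))
```

In a fitted GLM the analogous check would use residual-based dispersion estimates rather than this raw ratio, because predictors can explain part of the apparent extra variance.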
Offsets And Controls
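Offsets are common in actuarial GLMs because exposure often changes the expected count or cost. An offset enters the linear predictor with its coefficient fixed at one, so it adjusts the scale of the mean without estimating a new coefficient. With a log link, the usual offset is the log of exposure, which makes the expected count proportional to exposure.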
A control variable is different. It has an estimated coefficient and is included to account for a known effect while judging other predictors.
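The effect of an offset can be sketched under the assumption of a log link with log exposure as the offset. The helper `expected_claims` and the value of `eta` are hypothetical and chosen only to show the proportional scaling.

```python
import math


def expected_claims(exposure, eta):
    """Mean under a log link with a log-exposure offset.

    log(mu) = 1 * log(exposure) + eta, where the coefficient on the
    offset is fixed at one and eta is the rest of the linear predictor.
    """
    return math.exp(math.log(exposure) + eta)


eta = -2.0  # hypothetical linear predictor, excluding the offset

one_year = expected_claims(1.0, eta)
half_year = expected_claims(0.5, eta)

# Halving the exposure halves the expected count; no slope is estimated
print(one_year, half_year)
```

A control variable would instead appear inside `eta` with its own estimated coefficient, so its effect on the mean is learned from the data rather than fixed in advance.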
Model Comparison
MAS-I candidates should know how AIC, BIC, deviance, and R-squared are used to evaluate competing models. The MAS-I tables give default AIC and BIC calculation conventions when an exam question does not give an alternate formula.
AIC and BIC balance fit and penalty differently. Lower values are usually preferred among comparable models, but the final answer still needs context: predictive performance, diagnostics, business reasonableness, and whether the compared models fit the same response on the same data.
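To see how the two penalties can disagree, here is a small sketch that assumes the common textbook forms AIC = 2k - 2(log-likelihood) and BIC = k ln(n) - 2(log-likelihood). The MAS-I tables may state their conventions differently, so defer to the tables on the exam. The log-likelihoods, parameter counts, and sample size are made up.

```python
import math


def aic(loglik, k):
    """Textbook AIC: penalty of 2 per estimated parameter."""
    return 2 * k - 2 * loglik


def bic(loglik, k, n):
    """Textbook BIC: penalty of ln(n) per estimated parameter."""
    return k * math.log(n) - 2 * loglik


n = 1000  # hypothetical number of observations, same data for both models

# Hypothetical fits to the same response on the same data
small = {"loglik": -520.0, "k": 4}
large = {"loglik": -515.0, "k": 7}

aic_small, aic_large = aic(**small), aic(**large)
bic_small = bic(small["loglik"], small["k"], n)
bic_large = bic(large["loglik"], large["k"], n)

print(aic_small, aic_large)  # AIC favors the larger model here
print(bic_small, bic_large)  # BIC favors the smaller model here
```

With these illustrative numbers the criteria point in opposite directions, because the BIC penalty per parameter, ln(1000), is larger than the AIC penalty of 2.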
Diagnostics And EDA
Residual plots, marginal model plots, QQ plots, added-variable plots, boxplots, univariate plots, and histograms are not decoration. They show misspecification, outliers, nonlinearity, distribution mismatch, and variable relationships.
When a diagnostic plot is abnormal, name the problem and the modeling response. For example: nonconstant residual spread may suggest a different variance structure; curvature may suggest transformation or a new term; extreme points may need investigation rather than automatic deletion.
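As one concrete way to spot extreme points, the sketch below computes Pearson residuals under a Poisson assumption, where the variance equals the mean. The observed counts, fitted means, and the cutoff of 3 are illustrative assumptions, not values from the exam material.

```python
import math


def pearson_residuals(observed, fitted):
    """Poisson Pearson residuals: (y - mu) / sqrt(mu)."""
    return [(y - mu) / math.sqrt(mu) for y, mu in zip(observed, fitted)]


# Made-up observed counts and fitted means from a hypothetical model
observed = [0, 2, 1, 9]
fitted = [0.5, 1.5, 1.0, 2.0]

residuals = pearson_residuals(observed, fitted)

# Flag points for investigation using an arbitrary cutoff of 3
flagged = [i for i, r in enumerate(residuals) if abs(r) > 3]
print(flagged)
```

A flagged index is a prompt to investigate the record, for example a data error or a missing predictor, rather than a reason to delete it automatically.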
Output-reading mistakes
| Mistake | Fix |
|---|---|
| Choosing the model with lower AIC while ignoring that the response or dataset changed. | Compare criteria only across models fit to the same response and comparable data. |
| Explaining a log-link coefficient as an additive dollar change. | For a log link, exponentiate the coefficient to interpret a multiplicative effect on the mean. |
| Treating an offset as a predictor with an estimated slope. | Remember that an offset has coefficient fixed at one. |
Original Practice Drill
A claim-count GLM uses a log link, Poisson response, territory factor, vehicle age, and log exposure offset. The coefficient for Territory B is 0.18 relative to Territory A. Interpret the coefficient, explain the offset, and name two diagnostics you would inspect before recommending the model.
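A complete answer says Territory B has an estimated multiplicative mean effect of exp(0.18), about 1.20 times the Territory A mean with the other predictors held fixed. It explains that the log exposure offset has its coefficient fixed at one, so expected counts scale proportionally with exposure. It also uses diagnostics tied to fit and outliers rather than listing plots without purpose.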
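The arithmetic behind the territory interpretation can be checked in a few lines. The coefficient 0.18 is from the drill; the variable names are hypothetical.

```python
import math

coef_territory_b = 0.18  # coefficient from the drill, relative to Territory A

# Under a log link, exponentiating gives the multiplicative effect on the mean
relativity = math.exp(coef_territory_b)

# Express the same effect as a percentage change in expected claim count
pct_change = (relativity - 1) * 100

print(round(relativity, 4), round(pct_change, 1))
```

So, holding vehicle age and exposure fixed, the model estimates Territory B's expected claim count to be roughly 20% higher than Territory A's.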