Concept

Correlation

Correlation rescales covariance so you can read direction and strength without being trapped by the original units of the variables. On Exam P, it is the cleaner way to compare dependence across different settings.

Page Contract
Role: Concept
Level: Core
Time: Reference
Freshness: Stable
Search Intent: correlation coefficient

Plain-English Definition

Correlation is covariance divided by the product of the two standard deviations. It keeps the sign of the covariance, and so the direction of the linear association, but removes the original units.

This matters because raw covariance can look large or small for reasons that have more to do with scale than with genuine association. Correlation gives a cleaner summary when the question is really about how strongly two variables move together.
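The scale effect can be checked numerically. The sketch below uses a hypothetical joint pmf (not taken from this page) and a helper named `cov_and_corr`, which is an illustrative name only. It measures X first in dollars and then in cents: the covariance grows by a factor of 100 while the correlation does not move.

```python
import math

def cov_and_corr(pmf):
    """Return (covariance, correlation) for a joint pmf {(x, y): probability}."""
    mean_x = sum(p * x for (x, _), p in pmf.items())
    mean_y = sum(p * y for (_, y), p in pmf.items())
    var_x = sum(p * (x - mean_x) ** 2 for (x, _), p in pmf.items())
    var_y = sum(p * (y - mean_y) ** 2 for (_, y), p in pmf.items())
    cov = sum(p * (x - mean_x) * (y - mean_y) for (x, y), p in pmf.items())
    return cov, cov / math.sqrt(var_x * var_y)

# Hypothetical joint pmf: X is an amount in dollars, Y is a 0/1 indicator.
pmf_dollars = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.3, (1, 1): 0.2}

# Same distribution with X re-expressed in cents.
pmf_cents = {(100 * x, y): p for (x, y), p in pmf_dollars.items()}

cov_dollars, corr_dollars = cov_and_corr(pmf_dollars)
cov_cents, corr_cents = cov_and_corr(pmf_cents)

print("covariance:", cov_dollars, "->", cov_cents)    # scales with the unit change
print("correlation:", corr_dollars, "->", corr_cents)  # unchanged up to rounding
```

Only the units of X changed between the two calls, so any difference in covariance is purely a scale effect.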

Correlation coefficient
$$\rho_{X,Y}=\frac{\operatorname{Cov}(X,Y)}{\sigma_X\sigma_Y}$$
Bounds
$$-1 \le \rho_{X,Y} \le 1$$
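
One standard way to see where the bounds come from (a sketch, not taken from this page) is to look at the variance of the sum and the difference of the standardized variables, assuming both standard deviations are positive. A variance can never be negative, so:

```latex
\begin{align*}
0 \le \operatorname{Var}\!\left(\frac{X}{\sigma_X} \pm \frac{Y}{\sigma_Y}\right)
  &= \frac{\operatorname{Var}(X)}{\sigma_X^2}
   + \frac{\operatorname{Var}(Y)}{\sigma_Y^2}
   \pm \frac{2\operatorname{Cov}(X,Y)}{\sigma_X\sigma_Y} \\
  &= 2 \pm 2\rho_{X,Y}
\end{align*}
```

Taking the plus sign gives a correlation of at least -1, and taking the minus sign gives a correlation of at most 1.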

Worked Example

Using the same discrete joint example as the covariance page, we already found Cov(X,Y)=0.05. Because Var(X)=0.25 and Var(Y)=0.21, the standard deviations are 0.50 and about 0.458.

So the correlation is 0.05 / (0.50 × 0.458), about 0.218. The sign is positive, which matches the qualitative story: higher values of X are somewhat associated with higher values of Y, but the relationship is not especially strong.
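
The arithmetic can be reproduced in a few lines. This is a minimal check that starts from the covariance and variances quoted above; the variable names are illustrative.

```python
import math

# Moments quoted in the worked example.
cov_xy = 0.05
var_x = 0.25
var_y = 0.21

sd_x = math.sqrt(var_x)  # standard deviation of X
sd_y = math.sqrt(var_y)  # standard deviation of Y

rho = cov_xy / (sd_x * sd_y)

print(round(sd_x, 3), round(sd_y, 3), round(rho, 3))  # prints 0.5 0.458 0.218
```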

Why It Matters On Exam P

The syllabus explicitly names the correlation coefficient, so it is not just a side note. It is part of the official multivariate learning outcomes alongside covariance.

Exam P questions often use correlation to test whether you understand what it does not tell you: zero correlation does not imply independence, and a nonzero correlation implies neither causation nor an exact linear relationship.

Common Mistakes

A common mistake is treating correlation as a percentage. Another is assuming that a correlation near zero means the variables are unrelated in every meaningful way. It only says there is little linear association in the standardized sense.
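
A small counterexample makes the second point concrete. The sketch below uses a hypothetical setup (not from this page): X is uniform on {-1, 0, 1} and Y = X². Y is completely determined by X, yet the covariance, and therefore the correlation, is exactly zero. Exact fractions are used so that no rounding is involved.

```python
from fractions import Fraction

# Hypothetical example: X uniform on {-1, 0, 1}, Y = X**2.
third = Fraction(1, 3)
pmf = {(-1, 1): third, (0, 0): third, (1, 1): third}

mean_x = sum(p * x for (x, _), p in pmf.items())
mean_y = sum(p * y for (_, y), p in pmf.items())
cov = sum(p * (x - mean_x) * (y - mean_y) for (x, y), p in pmf.items())

# Independence would require P(X=0, Y=0) == P(X=0) * P(Y=0).
p_x0 = sum(p for (x, _), p in pmf.items() if x == 0)
p_y0 = sum(p for (_, y), p in pmf.items() if y == 0)
joint_00 = pmf.get((0, 0), Fraction(0))
independent_at_00 = joint_00 == p_x0 * p_y0

print("Cov(X, Y) =", cov)                       # zero, so the correlation is zero
print("factorizes at (0, 0):", independent_at_00)  # False, so X and Y are dependent
```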

Candidates also sometimes forget that a correlation of exactly plus or minus one signals very rigid structure, not just a large covariance: it occurs only when one variable is an exact linear function of the other with probability 1.
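
As a quick check on why exact linear dependence produces a correlation of plus or minus one, suppose Y = aX + b with a ≠ 0 and a positive standard deviation for X. The constants a and b are illustrative and do not come from this page.

```latex
\begin{align*}
\operatorname{Cov}(X, aX+b) &= a\operatorname{Var}(X) = a\sigma_X^2, \\
\sigma_Y &= |a|\,\sigma_X, \\
\rho_{X,Y} &= \frac{a\sigma_X^2}{\sigma_X \cdot |a|\,\sigma_X} = \frac{a}{|a|} = \pm 1 .
\end{align*}
```

The sign of the correlation is the sign of the slope a, and the size of a drops out entirely, which is why a correlation of plus or minus one says nothing about how large the covariance is.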

Statistics Connection

Correlation shows up throughout statistics and ML: feature screening, covariance matrices, PCA intuition, portfolio diversification, and model diagnostics. Exam P covers only the simplest version, but it is still foundational vocabulary.

References And Official Sources