# Estimated Marginal a.k.a. Least-Squares Means
At their simplest, estimated marginal means, a.k.a. least-squares means, are just effects: predictions and associated errors computed from a model at specific points on a reference grid. The "marginal" here refers to estimation at the margins of a table, i.e. either averaging over or computing at specific values of the other variables, often their means. Effects.jl provides a convenient interface for computing EM means at all levels of categorical variables and at the means of any continuous covariates. An example will make this clear:
```julia
using DataFrames, Effects, GLM, StatsModels, StableRNGs, StandardizedPredictors

rng = StableRNG(42)
growthdata = DataFrame(; age=[13:20; 13:20],
                       sex=repeat(["male", "female"], inner=8),
                       weight=[range(100, 155; length=8); range(100, 125; length=8)] .+ randn(rng, 16))
model_scaled = lm(@formula(weight ~ 1 + sex * age), growthdata;
                  contrasts=Dict(:age => ZScore(), :sex => DummyCoding()))
emmeans(model_scaled)
```
| Row | sex | age | weight | err |
|---|---|---|---|---|
| | String | Float64 | Float64 | Float64 |
1 | female | 16.5 | 112.694 | 0.397181 |
2 | male | 16.5 | 127.682 | 0.397181 |
(This is the same example data as Interaction Terms in Effects.)
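As a sanity check on the `age` value in the table, recall that `emmeans` evaluates continuous covariates at their mean; for this dataset that is 16.5:

```julia
using Statistics

# the covariate values used above: two copies of 13:20
age = [13:20; 13:20]
mean(age)  # 16.5, the value that appears in the `age` column above
```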
Notice that we could have achieved the same result by using `effects!` and specifying the levels manually:

```julia
design = Dict(:sex => ["female", "male"], :age => 16.5)
grid = expand_grid(design)
effects!(grid, model_scaled)
```
| Row | sex | age | weight | err |
|---|---|---|---|---|
| | String | Float64 | Float64 | Float64 |
1 | female | 16.5 | 112.694 | 0.397181 |
2 | male | 16.5 | 127.682 | 0.397181 |
`emmeans` is primarily a convenience function for computing effects on a pre-defined reference grid. Notably, `emmeans` includes information by default about all variables, even the ones analyzed only at their typical values. This differs from `effects!`:

```julia
design = Dict(:sex => ["female", "male"])
grid = expand_grid(design)
effects!(grid, model_scaled)
```
| Row | sex | weight | err |
|---|---|---|---|
| | String | Float64 | Float64 |
1 | female | 112.694 | 0.397181 |
2 | male | 127.682 | 0.397181 |
## Pairwise Comparisons
EM means are also useful for post-hoc examination of pairwise differences. This is implemented via the `empairs` function:

```julia
empairs(model_scaled)
```
| Row | sex | age | weight | err |
|---|---|---|---|---|
| | String | Float64 | Float64 | Float64 |
1 | female > male | 16.5 | -14.9882 | 0.561699 |
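The numbers in this row can be recovered by hand from the EM means table above: the estimate is the difference between the two means, and the standard error is the square root of the summed squared errors (assuming the two estimates are uncorrelated, which the matching table value suggests). A quick arithmetic sketch:

```julia
# difference of the EM means: female minus male
delta = 112.694 - 127.682           # ≈ -14.988, as in the table
# standard error of the difference, assuming uncorrelated estimates
se = sqrt(0.397181^2 + 0.397181^2)  # ≈ 0.5617
```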
Or for a more advanced case:
```julia
using MixedModels
kb07 = MixedModels.dataset(:kb07)
mixed_form = @formula(rt_trunc ~ 1 + spkr * prec * load + (1|item) + (1|subj))
mixed_model = fit(MixedModel, mixed_form, kb07; progress=false)
empairs(mixed_model)
```
| Row | spkr | load | prec | rt_trunc | err |
|---|---|---|---|---|---|
| | String | String | String | Float64 | Float64 |
1 | new > old | no | break | -189.655 | 128.645 |
2 | new | no > yes | break | -158.7 | 128.645 |
3 | new > old | no > yes | break | -327.682 | 128.645 |
4 | new | no | break > maintain | 585.983 | 128.645 |
5 | new > old | no | break > maintain | 576.028 | 128.73 |
6 | new | no > yes | break > maintain | 503.358 | 128.645 |
7 | new > old | no > yes | break > maintain | 327.809 | 128.645 |
8 | old > new | no > yes | break | 30.9554 | 128.603 |
9 | old | no > yes | break | -138.027 | 128.603 |
10 | old > new | no | break > maintain | 775.638 | 128.603 |
11 | old | no | break > maintain | 765.684 | 128.688 |
12 | old > new | no > yes | break > maintain | 693.013 | 128.603 |
13 | old | no > yes | break > maintain | 517.464 | 128.603 |
14 | new > old | yes | break | -168.982 | 128.603 |
15 | new | yes > no | break > maintain | 744.683 | 128.603 |
16 | new > old | yes > no | break > maintain | 734.728 | 128.688 |
17 | new | yes | break > maintain | 662.058 | 128.603 |
18 | new > old | yes | break > maintain | 486.509 | 128.603 |
19 | old > new | yes > no | break > maintain | 913.665 | 128.603 |
20 | old | yes > no | break > maintain | 903.71 | 128.688 |
21 | old > new | yes | break > maintain | 831.04 | 128.603 |
22 | old | yes | break > maintain | 655.491 | 128.603 |
23 | new > old | no | maintain | -9.95474 | 128.688 |
24 | new | no > yes | maintain | -82.625 | 128.603 |
25 | new > old | no > yes | maintain | -258.174 | 128.603 |
26 | old > new | no > yes | maintain | -72.6703 | 128.688 |
27 | old | no > yes | maintain | -248.219 | 128.688 |
28 | new > old | yes | maintain | -175.549 | 128.603 |
If we provide a method for computing the degrees of freedom, or an appropriate value directly, then `empairs` will also generate test statistics:

```julia
empairs(model_scaled; dof=dof_residual)
```
| Row | sex | age | weight | err | dof | t | Pr(>\|t\|) |
|---|---|---|---|---|---|---|---|
| | String | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 |
1 | female > male | 16.5 | -14.9882 | 0.561699 | 12.0 | -26.6838 | 4.70971e-12 |
These degrees of freedom correspond to the denominator degrees of freedom in the associated F-tests.
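The t statistic and p-value can be reproduced by hand: divide the estimate by its standard error and look up the two-sided tail probability of a t distribution with the given degrees of freedom. A sketch using Distributions.jl with the values from the table above:

```julia
using Distributions

est, err, ν = -14.9882, 0.561699, 12.0
t = est / err                   # ≈ -26.68, as in the table
p = 2 * ccdf(TDist(ν), abs(t))  # two-sided p-value
```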
## Mixed-effects Models
For mixed-effects models, the denominator degrees of freedom are not clearly defined[GLMMFAQ] (nor is it clear that the asymptotic distribution is even F for anything beyond a few special cases[Bates2006]). The fundamental problem is that mixed models have, in some sense, "more" degrees of freedom than naive parameter counting would suggest: the conditional modes (BLUPs) are not technically parameters, but they nonetheless give the model extra flexibility for adapting to the data. We can try a few different approaches and see how they differ.
We'll use a simplified version of the model above:
```julia
mixed_form2 = @formula(rt_trunc ~ 1 + prec * load + (1|item) + (1|subj))
mixed_model2 = fit(MixedModel, mixed_form2, kb07; progress=false)
```
| | Est. | SE | z | p | σ_subj | σ_item |
|---|---|---|---|---|---|---|
| (Intercept) | 2441.6881 | 84.3815 | 28.94 | <1e-99 | 318.4042 | 363.6708 |
| prec: maintain | -676.0779 | 48.5432 | -13.93 | <1e-43 | | |
| load: yes | 148.1423 | 48.4865 | 3.06 | 0.0022 | | |
| prec: maintain & load: yes | 17.3034 | 68.5904 | 0.25 | 0.8008 | | |
| Residual | 725.2555 | | | | | |
1. the default definition in MixedModels.jl, where `dof_residual(model)` is simply `nobs(model) - dof(model)`:

```julia
empairs(mixed_model2; dof=dof_residual)
```
| Row | load | prec | rt_trunc | err | dof | t | Pr(>\|t\|) |
|---|---|---|---|---|---|---|---|
| | String | String | Float64 | Float64 | Int64 | Float64 | Float64 |
1 | no > yes | break | -148.142 | 119.322 | 1782 | -1.24153 | 0.214572 |
2 | no | break > maintain | 676.078 | 119.345 | 1782 | 5.66491 | 1.71183e-8 |
3 | no > yes | break > maintain | 510.632 | 119.322 | 1782 | 4.27945 | 1.97306e-5 |
4 | yes > no | break > maintain | 824.22 | 119.333 | 1782 | 6.90687 | 6.86976e-12 |
5 | yes | break > maintain | 658.775 | 119.31 | 1782 | 5.52152 | 3.85549e-8 |
6 | no > yes | maintain | -165.446 | 119.333 | 1782 | -1.38641 | 0.165794 |
2. the leverage formula, `nobs(model) - sum(leverage(model))` (for classical linear models, this is exactly the same as (1)):

```julia
empairs(mixed_model2; dof=nobs(mixed_model) - sum(leverage(mixed_model)))
```
| Row | load | prec | rt_trunc | err | dof | t | Pr(>\|t\|) |
|---|---|---|---|---|---|---|---|
| | String | String | Float64 | Float64 | Float64 | Float64 | Float64 |
1 | no > yes | break | -148.142 | 119.322 | 1704.63 | -1.24153 | 0.214579 |
2 | no | break > maintain | 676.078 | 119.345 | 1704.63 | 5.66491 | 1.7235e-8 |
3 | no > yes | break > maintain | 510.632 | 119.322 | 1704.63 | 4.27945 | 1.97767e-5 |
4 | yes > no | break > maintain | 824.22 | 119.333 | 1704.63 | 6.90687 | 6.97054e-12 |
5 | yes | break > maintain | 658.775 | 119.31 | 1704.63 | 5.52152 | 3.87932e-8 |
6 | no > yes | maintain | -165.446 | 119.333 | 1704.63 | -1.38641 | 0.165802 |
3. counting each conditional mode as a parameter (this is the most conservative approach, because it yields the smallest degrees of freedom):

```julia
empairs(mixed_model2; dof=nobs(mixed_model) - sum(size(mixed_model)[2:3]))
```
| Row | load | prec | rt_trunc | err | dof | t | Pr(>\|t\|) |
|---|---|---|---|---|---|---|---|
| | String | String | Float64 | Float64 | Int64 | Float64 | Float64 |
1 | no > yes | break | -148.142 | 119.322 | 1693 | -1.24153 | 0.21458 |
2 | no | break > maintain | 676.078 | 119.345 | 1693 | 5.66491 | 1.72535e-8 |
3 | no > yes | break > maintain | 510.632 | 119.322 | 1693 | 4.27945 | 1.9784e-5 |
4 | yes > no | break > maintain | 824.22 | 119.333 | 1693 | 6.90687 | 6.9866e-12 |
5 | yes | break > maintain | 658.775 | 119.31 | 1693 | 5.52152 | 3.8831e-8 |
6 | no > yes | maintain | -165.446 | 119.333 | 1693 | -1.38641 | 0.165803 |
4. infinite degrees of freedom (i.e. interpreting $t$ as $z$ and $F$ as $\chi^2$):

```julia
empairs(mixed_model2; dof=Inf)
```
| Row | load | prec | rt_trunc | err | dof | t | Pr(>\|t\|) |
|---|---|---|---|---|---|---|---|
| | String | String | Float64 | Float64 | Float64 | Float64 | Float64 |
1 | no > yes | break | -148.142 | 119.322 | Inf | -1.24153 | 0.214408 |
2 | no | break > maintain | 676.078 | 119.345 | Inf | 5.66491 | 1.47106e-8 |
3 | no > yes | break > maintain | 510.632 | 119.322 | Inf | 4.27945 | 1.87356e-5 |
4 | yes > no | break > maintain | 824.22 | 119.333 | Inf | 6.90687 | 4.9548e-12 |
5 | yes | break > maintain | 658.775 | 119.31 | Inf | 5.52152 | 3.36087e-8 |
6 | no > yes | maintain | -165.446 | 119.333 | Inf | -1.38641 | 0.16562 |
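For reference, the first three dof definitions can be computed directly from a fitted model. This is a sketch (not part of the original example) assuming the models fitted above; `size` on a `MixedModel` returns the number of observations, fixed-effects coefficients, conditional modes, and grouping factors:

```julia
n, p, q, _ = size(mixed_model2)

# (1) naive parameter counting, the MixedModels.jl default
dof_classical = nobs(mixed_model2) - dof(mixed_model2)
# (2) trace of the hat matrix
dof_leverage = nobs(mixed_model2) - sum(leverage(mixed_model2))
# (3) counting each conditional mode as a parameter
dof_conservative = n - (p + q)
```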
These values are all very similar to each other, because the $t$ distribution converges rapidly to the $z$ distribution once the degrees of freedom exceed about 30; the probability mass in the tails thus changes very little for a model with more than a thousand observations and 30+ levels of each grouping variable.
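To see this convergence concretely, compare the two-sided tail probabilities beyond |t| = 2 under t distributions of increasing degrees of freedom against the normal limit (a small sketch using Distributions.jl):

```julia
using Distributions

# two-sided tail mass beyond |t| = 2 shrinks toward the normal value
for ν in (10, 30, 100, 1000)
    println(ν, " => ", 2 * ccdf(TDist(ν), 2.0))
end
2 * ccdf(Normal(), 2.0)  # ≈ 0.0455, the z-based limit
```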
## Multiple Comparisons Correction
A perhaps more important concern is inflation of the Type I error rate through multiple testing. We can address this by providing a correction method to be applied to the p-values:
```julia
using MultipleTesting
bonferroni(pvals) = adjust(PValues(pvals), Bonferroni())
empairs(mixed_model2; dof=Inf, padjust=bonferroni)
```
| Row | load | prec | rt_trunc | err | dof | t | Pr(>\|t\|) |
|---|---|---|---|---|---|---|---|
| | String | String | Float64 | Float64 | Float64 | Float64 | Float64 |
1 | no > yes | break | -148.142 | 119.322 | Inf | -1.24153 | 1.0 |
2 | no | break > maintain | 676.078 | 119.345 | Inf | 5.66491 | 8.82634e-8 |
3 | no > yes | break > maintain | 510.632 | 119.322 | Inf | 4.27945 | 0.000112414 |
4 | yes > no | break > maintain | 824.22 | 119.333 | Inf | 6.90687 | 2.97288e-11 |
5 | yes | break > maintain | 658.775 | 119.31 | Inf | 5.52152 | 2.01652e-7 |
6 | no > yes | maintain | -165.446 | 119.333 | Inf | -1.38641 | 0.993722 |
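For intuition: Bonferroni simply multiplies each p-value by the number of tests, capping the result at 1. A hand-rolled equivalent of the MultipleTesting.jl call above (`bonferroni_manual` is a name introduced here for illustration):

```julia
# multiply each p-value by the number of tests, capping at 1
bonferroni_manual(pvals) = min.(pvals .* length(pvals), 1.0)

# the six unadjusted p-values from the dof=Inf table; the result matches
# the adjusted column above up to rounding
bonferroni_manual([0.214408, 1.47106e-8, 1.87356e-5, 4.9548e-12, 3.36087e-8, 0.16562])
```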
## A Final Note
There are many subtleties in more advanced applications of EM means. These are discussed at length in the documentation for the R package `emmeans`. For example:

- the impact of transformations and link functions
- derived covariates, e.g., including both `x` and `x^2` in a model
- weighting of cells in computing means

All of these issues are fundamental to EM means as a technique and are not particular to any software implementation. Effects.jl does not yet support everything that R's `emmeans` does, but all of the fundamental statistical concerns discussed in the latter's documentation still apply.