Chapter 18 Example 3.7: Model parameters
summary() gives more internal detail about the model fit, including estimates and standard errors of the parameters in the model for the mean.
18.1 Factor-based Models
> summary(separate.means)
Call:
lm(formula = logTime ~ temp, data = myRL)
Residuals:
Min 1Q Median 3Q Max
-0.22667 -0.03667 0.00250 0.03125 0.20333
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.43794 0.01585 90.708 < 2e-16 ***
temp1 0.49456 0.03065 16.134 < 2e-16 ***
temp2 0.19081 0.03065 6.225 5.67e-07 ***
temp3 -0.06044 0.03065 -1.972 0.0573 .
temp4 -0.24365 0.03222 -7.563 1.30e-08 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.0958 on 32 degrees of freedom
Multiple R-squared: 0.9233, Adjusted R-squared: 0.9138
F-statistic: 96.36 on 4 and 32 DF, p-value: < 2.2e-16
Note that some of the individual coefficients can be “non-significant” even within a highly significant term such as temp; here temp3 has p = 0.0573 even though temp as a whole is overwhelmingly significant.
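By the way, if you ever want this coefficient table as a plain matrix for further computation, summary() stores it in its coefficients component (coef.table here is just an arbitrary name):
> coef.table <- summary(separate.means)$coefficients  # columns: Estimate, Std. Error, t value, Pr(>|t|)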
What we call \(\widehat{\mu}\) is what R calls (Intercept) (sometimes called the Constant in other software), so \(\widehat{\mu} = 1.43794\). The temp1, temp2, etc. values are \(\widehat{\alpha}_1, \widehat{\alpha}_2, \ldots\). Thus the estimate (or prediction) for the second treatment level is \(\widehat{\mu} + \widehat{\alpha}_2 = 1.43794 + 0.19081 = 1.62875\), as we saw above.
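You can do that arithmetic directly from the coefficient vector. A quick sketch, using the full-precision coefficients (sum() conveniently drops the name attribute):
> # muhat + alphahat_2: the intercept plus the temp2 coefficient
> sum(coef(separate.means)[c(1, 3)])
[1] 1.62875
The same group means appear directly as the coefficients in the cell-means fit below.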
> summary(individual.means)
Call:
lm(formula = logTime ~ temp - 1, data = myRL)
Residuals:
Min 1Q Median 3Q Max
-0.22667 -0.03667 0.00250 0.03125 0.20333
Coefficients:
Estimate Std. Error t value Pr(>|t|)
temp175 1.93250 0.03387 57.05 <2e-16 ***
temp194 1.62875 0.03387 48.09 <2e-16 ***
temp213 1.37750 0.03387 40.67 <2e-16 ***
temp231 1.19429 0.03621 32.98 <2e-16 ***
temp250 1.05667 0.03911 27.02 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.0958 on 32 degrees of freedom
Multiple R-squared: 0.9965, Adjusted R-squared: 0.9959
F-statistic: 1808 on 5 and 32 DF, p-value: < 2.2e-16
As seen above, emmeans() does something similar (the foo::goo form says take object goo from package/library foo).
> emmeans::emmeans(separate.means,"temp")
temp emmean SE df lower.CL upper.CL
175 1.93 0.0339 32 1.864 2.00
194 1.63 0.0339 32 1.560 1.70
213 1.38 0.0339 32 1.309 1.45
231 1.19 0.0362 32 1.121 1.27
250 1.06 0.0391 32 0.977 1.14
Confidence level used: 0.95
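These are the usual t-based intervals, estimate ± \(t_{.975,32}\) × SE. As a quick check on the first row, using the full-precision estimate and standard error from the individual.means summary above:
> 1.93250 - qt(.975, 32) * 0.03387
[1] 1.863509
which agrees with the 1.864 shown, up to rounding.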
R’s default parameterization for factors sets the first treatment effect to 0. However, when you load the cfcdae package, it changes the default parameterization for factors so that the effects sum to zero. We can thus figure out the coefficient for treatment 5 as minus the sum of the other four coefficients. The model.effects() function in cfcdae makes finding all the effects a little easier. It’s not a big deal here, but it’s more compelling in more complex models.
> model.effects(separate.means,"temp")
175 194 213 231 250
0.49455952 0.19080952 -0.06044048 -0.24365476 -0.38127381
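Equivalently, you can recover that last effect by hand from the fitted coefficients:
> -sum(coef(separate.means)[2:5])  # effect for temperature 250
[1] -0.3812738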
A side-by-side plot (function sidebyside() in cfcdae) is a way to see the size of effects relative to the size of the residuals or other effects. In this simple example, you can’t see much that you couldn’t already see in the boxplots, but this approach becomes more insightful in factorial models.
> sidebyside(separate.means)

Figure 18.1: Side by side plot for resin lifetime data
We have mentioned a couple of times that R’s default is to use the “\(\widehat{\alpha}_1=0\)” parameterization, and that the cfcdae package resets that default so that the treatment effects sum to zero. We can force R to fit with the “\(\widehat{\alpha}_1=0\)” parameterization by using this somewhat awkward optional argument (or this is what you’d get if you fit the model before loading cfcdae). If you wanted to force the sum-to-zero constraint you would use contrasts=list(temp=contr.sum) instead of contr.treatment as shown here.
> out2 <- lm(logTime~temp,data=myRL,contrasts=list(temp=contr.treatment))
> summary(out2)
Call:
lm(formula = logTime ~ temp, data = myRL, contrasts = list(temp = contr.treatment))
Residuals:
Min 1Q Median 3Q Max
-0.22667 -0.03667 0.00250 0.03125 0.20333
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.93250 0.03387 57.055 < 2e-16 ***
temp2 -0.30375 0.04790 -6.341 4.06e-07 ***
temp3 -0.55500 0.04790 -11.586 5.49e-13 ***
temp4 -0.73821 0.04958 -14.889 6.13e-16 ***
temp5 -0.87583 0.05174 -16.928 < 2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.0958 on 32 degrees of freedom
Multiple R-squared: 0.9233, Adjusted R-squared: 0.9138
F-statistic: 96.36 on 4 and 32 DF, p-value: < 2.2e-16
Note that we now get coefficients for the last four levels of temp. There is no coefficient shown for temp1 because we know that it is zero when using this set of constraints on the coefficients. We also see that the (Intercept) value is the mean in the first treatment group (which it must be, since \(\alpha_1 = 0\)). A little calculation shows that \(\widehat{\alpha}_i-\widehat{\alpha}_j\) is the same for both approaches, regardless of \(i\) or \(j\). For example, for the first two levels: \(0.49456 - 0.19081 = 0.30375\) under the sum-to-zero coding, and \(0 - (-0.30375) = 0.30375\) here.
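For completeness, you could request the sum-to-zero fit explicitly (what cfcdae arranges by default) with the analogous call; a sketch, output omitted, where out3 is just an arbitrary name:
> out3 <- lm(logTime ~ temp, data = myRL, contrasts = list(temp = contr.sum))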
18.2 Polynomial Models
summary() works for functional models as well. Here we use the function coef(model) to see just the model coefficients of the standard polynomial fit of order two and the corresponding orthogonal polynomial fit.
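The models p2 and op2 were fit earlier. Judging from the formulas echoed in the output (see also the anova() comparison near the end of this chapter), and taking temp.z to be the numeric temperature with temp.z2 its square, the calls would have looked roughly like this:
> p2 <- lm(logTime ~ temp.z + temp.z2, data = myRL)
> op2 <- lm(logTime ~ poly(temp.z, degree = 2), data = myRL)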
> coef(p2)
(Intercept) temp.z temp.z2
7.417999e+00 -4.509806e-02 7.860391e-05
> coef(op2)
(Intercept) poly(temp.z, degree = 2)1 poly(temp.z, degree = 2)2
1.4651351 -1.8599091 0.2798988
Note that the coefficients are different between p2 and op2. This is because the two models use different predictors (monomials versus orthogonal polynomials), even though the overall models are equivalent quadratic (second order) models.
Here we see that the first order (linear term) coefficient is different in p1 and p2, but it is the same in op1 and op2. This is one of the claims to fame for orthogonal polynomials.
> coef(p1)
(Intercept) temp.z
3.95600752 -0.01185672
> coef(op1)
(Intercept) poly(temp.z, degree = 1)
1.465135 -1.859909
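The reason the lower-order coefficients don’t change is that the columns poly() produces are orthonormal over the observed data, so each coefficient is estimated independently of the others. A quick way to convince yourself, as a sketch (the printed result should be, numerically, the 2 × 2 identity matrix):
> # crossproduct of the orthogonal-polynomial basis columns
> round(crossprod(poly(myRL$temp.z, degree = 2)), 8)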
18.3 Over-Parameterized Models: Too Much of a Good Thing
If you have \(g\) groups, you can fit a model with up to \(g\) parameters in the mean structure, either via a factor or some functional form such as a polynomial. You can fit fewer than \(g\) (say quadratic instead of quartic), but you cannot fit more than \(g\). Here \(g=5\), so the quartic (five coefficients, counting the intercept) is the biggest polynomial available. If you try to fit more than \(g\), R won’t let you: the lm() call won’t crash, but it will toss out some of the potential predictors so that you are left with only \(g\).
Depending on which predictors it tosses, the parameters it retains might be easy to understand, or they could be completely incomprehensible. Parameters are always arbitrary, because the model you choose and how you set it up are arbitrary (factor with zero-sum effects versus factor with first coefficient 0 versus ordinary polynomial versus orthogonal polynomial versus …). Combine that now with how your particular software package decides to toss unneeded predictors, and, well, let’s just say that the fun really begins.
If you look at summary() or coef() after fitting an over-parameterized model, the predictors that lm() has tossed will show up as NA (missing). So if you try to fit the fifth power to the resin lifetime data, lm() tosses the quintic term out.
> p5 <- lm(logTime ~ temp.z + temp.z2 + temp.z3 + temp.z4 +
+ I(temp.z^5),data=myRL)
> coef(p5)
(Intercept) temp.z temp.z2 temp.z3 temp.z4
9.699492e-01 7.573323e-02 -7.648846e-04 2.600330e-06 -2.987894e-09
I(temp.z^5)
NA
> p5b <- lm(logTime ~ I(temp.z^5) + temp.z + temp.z2 + temp.z3 + temp.z4,
+ data=myRL)
> coef(p5b)
(Intercept) I(temp.z^5) temp.z temp.z2 temp.z3
2.143774e+00 -2.810813e-12 4.768732e-02 -4.979222e-04 1.334793e-06
temp.z4
NA
> p4.plus <- lm(logTime ~ temp.z + temp.z2 + temp.z3 + temp.z4 +
+ temp,data=myRL)
> coef(p4.plus)
(Intercept) temp.z temp.z2 temp.z3 temp.z4
9.699492e-01 7.573323e-02 -7.648846e-04 2.600330e-06 -2.987894e-09
temp1 temp2 temp3 temp4
NA NA NA NA
> separate.means.plus <- lm(logTime ~ temp + temp.z + temp.z2 + temp.z3 +
+ temp.z4,data=myRL)
> coef(separate.means.plus)
(Intercept) temp1 temp2 temp3 temp4 temp.z
1.43794048 0.49455952 0.19080952 -0.06044048 -0.24365476 NA
temp.z2 temp.z3 temp.z4
NA NA NA
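If you want to see exactly how a tossed predictor depends on the predictors that were retained, the alias() function in base R will display the linear dependencies; for example (output omitted):
> alias(p5)  # shows I(temp.z^5) as a linear combination of the retained terms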
anova() just pretends that the eliminated terms are not there at all. Keep in mind that these are sequential sums of squares, which is why the tables for p5 and p5b below differ even though the two fits produce identical fitted values.
> anova(p5)
Analysis of Variance Table
Response: logTime
Df Sum Sq Mean Sq F value Pr(>F)
temp.z 1 3.4593 3.4593 376.9128 < 2.2e-16 ***
temp.z2 1 0.0783 0.0783 8.5361 0.006338 **
temp.z3 1 0.0000 0.0000 0.0020 0.964399
temp.z4 1 0.0000 0.0000 0.0009 0.976258
Residuals 32 0.2937 0.0092
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> anova(p5b)
Analysis of Variance Table
Response: logTime
Df Sum Sq Mean Sq F value Pr(>F)
I(temp.z^5) 1 3.10714 3.10714 338.547 < 2.2e-16 ***
temp.z 1 0.43006 0.43006 46.858 9.627e-08 ***
temp.z2 1 0.00042 0.00042 0.046 0.8316
temp.z3 1 0.00001 0.00001 0.001 0.9751
Residuals 32 0.29369 0.00918
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> anova(p4.plus)
Analysis of Variance Table
Response: logTime
Df Sum Sq Mean Sq F value Pr(>F)
temp.z 1 3.4593 3.4593 376.9128 < 2.2e-16 ***
temp.z2 1 0.0783 0.0783 8.5361 0.006338 **
temp.z3 1 0.0000 0.0000 0.0020 0.964399
temp.z4 1 0.0000 0.0000 0.0009 0.976258
Residuals 32 0.2937 0.0092
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> anova(separate.means.plus)
Analysis of Variance Table
Response: logTime
Df Sum Sq Mean Sq F value Pr(>F)
temp 4 3.5376 0.88441 96.363 < 2.2e-16 ***
Residuals 32 0.2937 0.00918
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
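A quick way to see that these over-parameterized fits all land on the same fitted values is to difference them directly; a sketch (the result should be zero up to numerical rounding error):
> range(fitted(p5) - fitted(separate.means.plus))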
In these examples, the model that was ultimately fit was reasonably understandable, yielding the quartic model (p5 and p4.plus) or the separate means model (separate.means.plus). The terms retained in p5b are not hierarchical (the quartic term is missing), so while it is understandable, it is not a model we like to use.
> p2.plus <- lm(logTime ~ temp.z + temp.z2 + temp,data=myRL)
> coef(p2.plus)
(Intercept) temp.z temp.z2 temp1 temp2
7.469597e+00 -4.558867e-02 7.975247e-05 -1.498976e-03 1.791155e-03
temp3 temp4
NA NA
> anova(p2.plus)
Analysis of Variance Table
Response: logTime
Df Sum Sq Mean Sq F value Pr(>F)
temp.z 1 3.4593 3.4593 376.9128 < 2.2e-16 ***
temp.z2 1 0.0783 0.0783 8.5361 0.006338 **
temp 2 0.0000 0.0000 0.0015 0.998540
Residuals 32 0.2937 0.0092
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> anova(p2,separate.means)
Analysis of Variance Table
Model 1: logTime ~ temp.z + temp.z2
Model 2: logTime ~ temp
Res.Df RSS Df Sum of Sq F Pr(>F)
1 34 0.29372
2 32 0.29369 2 2.6829e-05 0.0015 0.9985
Note that the degrees of freedom and sum of squares for temp in p2.plus are the same as those for going from the quadratic model to the separate means model: the 2-df temp term in p2.plus captures exactly the lack of fit of the quadratic relative to the separate group means. That can sometimes be useful.
But for the love of heaven, don’t try to make sense out of the coefficients when you have over-parameterized by mixing polynomial and factor predictors for the same factor. It can be done, but that way lies madness. In particular, when you have over-parameterized and used too many predictors, there are infinitely many different sets of coefficients that lead to the same set of fitted values; the vast majority of those sets of coefficients are not understandable, and which set you get depends delicately on your software.