Chapter 27 Comparisons with Control

We will illustrate comparison to control using the cheese amino acid data set, which has four treatments (bacteria strain A added to the cheese, strain B added, both strains added, neither strain added).

> library(cfcdae)
> data("CheeseAminoAcid")
> CheeseAminoAcid
  strain freeAminoAcid
1   none         4.195
2   none         4.175
3      A         4.125
4      A         4.735
5      B         4.865
6      B         5.745
7     AB         6.155
8     AB         6.488
> cheese.separate <- lm(freeAminoAcid~strain,data=CheeseAminoAcid)
> anova(cheese.separate)
Analysis of Variance Table

Response: freeAminoAcid
          Df Sum Sq Mean Sq F value Pr(>F)  
strain     3 5.6279 1.87595  11.932 0.0183 *
Residuals  4 0.6289 0.15722                 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> model.effects(cheese.separate,"strain")
        A        AB         B      none 
-0.630375  1.261125  0.244625 -0.875375 

27.1 compare.to.control() in cfcdae

To compare several treatments to a control, you can use compare.to.control() in cfcdae. You need to specify which treatment (factor level) is the control treatment. For the cheese amino acid data, the “none” treatment is actually a control treatment, so it might make sense to just compare the new treatments with the control. This will require a Dunnett adjustment to the p-values.

control=4 says that the fourth level of the factor is the control level. Only AB is different at the .05 level. The ref="none" means a factor level named "none", not the absence of a reference level.

You need to get the numerical index of the control level. Here we see that “none” is the fourth level.
> levels(CheeseAminoAcid$strain)
[1] "A"    "AB"   "B"    "none"
So we compare the fourth level of strain to all of the other levels of strain. This uses the Dunnett correction for p-values and confidence intervals.
> compare.to.control(cheese.separate,strain,control=4)
            difference      lower    upper
  A - none      0.2450 -1.1895146 1.679515
  B - none      1.1200 -0.3145146 2.554515
* AB - none     2.1365  0.7019854 3.571015

Only the AB treatment with both strains added is significantly different from control at the .05 level.

A second example is the turkey poult weight gain data.

27.2 Using contrast() in package emmeans

Function contrast() in emmeans has an option to do comparison with a reference treatment (its name for the control). You begin by creating the emm object, and then use contrast() telling it the method (treatment versus control) and which level is the reference. Note that contrast() is smart enough to recognize names for levels.

> library(emmeans)
> cheese.emm <- emmeans(cheese.separate, ~strain)
> contrast(cheese.emm,method="trt.vs.ctrl",ref="none")
 contrast  estimate    SE df t.ratio p.value
 A - none     0.245 0.397  4   0.618  0.8527
 AB - none    2.136 0.397  4   5.388  0.0141
 B - none     1.120 0.397  4   2.825  0.1112

P value adjustment: dunnettx method for 3 tests 

cfcdae and emmeans use slightly different approximations to the Dunnett distribution, so the results can be slightly different.

27.3 Compare to Best

Comparing to the best treatment is kind of like comparing to a control treatment, except that we don’t know ahead of time which factor level is best (either lowest or highest response depending on the situation), so we don’t know ahead of time what the control level would be. In addition, comparing to the best is a one-sided comparison, not a two-sided comparison.

We illustrate with the weed control in soybeans data set. Fourteen weed control treatments were randomly applied to 56 plots, four plots per treatment. The plots are later assessed visually for weeds, with the response being the percent of the plot that is non-weeds.

Note we look at a transformation of the response. The original response was percent non-weed; that should be high for a good treatment. In the transformed response, low values should be good.

The single mean model is soundly rejected.
> data(WeedControl)
> weeds.separate <- lm(sqrt(100-pct.control)~treatment,data=WeedControl)
> anova(weeds.separate)
Analysis of Variance Table

Response: sqrt(100 - pct.control)
          Df  Sum Sq Mean Sq F value    Pr(>F)    
treatment 13 105.960  8.1508  14.907 9.383e-12 ***
Residuals 42  22.965  0.5468                      
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

What we really want to do is to determine which of these treatments is likely to be best. So we compare to best with a one-sided Dunnett. We have to add lowisbest=TRUE to make it compare with the smallest mean, otherwise it would compare with the largest mean.

At the .99 level, only 5 treatments (5, 9, 12, 13, 14) can be distinguished from the best, so with probability .99 the best treatment is one of the other nine.
> compare.to.best(weeds.separate,treatment,conf=.99,lowisbest=TRUE)
            difference allowance
* 14 - 1  4.754856e+00  1.717429
* 13 - 1  3.988022e+00  1.717429
* 12 - 1  3.114510e+00  1.717429
* 5 - 1   1.941476e+00  1.717429
* 9 - 1   1.847263e+00  1.717429
  3 - 1   1.680337e+00  1.717429
  2 - 1   1.615710e+00  1.717429
  4 - 1   1.543076e+00  1.717429
  8 - 1   1.519293e+00  1.717429
  10 - 1  6.180340e-01  1.717429
  7 - 1   6.180340e-01  1.717429
  6 - 1   4.125704e-01  1.717429
  11 - 1  4.218847e-15  1.717429
best is 1 0.000000e+00        NA