Chapter 43 Confounding

43.1 conf.design

It might not surprise you to learn that the package conf.design contains tools for working with designs with confounding.

> library(cfcdae)
Registered S3 method overwritten by 'DoE.base':
  method           from       
  factorize.factor conf.design
> library(conf.design)
> library(car)
Loading required package: carData

43.1.1 Generating designs

conf.design uses matrices to denote generators for designs. Each row of a matrix corresponds to a generator, and the columns correspond to all the factors in the design, regardless of how many are in the generator. A 1 indicates that the factor is in the generator, and a 0 indicates it is not.

Here is a matrix for a \(2^4\) design with ABD as generator.

> ABD.4 <- t(c(1,1,0,1))
> ABD.4
     [,1] [,2] [,3] [,4]
[1,]    1    1    0    1

Here is a matrix for a \(2^4\) design with ABD and CD as generators. We show a variety of ways of creating the same matrix. I think the first is easiest to think about, but to each his own.

> ABD.CD.4 <- rbind(c(1,1,0,1),c(0,0,1,1))
> ABD.CD.4
     [,1] [,2] [,3] [,4]
[1,]    1    1    0    1
[2,]    0    0    1    1
> ABD.CD.4 <- matrix(c(1,0,1,0,0,1,1,1),nrow=2)
> ABD.CD.4
     [,1] [,2] [,3] [,4]
[1,]    1    1    0    1
[2,]    0    0    1    1
> ABD.CD.4 <- matrix(c(1,1,0,1,0,0,1,1),nrow=2,byrow=TRUE)
> ABD.CD.4
     [,1] [,2] [,3] [,4]
[1,]    1    1    0    1
[2,]    0    0    1    1

This is for ABD and ABCD; we know that this is a bad idea.

> ABD.ABCD.4 <- rbind(c(1,1,0,1),c(1,1,1,1))
> ABD.ABCD.4
     [,1] [,2] [,3] [,4]
[1,]    1    1    0    1
[2,]    1    1    1    1

And lets make a big one. Use ABCE, ABDF, ACDG, and BCDH as generators in a \(2^8\). Note that we can name the variables, which could save us some bother later on.

> big8 <- rbind(c(A=1,B=1,C=1,D=0,E=1,F=0,G=0,H=0),c(1,1,0,1,0,1,0,0),c(1,0,1,1,0,0,1,0), c(0,1,1,1,0,0,0,1))
> big8
     A B C D E F G H
[1,] 1 1 1 0 1 0 0 0
[2,] 1 1 0 1 0 1 0 0
[3,] 1 0 1 1 0 0 1 0
[4,] 0 1 1 1 0 0 0 1

When setting up any confounded design, we need to know what we are confounding. conf.set() takes a matrix of generators and returns all of the confounded effects. The 2 as a second argument says that we are in a two-series design.

OK, it’s pretty straitforward to figure out what is confounded when we only have one generator.

> conf.set(ABD.4,2)
[1] 1 1 0 1

And it’s really pretty easy when there are only two generators, because there is only one generalized interaction.

> conf.set(ABD.CD.4,2)
     [,1] [,2] [,3] [,4]
[1,]    1    1    0    1
[2,]    0    0    1    1
[3,]    1    1    1    0

And this shows that if you use ABD and ABCD you also confound the main effect of C. Bad idea.

> conf.set(ABD.ABCD.4,2)
     [,1] [,2] [,3] [,4]
[1,]    1    1    0    1
[2,]    1    1    1    1
[3,]    0    0    1    0

When we have four generators creating 16 blocks in a \(2^8\), then conf.set() really helps us.

> conf.set(big8,2)
      A B C D E F G H
 [1,] 1 1 1 0 1 0 0 0
 [2,] 1 1 0 1 0 1 0 0
 [3,] 0 0 1 1 1 1 0 0
 [4,] 1 0 1 1 0 0 1 0
 [5,] 0 1 0 1 1 0 1 0
 [6,] 0 1 1 0 0 1 1 0
 [7,] 1 0 0 0 1 1 1 0
 [8,] 0 1 1 1 0 0 0 1
 [9,] 1 0 0 1 1 0 0 1
[10,] 1 0 1 0 0 1 0 1
[11,] 0 1 0 0 1 1 0 1
[12,] 1 1 0 0 0 0 1 1
[13,] 0 0 1 0 1 0 1 1
[14,] 0 0 0 1 0 1 1 1
[15,] 1 1 1 1 1 1 1 1

If you don’t like going from 0/1 to the letter form, you can have R do it without too much stress.

> letter.effect <- function(binary) {
+ Letters <- LETTERS[1:length(binary)]
+ paste(Letters[binary>0],collapse=":")
+ }
> apply(conf.set(big8,2),1,letter.effect)
 [1] "A:B:C:E"         "A:B:D:F"         "C:D:E:F"         "A:C:D:G"        
 [5] "B:D:E:G"         "B:C:F:G"         "A:E:F:G"         "B:C:D:H"        
 [9] "A:D:E:H"         "A:C:F:H"         "B:E:F:H"         "A:B:G:H"        
[13] "C:E:G:H"         "D:F:G:H"         "A:B:C:D:E:F:G:H"

In addition to finding the generalized interactions, we can use conf.design() to determine the actual design.

With just two blocks, blocks are indicated by a single 0/1. Levels of each treatment are also indicated by 0/1. Thus block 0 below includes (1), ab, c, abc, ad, bd, acd, and bcd.

> conf.design(ABD.4,2)
   Blocks T1 T2 T3 T4
1       0  0  0  0  0
2       0  1  1  0  0
3       0  0  0  1  0
4       0  1  1  1  0
5       0  1  0  0  1
6       0  0  1  0  1
7       0  1  0  1  1
8       0  0  1  1  1
9       1  1  0  0  0
10      1  0  1  0  0
11      1  1  0  1  0
12      1  0  1  1  0
13      1  0  0  0  1
14      1  1  1  0  1
15      1  0  0  1  1
16      1  1  1  1  1

With four blocks, blocks are indicated by pairs 00, 01, 10, or 11 corresponding to the two generators. Below, the 01 block includes c, abc, ad, bd.

> conf.design(ABD.CD.4,2)
   Blocks T1 T2 T3 T4
1      00  0  0  0  0
2      00  1  1  0  0
3      00  1  0  1  1
4      00  0  1  1  1
5      01  0  0  1  0
6      01  1  1  1  0
7      01  1  0  0  1
8      01  0  1  0  1
9      10  1  0  0  0
10     10  0  1  0  0
11     10  0  0  1  1
12     10  1  1  1  1
13     11  1  0  1  0
14     11  0  1  1  0
15     11  0  0  0  1
16     11  1  1  0  1

And now the big one. At least with my defaults, R won’t even print out the entire design. Sixteen blocks indicated by 0000, 0001, 0010, 0011, … 1111.

> conf.design(big8,2)
    Blocks A B C D E F G H
1     0000 0 0 0 0 0 0 0 0
2     0000 1 1 1 0 1 0 0 0
3     0000 1 1 0 1 0 1 0 0
4     0000 0 0 1 1 1 1 0 0
5     0000 1 0 1 1 0 0 1 0
6     0000 0 1 0 1 1 0 1 0
7     0000 0 1 1 0 0 1 1 0
8     0000 1 0 0 0 1 1 1 0
9     0000 0 1 1 1 0 0 0 1
10    0000 1 0 0 1 1 0 0 1
11    0000 1 0 1 0 0 1 0 1
12    0000 0 1 0 0 1 1 0 1
13    0000 1 1 0 0 0 0 1 1
14    0000 0 0 1 0 1 0 1 1
15    0000 0 0 0 1 0 1 1 1
16    0000 1 1 1 1 1 1 1 1
17    0001 0 1 1 1 0 0 0 0
18    0001 1 0 0 1 1 0 0 0
19    0001 1 0 1 0 0 1 0 0
20    0001 0 1 0 0 1 1 0 0
21    0001 1 1 0 0 0 0 1 0
22    0001 0 0 1 0 1 0 1 0
23    0001 0 0 0 1 0 1 1 0
24    0001 1 1 1 1 1 1 1 0
25    0001 0 0 0 0 0 0 0 1
26    0001 1 1 1 0 1 0 0 1
27    0001 1 1 0 1 0 1 0 1
28    0001 0 0 1 1 1 1 0 1
29    0001 1 0 1 1 0 0 1 1
30    0001 0 1 0 1 1 0 1 1
31    0001 0 1 1 0 0 1 1 1
32    0001 1 0 0 0 1 1 1 1
33    0010 1 0 1 1 0 0 0 0
34    0010 0 1 0 1 1 0 0 0
35    0010 0 1 1 0 0 1 0 0
36    0010 1 0 0 0 1 1 0 0
37    0010 0 0 0 0 0 0 1 0
38    0010 1 1 1 0 1 0 1 0
39    0010 1 1 0 1 0 1 1 0
40    0010 0 0 1 1 1 1 1 0
41    0010 1 1 0 0 0 0 0 1
42    0010 0 0 1 0 1 0 0 1
43    0010 0 0 0 1 0 1 0 1
44    0010 1 1 1 1 1 1 0 1
45    0010 0 1 1 1 0 0 1 1
46    0010 1 0 0 1 1 0 1 1
47    0010 1 0 1 0 0 1 1 1
48    0010 0 1 0 0 1 1 1 1
49    0011 1 1 0 0 0 0 0 0
50    0011 0 0 1 0 1 0 0 0
51    0011 0 0 0 1 0 1 0 0
52    0011 1 1 1 1 1 1 0 0
53    0011 0 1 1 1 0 0 1 0
54    0011 1 0 0 1 1 0 1 0
55    0011 1 0 1 0 0 1 1 0
56    0011 0 1 0 0 1 1 1 0
57    0011 1 0 1 1 0 0 0 1
58    0011 0 1 0 1 1 0 0 1
59    0011 0 1 1 0 0 1 0 1
60    0011 1 0 0 0 1 1 0 1
61    0011 0 0 0 0 0 0 1 1
62    0011 1 1 1 0 1 0 1 1
63    0011 1 1 0 1 0 1 1 1
64    0011 0 0 1 1 1 1 1 1
65    0100 1 1 0 1 0 0 0 0
66    0100 0 0 1 1 1 0 0 0
67    0100 0 0 0 0 0 1 0 0
68    0100 1 1 1 0 1 1 0 0
69    0100 0 1 1 0 0 0 1 0
70    0100 1 0 0 0 1 0 1 0
71    0100 1 0 1 1 0 1 1 0
72    0100 0 1 0 1 1 1 1 0
73    0100 1 0 1 0 0 0 0 1
74    0100 0 1 0 0 1 0 0 1
75    0100 0 1 1 1 0 1 0 1
76    0100 1 0 0 1 1 1 0 1
77    0100 0 0 0 1 0 0 1 1
78    0100 1 1 1 1 1 0 1 1
79    0100 1 1 0 0 0 1 1 1
80    0100 0 0 1 0 1 1 1 1
81    0101 1 0 1 0 0 0 0 0
82    0101 0 1 0 0 1 0 0 0
83    0101 0 1 1 1 0 1 0 0
84    0101 1 0 0 1 1 1 0 0
85    0101 0 0 0 1 0 0 1 0
86    0101 1 1 1 1 1 0 1 0
87    0101 1 1 0 0 0 1 1 0
88    0101 0 0 1 0 1 1 1 0
89    0101 1 1 0 1 0 0 0 1
90    0101 0 0 1 1 1 0 0 1
91    0101 0 0 0 0 0 1 0 1
92    0101 1 1 1 0 1 1 0 1
93    0101 0 1 1 0 0 0 1 1
94    0101 1 0 0 0 1 0 1 1
95    0101 1 0 1 1 0 1 1 1
96    0101 0 1 0 1 1 1 1 1
97    0110 0 1 1 0 0 0 0 0
98    0110 1 0 0 0 1 0 0 0
99    0110 1 0 1 1 0 1 0 0
100   0110 0 1 0 1 1 1 0 0
101   0110 1 1 0 1 0 0 1 0
102   0110 0 0 1 1 1 0 1 0
103   0110 0 0 0 0 0 1 1 0
104   0110 1 1 1 0 1 1 1 0
105   0110 0 0 0 1 0 0 0 1
106   0110 1 1 1 1 1 0 0 1
107   0110 1 1 0 0 0 1 0 1
108   0110 0 0 1 0 1 1 0 1
109   0110 1 0 1 0 0 0 1 1
110   0110 0 1 0 0 1 0 1 1
111   0110 0 1 1 1 0 1 1 1
112   0110 1 0 0 1 1 1 1 1
113   0111 0 0 0 1 0 0 0 0
114   0111 1 1 1 1 1 0 0 0
115   0111 1 1 0 0 0 1 0 0
116   0111 0 0 1 0 1 1 0 0
117   0111 1 0 1 0 0 0 1 0
118   0111 0 1 0 0 1 0 1 0
119   0111 0 1 1 1 0 1 1 0
120   0111 1 0 0 1 1 1 1 0
121   0111 0 1 1 0 0 0 0 1
122   0111 1 0 0 0 1 0 0 1
123   0111 1 0 1 1 0 1 0 1
124   0111 0 1 0 1 1 1 0 1
125   0111 1 1 0 1 0 0 1 1
126   0111 0 0 1 1 1 0 1 1
127   0111 0 0 0 0 0 1 1 1
128   0111 1 1 1 0 1 1 1 1
129   1000 1 1 1 0 0 0 0 0
130   1000 0 0 0 0 1 0 0 0
131   1000 0 0 1 1 0 1 0 0
132   1000 1 1 0 1 1 1 0 0
133   1000 0 1 0 1 0 0 1 0
134   1000 1 0 1 1 1 0 1 0
135   1000 1 0 0 0 0 1 1 0
136   1000 0 1 1 0 1 1 1 0
137   1000 1 0 0 1 0 0 0 1
138   1000 0 1 1 1 1 0 0 1
139   1000 0 1 0 0 0 1 0 1
140   1000 1 0 1 0 1 1 0 1
141   1000 0 0 1 0 0 0 1 1
142   1000 1 1 0 0 1 0 1 1
143   1000 1 1 1 1 0 1 1 1
144   1000 0 0 0 1 1 1 1 1
145   1001 1 0 0 1 0 0 0 0
146   1001 0 1 1 1 1 0 0 0
147   1001 0 1 0 0 0 1 0 0
148   1001 1 0 1 0 1 1 0 0
149   1001 0 0 1 0 0 0 1 0
150   1001 1 1 0 0 1 0 1 0
151   1001 1 1 1 1 0 1 1 0
152   1001 0 0 0 1 1 1 1 0
153   1001 1 1 1 0 0 0 0 1
154   1001 0 0 0 0 1 0 0 1
155   1001 0 0 1 1 0 1 0 1
156   1001 1 1 0 1 1 1 0 1
157   1001 0 1 0 1 0 0 1 1
158   1001 1 0 1 1 1 0 1 1
159   1001 1 0 0 0 0 1 1 1
160   1001 0 1 1 0 1 1 1 1
161   1010 0 1 0 1 0 0 0 0
162   1010 1 0 1 1 1 0 0 0
163   1010 1 0 0 0 0 1 0 0
164   1010 0 1 1 0 1 1 0 0
165   1010 1 1 1 0 0 0 1 0
166   1010 0 0 0 0 1 0 1 0
167   1010 0 0 1 1 0 1 1 0
168   1010 1 1 0 1 1 1 1 0
169   1010 0 0 1 0 0 0 0 1
170   1010 1 1 0 0 1 0 0 1
171   1010 1 1 1 1 0 1 0 1
172   1010 0 0 0 1 1 1 0 1
173   1010 1 0 0 1 0 0 1 1
174   1010 0 1 1 1 1 0 1 1
175   1010 0 1 0 0 0 1 1 1
176   1010 1 0 1 0 1 1 1 1
177   1011 0 0 1 0 0 0 0 0
178   1011 1 1 0 0 1 0 0 0
179   1011 1 1 1 1 0 1 0 0
180   1011 0 0 0 1 1 1 0 0
181   1011 1 0 0 1 0 0 1 0
182   1011 0 1 1 1 1 0 1 0
183   1011 0 1 0 0 0 1 1 0
184   1011 1 0 1 0 1 1 1 0
185   1011 0 1 0 1 0 0 0 1
186   1011 1 0 1 1 1 0 0 1
187   1011 1 0 0 0 0 1 0 1
188   1011 0 1 1 0 1 1 0 1
189   1011 1 1 1 0 0 0 1 1
190   1011 0 0 0 0 1 0 1 1
191   1011 0 0 1 1 0 1 1 1
192   1011 1 1 0 1 1 1 1 1
193   1100 0 0 1 1 0 0 0 0
194   1100 1 1 0 1 1 0 0 0
195   1100 1 1 1 0 0 1 0 0
196   1100 0 0 0 0 1 1 0 0
197   1100 1 0 0 0 0 0 1 0
198   1100 0 1 1 0 1 0 1 0
199   1100 0 1 0 1 0 1 1 0
200   1100 1 0 1 1 1 1 1 0
201   1100 0 1 0 0 0 0 0 1
202   1100 1 0 1 0 1 0 0 1
203   1100 1 0 0 1 0 1 0 1
204   1100 0 1 1 1 1 1 0 1
205   1100 1 1 1 1 0 0 1 1
206   1100 0 0 0 1 1 0 1 1
207   1100 0 0 1 0 0 1 1 1
208   1100 1 1 0 0 1 1 1 1
209   1101 0 1 0 0 0 0 0 0
210   1101 1 0 1 0 1 0 0 0
211   1101 1 0 0 1 0 1 0 0
212   1101 0 1 1 1 1 1 0 0
213   1101 1 1 1 1 0 0 1 0
214   1101 0 0 0 1 1 0 1 0
215   1101 0 0 1 0 0 1 1 0
216   1101 1 1 0 0 1 1 1 0
217   1101 0 0 1 1 0 0 0 1
218   1101 1 1 0 1 1 0 0 1
219   1101 1 1 1 0 0 1 0 1
220   1101 0 0 0 0 1 1 0 1
221   1101 1 0 0 0 0 0 1 1
222   1101 0 1 1 0 1 0 1 1
223   1101 0 1 0 1 0 1 1 1
224   1101 1 0 1 1 1 1 1 1
225   1110 1 0 0 0 0 0 0 0
226   1110 0 1 1 0 1 0 0 0
227   1110 0 1 0 1 0 1 0 0
228   1110 1 0 1 1 1 1 0 0
229   1110 0 0 1 1 0 0 1 0
230   1110 1 1 0 1 1 0 1 0
231   1110 1 1 1 0 0 1 1 0
232   1110 0 0 0 0 1 1 1 0
233   1110 1 1 1 1 0 0 0 1
234   1110 0 0 0 1 1 0 0 1
235   1110 0 0 1 0 0 1 0 1
236   1110 1 1 0 0 1 1 0 1
237   1110 0 1 0 0 0 0 1 1
238   1110 1 0 1 0 1 0 1 1
239   1110 1 0 0 1 0 1 1 1
240   1110 0 1 1 1 1 1 1 1
241   1111 1 1 1 1 0 0 0 0
242   1111 0 0 0 1 1 0 0 0
243   1111 0 0 1 0 0 1 0 0
244   1111 1 1 0 0 1 1 0 0
245   1111 0 1 0 0 0 0 1 0
246   1111 1 0 1 0 1 0 1 0
247   1111 1 0 0 1 0 1 1 0
248   1111 0 1 1 1 1 1 1 0
249   1111 1 0 0 0 0 0 0 1
250   1111 0 1 1 0 1 0 0 1
251   1111 0 1 0 1 0 1 0 1
252   1111 1 0 1 1 1 1 0 1
253   1111 0 0 1 1 0 0 1 1
254   1111 1 1 0 1 1 0 1 1
255   1111 1 1 1 0 0 1 1 1
256   1111 0 0 0 0 1 1 1 1

43.2 Analysis of confounded designs

43.2.1 Speed bump data

These data are adapted from Khademi, et al (2013). They’re trying to figure out where to put speed bumps. The factors are number of passengers (1 or 5), car speed (10 or 30 kph), distance from bump to stop point (10 or 30 m), and surface inclincation (0 or 7 degrees). The response is time to reach the stop point. The experiment was run over two days (pun very much intended), confounding the four factor interaction.

> seconds <- c(3.45,3.36,1.41,1.44,6.21,6.69,3.14,3.22,3.68,3.89,1.76,1.81,8.39,8.31,3.23,3.30)
> pass <- factor(rep(c(1,5),length=16))
> speed <- factor(rep(c(10,30),each=2,length=16))
> dist <- factor(rep(c(10,20),each=4,length=16))
> incl <- factor(rep(c(0,7),each=8,length=16))

Looking at the TwoSeriesPlots(), Basso-Salmaso and PSE agree that everything that does not involve pass is significant at .01. In particular, the four-factor interaction (confounding with blocks) is not a “big” term that we need to ignore.

> bump.lm <- lm(seconds~pass*speed*dist*incl)
> set.seed(12345)
> TwoSeriesPlots(bump.lm,pse=TRUE,alpha=.01)
B set to 13934
Half normal plot of speed bump data

Figure 43.1: Half normal plot of speed bump data

Looking at the residual plot from the model including speed:dist shows nonconstant variance.

> plot(lm(seconds~speed*dist*incl),which=1)
Residual plot for two-way model of speed bump data

Figure 43.2: Residual plot for two-way model of speed bump data

Box-Cox says to use a power between -1 (reciprocal, or average speed until stop) and 0 (log).

> boxCox(lm(seconds~speed*dist*incl))
Box-Cox for two-way model of speed bump data

Figure 43.3: Box-Cox for two-way model of speed bump data

Looking again at the log scale brings a surprise. The speed:dist:incl three-way is significant; hierarchy would say we need to include everything in our model except terms involving pass. (In some simulation runs, only the three main effects and the three-way show up as significant; in other runs, everthing not involving pass is significant.)

> logbump.lm <- lm(log(seconds)~pass*speed*dist*incl)
> TwoSeriesPlots(logbump.lm)
B set to 3000

Here’s what that interaction looks like. Inclination has very little effect when both speed and distance are high, or both speed and distance are low. But it does have an effect when one is high and the other is low.

> interactplot(speed:dist,incl,log(seconds))
Interaction plot for bump data

Figure 43.4: Interaction plot for bump data

43.3 DNPK data

These data are from a \(2^4\) design replicated twice, blocked into four blocks of size 8 with dnpk confounded with blocks in both replicates. dnpk is thus completely confounded. Data are from Cochran and Cox.

Notice that in the first block of each replication, there are always an odd number of factors at the high level (either 1 or 3), whereas in block 2 of each replication there is always an even number of factors at the high level (0, 2, or 4).

> dnpk <- read.table("dnpk.dat.txt",header=TRUE)
> names(dnpk)
[1] "d"     "n"     "p"     "k"     "block" "rpl"   "yield"
> dnpk <- within(dnpk,{d <- factor(d);n <- factor(n);
+     p <- factor(p);k <- factor(k)})
> dnpk <- within(dnpk,{block <- factor(block);rpl<-factor(rpl)})
> dnpk
   d n p k block rpl yield
1  1 1 2 1     1   1    45
2  1 1 1 2     1   1    55
3  2 1 1 1     1   1    53
4  1 2 2 2     1   1    36
5  2 2 1 2     1   1    41
6  2 2 2 1     1   1    48
7  2 1 2 2     1   1    55
8  1 2 1 1     1   1    42
9  2 1 2 1     2   1    50
10 1 2 1 2     2   1    44
11 2 1 1 2     2   1    43
12 1 1 2 2     2   1    51
13 2 2 2 2     2   1    44
14 1 1 1 1     2   1    58
15 2 2 1 1     2   1    41
16 1 2 2 1     2   1    50
17 1 1 2 1     1   2    39
18 1 1 1 2     1   2    50
19 2 1 1 1     1   2    42
20 1 2 2 2     1   2    43
21 2 2 1 2     1   2    34
22 2 2 2 1     1   2    52
23 2 1 2 2     1   2    44
24 1 2 1 1     1   2    47
25 2 1 2 1     2   2    52
26 1 2 1 2     2   2    43
27 2 1 1 2     2   2    52
28 1 1 2 2     2   2    56
29 2 2 2 2     2   2    54
30 1 1 1 1     2   2    57
31 2 2 1 1     2   2    42
32 1 2 2 1     2   2    39

Here are the basic model and ANOVA.
These data were set up with blocks numbered 1 and 2 in each replication, so the replication by block “interaction” actually enumerates all four blocks, with 3 degrees of freedom between the four blocks. We want treatments adjusted for blocks, and we did not quite get it here, because R wants to put two factor terms (in this case, all blocks is a two factor term) after main effects. In this case is does not matter, but in other cases it might. In those cases, we need to use the terms() function to get the terms in the order we want, or we need to make a single factor to enumerate all of the blocks (perhaps using join()).

Note that the four factor interaction dnpk does not even show up in this table. That is because it is confounded with blocks within each replication and has 0 degrees of freedom. It cannot be estimated because it is completely confounded with blocks.

> dnpk.lm <- lm(yield~rpl:block+d*n*p*k,data=dnpk)
> anova(dnpk.lm)
Analysis of Variance Table

Response: yield
          Df Sum Sq Mean Sq F value   Pr(>F)   
d          1   2.00    2.00  0.0824 0.778258   
n          1 325.12  325.12 13.3974 0.002572 **
p          1   6.12    6.12  0.2524 0.623205   
k          1   4.50    4.50  0.1854 0.673303   
rpl:block  3 126.37   42.12  1.7358 0.205538   
d:n        1  32.00   32.00  1.3186 0.270083   
d:p        1 242.00  242.00  9.9720 0.006982 **
n:p        1  78.13   78.13  3.2193 0.094393 . 
d:k        1   6.13    6.13  0.2524 0.623205   
n:k        1  32.00   32.00  1.3186 0.270083   
p:k        1  24.50   24.50  1.0096 0.332058   
d:n:p      1   2.00    2.00  0.0824 0.778258   
d:n:k      1  10.13   10.13  0.4172 0.528774   
d:p:k      1  15.13   15.13  0.6233 0.443007   
n:p:k      1  32.00   32.00  1.3186 0.270083   
Residuals 14 339.75   24.27                    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

If you look at the coefficients you will see that we have missing (NA) for the four factor interaction.

> dnpk.lm$coefficients
(Intercept)          d1          n1          p1          k1 rpl1:block1 
    49.3750      0.2500      3.1875     -0.4375      0.3750     -2.5000 
rpl2:block1 rpl1:block2 rpl2:block2       d1:n1       d1:p1       n1:p1 
    -5.5000     -1.7500          NA      1.0000      2.7500      1.5625 
      d1:k1       n1:k1       p1:k1    d1:n1:p1    d1:n1:k1    d1:p1:k1 
    -0.4375     -1.0000      0.8750     -0.2500     -0.5625      0.6875 
   n1:p1:k1 d1:n1:p1:k1 
     1.0000          NA 

Residuals don’t look too bad … maybe decreasing variance?

> plot(dnpk.lm,which=1)
Residual plot of dnpk data

Figure 43.5: Residual plot of dnpk data

> plot(dnpk.lm,which=2)
NPP of residuals for dnpk data

Figure 43.6: NPP of residuals for dnpk data

The ANOVA tells us that n (nitrogren) is significant along with the d:p (dung by phosphorus) interaction (but not the main effects). Of course, hierarchy says keep the main effects. Here is what the interaction looks like. It really is purely interactive.

> interactplot(d,p,yield,data=dnpk,conf=.95,sigma2=24.27,df=14)
d:p interaction plot for dnpk data

Figure 43.7: d:p interaction plot for dnpk data

43.4 Potatoes data

Data from John (1971). A \(2^3\) replicated four times and run in 8 blocks of 4. ABC, AB, AC, and BC are each confounded in one replication. Thus this design is a partial confounding design.

Factors are sulfate of ammonia, sulfate of potash, and nitrogen; response is yield of potatoes in pounds per plot.

> potatoes <- read.table("john.dat.txt",header=TRUE)
> potatoes <- within(potatoes,{a <- factor(a);b <- factor(b);c <- factor(c);block <- factor(block)})
> potatoes
   a b c block yield
1  1 1 1     1   101
2  2 1 2     1   373
3  1 2 2     1   398
4  2 2 1     1   291
5  1 1 2     2   312
6  2 1 1     2   106
7  1 2 1     2   265
8  2 2 2     2   450
9  1 1 1     3   106
10 2 2 1     3   306
11 1 1 2     3   324
12 2 2 2     3   449
13 1 2 1     4   272
14 2 1 1     4    89
15 1 2 2     4   407
16 2 1 2     4   338
17 1 1 1     5    87
18 2 1 2     5   324
19 1 2 1     5   279
20 2 2 2     5   471
21 1 1 2     6   323
22 2 1 1     6   128
23 1 2 2     6   423
24 2 2 1     6   334
25 1 1 1     7   131
26 2 1 1     7   103
27 1 2 2     7   445
28 2 2 2     7   437
29 1 1 2     8   324
30 2 1 2     8   361
31 1 2 1     8   302
32 2 2 1     8   272

The basic model is treatments after blocks. You can estimate and test all terms, because none is completely confounded.

Potash, nitrogen, and their interaction are highly significant (b, c, b:c), ammonia is fairly significant, and its interaction with nitrogen is marginally significant.

> potatoes.lm <- lm(yield~block+a*b*c,data=potatoes)
> anova(potatoes.lm)
Analysis of Variance Table

Response: yield
          Df Sum Sq Mean Sq  F value    Pr(>F)    
block      7   4499     643   2.0147  0.112834    
a          1   3465    3465  10.8624  0.004268 ** 
b          1 161170  161170 505.2090 4.404e-14 ***
c          1 278818  278818 873.9916 4.666e-16 ***
a:b        1     28      28   0.0883  0.769960    
a:c        1   1803    1803   5.6507  0.029457 *  
b:c        1  11528   11528  36.1366 1.402e-05 ***
a:b:c      1     45      45   0.1422  0.710737    
Residuals 17   5423     319                       
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Variance appears to be increasing, but 1 is well within the Box-Cox interval.

> plot(potatoes.lm,which=1)
Residual plot for potatoes data

Figure 43.8: Residual plot for potatoes data

We can see the effect of partial confounding in the standard errors of the effects. The main effects are never confounded, and they have smaller SEs. The ratio of the variances is .75, which means that each interaction is confounded in one of the four replications.

> summary(potatoes.lm)

Call:
lm(formula = yield ~ block + a * b * c, data = potatoes)

Residuals:
     Min       1Q   Median       3Q      Max 
-25.0938  -9.5469   0.5729   6.4531  28.2396 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  291.594      3.157  92.352  < 2e-16 ***
block1        -2.219      9.115  -0.243  0.81059    
block2        -6.969      9.115  -0.765  0.45501    
block3         3.573      9.115   0.392  0.69993    
block4       -14.010      9.115  -1.537  0.14267    
block5       -10.010      9.115  -1.098  0.28740    
block6        19.073      9.115   2.093  0.05170 .  
block7         9.323      9.115   1.023  0.32072    
a1           -10.406      3.157  -3.296  0.00427 ** 
b1           -70.969      3.157 -22.477 4.40e-14 ***
c1           -93.344      3.157 -29.563 4.67e-16 ***
a1:b1          1.083      3.646   0.297  0.76996    
a1:c1          8.667      3.646   2.377  0.02946 *  
b1:c1        -21.917      3.646  -6.011 1.40e-05 ***
a1:b1:c1       1.375      3.646   0.377  0.71074    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 17.86 on 17 degrees of freedom
Multiple R-squared:  0.9884,    Adjusted R-squared:  0.9788 
F-statistic: 103.3 on 14 and 17 DF,  p-value: 1.311e-13
> 3.157^2/3.646^2
[1] 0.7497489

By fooling around and only using subsets of the data, we can isolate which terms are confounded in which blocks (or we could just look at the data and think a little bit).

We can’t fit a:b:c in the first replication (blocks 1 and 2), so it is confounded there; we can fit everything else. Similarly, we can’t fit b:c in the last replication(blocks 7 and 8), so it is confounded there. And so on.

> lm(yield~block+a*b*c,data=potatoes,subset=as.numeric(block) < 3)

Call:
lm(formula = yield ~ block + a * b * c, data = potatoes, subset = as.numeric(block) < 
    3)

Coefficients:
(Intercept)       block1           a1           b1           c1        a1:b1  
     287.00         3.75       -18.00       -64.00       -96.25         1.50  
      a1:c1        b1:c1     a1:b1:c1  
      10.25       -23.25           NA  
> lm(yield~block+a*b*c,data=potatoes,subset=as.numeric(block) > 6)

Call:
lm(formula = yield ~ block + a * b * c, data = potatoes, subset = as.numeric(block) > 
    6)

Coefficients:
(Intercept)       block1           a1           b1           c1        a1:b1  
    296.875      -17.875        3.625      -67.125      -94.875       -5.875  
      a1:c1        b1:c1     a1:b1:c1  
     10.875           NA        5.375  

43.5 Three series

conf.design() can handle three series designs as well. Here we confound \(A^1B^1\).

> conf.design(c(1,1),3)
  Blocks T1 T2
1      0  0  0
2      0  2  1
3      0  1  2
4      1  1  0
5      1  0  1
6      1  2  2
7      2  2  0
8      2  1  1
9      2  0  2

Something a little bigger? Here we confound a \(3^3\) (27 treatments) into nine blocks of three with \(A^1C^2\) and \(A^1B^1\) as generators.

> big3 <- rbind(c(1,0,2),c(1,1,0))
> big3
     [,1] [,2] [,3]
[1,]    1    0    2
[2,]    1    1    0

The full set of confounded effects (3 df).

> conf.set(big3,3)
     [,1] [,2] [,3]
[1,]    1    0    2
[2,]    1    1    0
[3,]    0    1    1
[4,]    1    2    1

And the design itself.

> conf.design(big3,3)
   Blocks T1 T2 T3
1      00  0  0  0
2      00  1  2  1
3      00  2  1  2
4      01  0  1  0
5      01  1  0  1
6      01  2  2  2
7      02  0  2  0
8      02  1  1  1
9      02  2  0  2
10     10  1  2  0
11     10  2  1  1
12     10  0  0  2
13     11  1  0  0
14     11  2  2  1
15     11  0  1  2
16     12  1  1  0
17     12  2  0  1
18     12  0  2  2
19     20  2  1  0
20     20  0  0  1
21     20  1  2  2
22     21  2  2  0
23     21  0  1  1
24     21  1  0  2
25     22  2  0  0
26     22  0  2  1
27     22  1  1  2