# Here, we do a power analysis for a one-way ANOVA, assuming
# that five means are spread equally over a 2 standard deviation
# range. (See the Powerpoint for an explanation of why the
# noncentrality parameter is 2.5n.) We want to know how large
# a within-group sample size we need to attain power of .9 with
# an alpha level of .05.

> n <- seq(2,15,1)
> dfn <- 4
> dfd <- 5*(n-1)
> fcrit <- qf(.95, dfn, dfd)
> lambda <- 2.5*n
> power <- 1-pf(fcrit, dfn, dfd, lambda)

# The following table shows that we would need 8 cases per cell:

> cbind(n,power)
       n     power
 [1,]  2 0.1899243
 [2,]  3 0.3827862
 [3,]  4 0.5645058
 [4,]  5 0.7110347
 [5,]  6 0.8177333
 [6,]  7 0.8898304
 [7,]  8 0.9357913
 [8,]  9 0.9637413
 [9,] 10 0.9800839
[10,] 11 0.9893256
[11,] 12 0.9944029
[12,] 13 0.9971226
[13,] 14 0.9985470
[14,] 15 0.9992781

# We introduced two-way ANOVA by expanding our math practice example.
# The 24 cases we have been looking at were participants who received
# no reward for practicing, but there are actually another 24 participants
# who received a small reward.

> math2 <- read.csv("http://faculty.ucmerced.edu/jvevea/classes/202a/data/math2.csv")
> math2
   Time Practice Reward
1    51   massed   none
2    96   massed   none
3    97   massed   none
4    22   massed   none
5    35   massed   none
6    36   massed   none
7    28   massed   none
8    76   massed   none
9    39   spaced   none
10  104   spaced   none
11  130   spaced   none
12  122   spaced   none
13  114   spaced   none
14   92   spaced   none
15   87   spaced   none
16   64   spaced   none
17   42     none   none
18   92     none   none
19  156     none   none
20  144     none   none
21  133     none   none
22  124     none   none
23   68     none   none
24  142     none   none
25   26   massed  small
26   41   massed  small
27   28   massed  small
28   92   massed  small
29   14   massed  small
30   16   massed  small
31   29   massed  small
32   31   massed  small
33   41   spaced  small
34   26   spaced  small
35   19   spaced  small
36   59   spaced  small
37   82   spaced  small
38   86   spaced  small
39   45   spaced  small
40   37   spaced  small
41   36     none  small
42   39     none  small
43   59     none  small
44   27     none  small
45   87     none  small
46   99     none  small
47  126     none  small
48  104     none  small

# We looked first at ways to visualize the problem. In a previous
# R session, we had installed the package "sciplot". In order to
# use it within a session, we employ the library() command:

> library(sciplot)
> attach(math2)

# Here's a parallel barplot:

> bargraph.CI(Practice, Time, group=Reward, legend=TRUE)
> abline(h=0)

# Here, we improve the aspect ratio and labeling. We also use
# slanted lines instead of shades of gray, as these would reproduce
# in a print journal without the need for special handling:

> par(pin=c(6,4))
> bargraph.CI(Practice, Time, group=Reward, legend=TRUE,
+  xlab="Practice Condition",ylab="Time",main="Practice and Reward Experiment",
+  col="black",angle=c(-45,45), density=c(15,25)); abline(h=0)

# The choice of which factor is shown across the X-axis and which is
# shown with separate bars can change our impression of what's happening:

> bargraph.CI(Reward, Time, group=Practice, legend=TRUE)

# (Usually, you will want to have the factor with more levels be the
# one that is shown across the X-axis, and the factor with fewer levels
# be shown with separate bars.)

# Sciplot can do interaction plots:

> interaction.plot(Practice, Reward, Time, ylim=c(25,135),type='b')

# Usually, though, you can do a better interaction plot manually. Here,
# we illustrate that process, using the Eysenck two-way ANOVA example:

> Eysenck2 <- read.csv("http://faculty.ucmerced.edu/jvevea/classes/202a/data/Eysenck2.csv")
> Eysenck2
    Words        Group    Age
1       9     Counting    Old
2       8     Counting    Old
3       6     Counting    Old
4       8     Counting    Old
5      10     Counting    Old
6       4     Counting    Old
7       6     Counting    Old
8       5     Counting    Old
9       7     Counting    Old
10      7     Counting    Old
11      7      Rhyming    Old
12      9      Rhyming    Old
13      6      Rhyming    Old
14      6      Rhyming    Old
15      6      Rhyming    Old
16     11      Rhyming    Old
17      6      Rhyming    Old
18      3      Rhyming    Old
19      8      Rhyming    Old
20      7      Rhyming    Old
21     11    Adjective    Old
22     13    Adjective    Old
23      8    Adjective    Old
24      6    Adjective    Old
25     14    Adjective    Old
26     11    Adjective    Old
27     13    Adjective    Old
28     13    Adjective    Old
29     10    Adjective    Old
30     11    Adjective    Old
31     12      Imagery    Old
32     11      Imagery    Old
33     16      Imagery    Old
34     11      Imagery    Old
35      9      Imagery    Old
36     23      Imagery    Old
37     12      Imagery    Old
38     10      Imagery    Old
39     19      Imagery    Old
40     11      Imagery    Old
41     10  Intentional    Old
42     19  Intentional    Old
43     14  Intentional    Old
44      5  Intentional    Old
45     10  Intentional    Old
46     11  Intentional    Old
47     14  Intentional    Old
48     15  Intentional    Old
49     11  Intentional    Old
50     11  Intentional    Old
51      8     Counting  Young
52      6     Counting  Young
53      4     Counting  Young
54      6     Counting  Young
55      7     Counting  Young
56      6     Counting  Young
57      5     Counting  Young
58      7     Counting  Young
59      9     Counting  Young
60      7     Counting  Young
61     10      Rhyming  Young
62      7      Rhyming  Young
63      8      Rhyming  Young
64     10      Rhyming  Young
65      4      Rhyming  Young
66      7      Rhyming  Young
67     10      Rhyming  Young
68      6      Rhyming  Young
69      7      Rhyming  Young
70      7      Rhyming  Young
71     14    Adjective  Young
72     11    Adjective  Young
73     18    Adjective  Young
74     14    Adjective  Young
75     13    Adjective  Young
76     22    Adjective  Young
77     17    Adjective  Young
78     16    Adjective  Young
79     12    Adjective  Young
80     11    Adjective  Young
81     20      Imagery  Young
82     16      Imagery  Young
83     16      Imagery  Young
84     15      Imagery  Young
85     18      Imagery  Young
86     16      Imagery  Young
87     20      Imagery  Young
88     22      Imagery  Young
89     14      Imagery  Young
90     19      Imagery  Young
91     21  Intentional  Young
92     19  Intentional  Young
93     17  Intentional  Young
94     15  Intentional  Young
95     22  Intentional  Young
96     16  Intentional  Young
97     22  Intentional  Young
98     22  Intentional  Young
99     18  Intentional  Young
100    21  Intentional  Young

# We lay out a blank plot:

> plot(c(0.5,6.5),c(3,22), type='n', xlab="Strategy",ylab="Words Recalled",axes=FALSE)
> box()

# Add axis labels:

> axis(side=1,at=1:5,lab=c("Count","Rhyme","Adject","Image","Intent"))
> axis(side=2,at=seq(5,20,5))

# Add a curve for each set of means:

> lines(1:5, c(7,6.9,11,13.4,12), type='b')
> lines(1:5, c(6.5,7.6,14.8,17.6,19.3), type='b')

# Label the two curves:

> text(5.3, 12, "Old", adj=0)
> text(5.3, 19.3, "Young", adj=0)
 
# How do we actually do a two-way ANOVA in R? We use a model
# statement like this:

> lm(Time~Practice+Reward+Practice*Reward)

Call:
lm(formula = Time ~ Practice + Reward + Practice * Reward)

Coefficients:
                  (Intercept)               Practice  massed  
                      112.625                        -57.500  
             Practice  spaced                   Reward small  
                      -18.625                        -40.500  
Practice  massed:Reward small  Practice  spaced:Reward small  
                       20.000                         -4.125  

# But we aren't really interested in the estimates associated
# with a parameterization of the problem; rather, we want F
# tests for the two main effects and the interaction. The anova()
# wrapper gives us those results:

> anova(lm(Time~Practice+Reward+Practice*Reward))
Analysis of Variance Table

Response: Time
                Df Sum Sq Mean Sq F value    Pr(>F)    
Practice         2  18150  9075.0  8.9331 0.0005854 ***
Reward           1  14876 14875.5 14.6428 0.0004254 ***
Practice:Reward  2   1332   666.0  0.6556 0.5243595    
Residuals       42  42667  1015.9                      
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1


# Here's the ANOVA for the two-way Eysenck example:

> attach(Eysenck2)
> anova(lm(Words~Group+Age+Group*Age))

Analysis of Variance Table

Response: Words
          Df  Sum Sq Mean Sq F value    Pr(>F)    
Group      4 1514.94  378.74 47.1911 < 2.2e-16 ***
Age        1  240.25  240.25 29.9356 3.981e-07 ***
Group:Age  4  190.30   47.57  5.9279 0.0002793 ***
Residuals 90  722.30    8.03                      
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

# Now, let's think about where the breakdown of sums of squares
# comes from. Here's the math practice example again:

> anova(lm(Time~Practice+Reward+Practice*Reward))
Analysis of Variance Table

Response: Time
                Df Sum Sq Mean Sq F value    Pr(>F)    
Practice         2  18150  9075.0  8.9331 0.0005854 ***
Reward           1  14876 14875.5 14.6428 0.0004254 ***
Practice:Reward  2   1332   666.0  0.6556 0.5243595    
Residuals       42  42667  1015.9                      
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

# The sum of squares for the Reward effect is the same as
# the sum of squares for a one-way ANOVA on the factor, ignoring
# the fact that each Reward condition has three Practice groups:

> anova(lm(Time~Reward))
Analysis of Variance Table

Response: Time
          Df Sum Sq Mean Sq F value   Pr(>F)   
Reward     1  14876 14875.5   11.01 0.001777 **
Residuals 46  62149  1351.1                    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

# Similarly, the sum of squares for the Practice main effect
# is the same as what we would get if we simply did a one-way
# ANOVA on Practice:

> anova(lm(Time~Practice))
Analysis of Variance Table

Response: Time
          Df Sum Sq Mean Sq F value   Pr(>F)   
Practice   2  18150  9075.0  6.9363 0.002367 **
Residuals 45  58875  1308.3                    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

# But what about the interaction? Here, I set up a "Junk"
# variable that simply identifies the 6 cells of the design:

> Junk <- c(rep('a',8),rep('b',8),rep('c',8),rep('d',8),rep('e',8),rep('f',8))
> cbind(Practice,Reward,Junk)
      Practice Reward Junk
 [1,] "2"      "1"    "a" 
 [2,] "2"      "1"    "a" 
 [3,] "2"      "1"    "a" 
 [4,] "2"      "1"    "a" 
 [5,] "2"      "1"    "a" 
 [6,] "2"      "1"    "a" 
 [7,] "2"      "1"    "a" 
 [8,] "2"      "1"    "a" 
 [9,] "3"      "1"    "b" 
[10,] "3"      "1"    "b" 
[11,] "3"      "1"    "b" 
[12,] "3"      "1"    "b" 
[13,] "3"      "1"    "b" 
[14,] "3"      "1"    "b" 
[15,] "3"      "1"    "b" 
[16,] "3"      "1"    "b" 
[17,] "1"      "1"    "c" 
[18,] "1"      "1"    "c" 
[19,] "1"      "1"    "c" 
[20,] "1"      "1"    "c" 
[21,] "1"      "1"    "c" 
[22,] "1"      "1"    "c" 
[23,] "1"      "1"    "c" 
[24,] "1"      "1"    "c" 
[25,] "2"      "2"    "d" 
[26,] "2"      "2"    "d" 
[27,] "2"      "2"    "d" 
[28,] "2"      "2"    "d" 
[29,] "2"      "2"    "d" 
[30,] "2"      "2"    "d" 
[31,] "2"      "2"    "d" 
[32,] "2"      "2"    "d" 
[33,] "3"      "2"    "e" 
[34,] "3"      "2"    "e" 
[35,] "3"      "2"    "e" 
[36,] "3"      "2"    "e" 
[37,] "3"      "2"    "e" 
[38,] "3"      "2"    "e" 
[39,] "3"      "2"    "e" 
[40,] "3"      "2"    "e" 
[41,] "1"      "2"    "f" 
[42,] "1"      "2"    "f" 
[43,] "1"      "2"    "f" 
[44,] "1"      "2"    "f" 
[45,] "1"      "2"    "f" 
[46,] "1"      "2"    "f" 
[47,] "1"      "2"    "f" 
[48,] "1"      "2"    "f" 

# Here's the ANOVA again:

> anova(lm(Time~Practice+Reward+Practice*Reward))
Analysis of Variance Table

Response: Time
                Df Sum Sq Mean Sq F value    Pr(>F)    
Practice         2  18150  9075.0  8.9331 0.0005854 ***
Reward           1  14876 14875.5 14.6428 0.0004254 ***
Practice:Reward  2   1332   666.0  0.6556 0.5243595    
Residuals       42  42667  1015.9                      
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

# If I do a one-way ANOVA on "Junk" I get the same residual
# sum of squares, but nothing else in the ANOVA table matches:

> anova(lm(Time~Junk))
Analysis of Variance Table

Response: Time
          Df Sum Sq Mean Sq F value    Pr(>F)    
Junk       5  34358  6871.5   6.764 0.0001048 ***
Residuals 42  42667  1015.9                      
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

# That's because variation across the groups is partly attributable
# to the main effecs of Practice and Reward. If we take the model
# sum of squares from the Junk ANOVA and subtract the sums of squares
# for the two main effect, we are left with the interaction sum of squares:

> 34358 - 18150 - 14876
[1] 1332
 
 
# Here, we set up a dummy coding system for the Practice variable:

> d1 <- c(rep(1,8),rep(0,16),rep(1,8),rep(0,16))
> d2 <- c(rep(0,8),rep(1,8),rep(0,8),rep(0,8),rep(1,8),rep(0,8))

# Here's a dummy for the Reward variable:

> rdummy <- c(rep(1,24),rep(0,24))

# The interaction is captured by the products of those two coding
# systems:

> i1 <- d1*rdummy
> i2 <- d2*rdummy

# Note that we get the same residual sum of squares we had before,
# so we must be doing the same ANOVA. Next time, we'll see how to
# separate this into the three tests that interest us:

> anova(lm(Time~d1+d2+rdummy+i1+i2))
Analysis of Variance Table

Response: Time
          Df Sum Sq Mean Sq F value    Pr(>F)    
d1         1  14726 14726.3 14.4959 0.0004507 ***
d2         1   3424  3423.8  3.3702 0.0734709 .  
rdummy     1  14876 14875.5 14.6428 0.0004254 ***
i1         1   1298  1298.0  1.2777 0.2647396    
i2         1     34    34.0  0.0335 0.8556574    
Residuals 42  42667  1015.9                      
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>