# Here, we do a power analysis for a one-way ANOVA, assuming # that five means are spread equally over a 2 standard deviation # range. (See the Powerpoint for an explanation of why the # noncentrality parameter is 2.5n.) We want to know how large # a within-group sample size we need to attain power of .9 with # an alpha level of .05. > n <- seq(2,15,1) > dfn <- 4 > dfd <- 5*(n-1) > fcrit <- qf(.95, dfn, dfd) > lambda <- 2.5*n > power <- 1-pf(fcrit, dfn, dfd, lambda) # The following table shows that we would need 8 cases per cell: > cbind(n,power) n power [1,] 2 0.1899243 [2,] 3 0.3827862 [3,] 4 0.5645058 [4,] 5 0.7110347 [5,] 6 0.8177333 [6,] 7 0.8898304 [7,] 8 0.9357913 [8,] 9 0.9637413 [9,] 10 0.9800839 [10,] 11 0.9893256 [11,] 12 0.9944029 [12,] 13 0.9971226 [13,] 14 0.9985470 [14,] 15 0.9992781 # We introduced two-way ANOVA by expanding our math practice example. # The 24 cases we have been looking at were participants who received # no reward for practicing, but there are actually another 24 participants # who received a small reward. > math2 <- read.csv("http://faculty.ucmerced.edu/jvevea/classes/202a/data/math2.csv") > math2 Time Practice Reward 1 51 massed none 2 96 massed none 3 97 massed none 4 22 massed none 5 35 massed none 6 36 massed none 7 28 massed none 8 76 massed none 9 39 spaced none 10 104 spaced none 11 130 spaced none 12 122 spaced none 13 114 spaced none 14 92 spaced none 15 87 spaced none 16 64 spaced none 17 42 none none 18 92 none none 19 156 none none 20 144 none none 21 133 none none 22 124 none none 23 68 none none 24 142 none none 25 26 massed small 26 41 massed small 27 28 massed small 28 92 massed small 29 14 massed small 30 16 massed small 31 29 massed small 32 31 massed small 33 41 spaced small 34 26 spaced small 35 19 spaced small 36 59 spaced small 37 82 spaced small 38 86 spaced small 39 45 spaced small 40 37 spaced small 41 36 none small 42 39 none small 43 59 none small 44 27 none small 45 87 none small 46 99 none small 47 126 none small 48 104 none small # We looked first at ways to visualize the problem. In a previous # R session, we had installed the package "sciplot". In order to # use it within a session, we employ the library() command: > library(sciplot) > attach(math2) # Here's a parallel barplot: > bargraph.CI(Practice, Time, group=Reward, legend=TRUE) > abline(h=0) # Here, we improve the aspect ratio and labeling. We also use # slanted lines instead of shades of gray, as these would reproduce # in a print journal without the need for special handling: > par(pin=c(6,4)) > bargraph.CI(Practice, Time, group=Reward, legend=TRUE, + xlab="Practice Condition",ylab="Time",main="Practice and Reward Experiment", + col="black",angle=c(-45,45), density=c(15,25)); abline(h=0) # The choice of which factor is shown across the X-axis and which is # shown with separate bars can change our impression of what's happening: > bargraph.CI(Reward, Time, group=Practice, legend=TRUE) # (Usually, you will want to have the factor with more levels be the # one that is shown across the X-axis, and the factor with fewer levels # be shown with separate bars.) # Sciplot can do interaction plots: > interaction.plot(Practice, Reward, Time, ylim=c(25,135),type='b') # Usually, though, you can do a better interaction plot manually. Here, # we illustrate that process, using the Eysenck two-way ANOVA example: > Eysenck2 <- read.csv("http://faculty.ucmerced.edu/jvevea/classes/202a/data/Eysenck2.csv") > Eysenck2 Words Group Age 1 9 Counting Old 2 8 Counting Old 3 6 Counting Old 4 8 Counting Old 5 10 Counting Old 6 4 Counting Old 7 6 Counting Old 8 5 Counting Old 9 7 Counting Old 10 7 Counting Old 11 7 Rhyming Old 12 9 Rhyming Old 13 6 Rhyming Old 14 6 Rhyming Old 15 6 Rhyming Old 16 11 Rhyming Old 17 6 Rhyming Old 18 3 Rhyming Old 19 8 Rhyming Old 20 7 Rhyming Old 21 11 Adjective Old 22 13 Adjective Old 23 8 Adjective Old 24 6 Adjective Old 25 14 Adjective Old 26 11 Adjective Old 27 13 Adjective Old 28 13 Adjective Old 29 10 Adjective Old 30 11 Adjective Old 31 12 Imagery Old 32 11 Imagery Old 33 16 Imagery Old 34 11 Imagery Old 35 9 Imagery Old 36 23 Imagery Old 37 12 Imagery Old 38 10 Imagery Old 39 19 Imagery Old 40 11 Imagery Old 41 10 Intentional Old 42 19 Intentional Old 43 14 Intentional Old 44 5 Intentional Old 45 10 Intentional Old 46 11 Intentional Old 47 14 Intentional Old 48 15 Intentional Old 49 11 Intentional Old 50 11 Intentional Old 51 8 Counting Young 52 6 Counting Young 53 4 Counting Young 54 6 Counting Young 55 7 Counting Young 56 6 Counting Young 57 5 Counting Young 58 7 Counting Young 59 9 Counting Young 60 7 Counting Young 61 10 Rhyming Young 62 7 Rhyming Young 63 8 Rhyming Young 64 10 Rhyming Young 65 4 Rhyming Young 66 7 Rhyming Young 67 10 Rhyming Young 68 6 Rhyming Young 69 7 Rhyming Young 70 7 Rhyming Young 71 14 Adjective Young 72 11 Adjective Young 73 18 Adjective Young 74 14 Adjective Young 75 13 Adjective Young 76 22 Adjective Young 77 17 Adjective Young 78 16 Adjective Young 79 12 Adjective Young 80 11 Adjective Young 81 20 Imagery Young 82 16 Imagery Young 83 16 Imagery Young 84 15 Imagery Young 85 18 Imagery Young 86 16 Imagery Young 87 20 Imagery Young 88 22 Imagery Young 89 14 Imagery Young 90 19 Imagery Young 91 21 Intentional Young 92 19 Intentional Young 93 17 Intentional Young 94 15 Intentional Young 95 22 Intentional Young 96 16 Intentional Young 97 22 Intentional Young 98 22 Intentional Young 99 18 Intentional Young 100 21 Intentional Young # We lay out a blank plot: > plot(c(0.5,6.5),c(3,22), type='n', xlab="Strategy",ylab="Words Recalled",axes=FALSE) > box() # Add axis labels: > axis(side=1,at=1:5,lab=c("Count","Rhyme","Adject","Image","Intent")) > axis(side=2,at=seq(5,20,5)) # Add a curve for each set of means: > lines(1:5, c(7,6.9,11,13.4,12), type='b') > lines(1:5, c(6.5,7.6,14.8,17.6,19.3), type='b') # Label the two curves: > text(5.3, 12, "Old", adj=0) > text(5.3, 19.3, "Young", adj=0) # How do we actually do a two-way ANOVA in R? We use a model # statement like this: > lm(Time~Practice+Reward+Practice*Reward) Call: lm(formula = Time ~ Practice + Reward + Practice * Reward) Coefficients: (Intercept) Practice massed 112.625 -57.500 Practice spaced Reward small -18.625 -40.500 Practice massed:Reward small Practice spaced:Reward small 20.000 -4.125 # But we aren't really interested in the estimates associated # with a parameterization of the problem; rather, we want F # tests for the two main effects and the interaction. The anova() # wrapper gives us those results: > anova(lm(Time~Practice+Reward+Practice*Reward)) Analysis of Variance Table Response: Time Df Sum Sq Mean Sq F value Pr(>F) Practice 2 18150 9075.0 8.9331 0.0005854 *** Reward 1 14876 14875.5 14.6428 0.0004254 *** Practice:Reward 2 1332 666.0 0.6556 0.5243595 Residuals 42 42667 1015.9 --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 # Here's the ANOVA for the two-way Eysenck example: > attach(Eysenck2) > anova(lm(Words~Group+Age+Group*Age)) Analysis of Variance Table Response: Words Df Sum Sq Mean Sq F value Pr(>F) Group 4 1514.94 378.74 47.1911 < 2.2e-16 *** Age 1 240.25 240.25 29.9356 3.981e-07 *** Group:Age 4 190.30 47.57 5.9279 0.0002793 *** Residuals 90 722.30 8.03 --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 # Now, let's think about where the breakdown of sums of squares # comes from. Here's the math practice example again: > anova(lm(Time~Practice+Reward+Practice*Reward)) Analysis of Variance Table Response: Time Df Sum Sq Mean Sq F value Pr(>F) Practice 2 18150 9075.0 8.9331 0.0005854 *** Reward 1 14876 14875.5 14.6428 0.0004254 *** Practice:Reward 2 1332 666.0 0.6556 0.5243595 Residuals 42 42667 1015.9 --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 # The sum of squares for the Reward effect is the same as # the sum of squares for a one-way ANOVA on the factor, ignoring # the fact that each Reward condition has three Practice groups: > anova(lm(Time~Reward)) Analysis of Variance Table Response: Time Df Sum Sq Mean Sq F value Pr(>F) Reward 1 14876 14875.5 11.01 0.001777 ** Residuals 46 62149 1351.1 --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 # Similarly, the sum of squares for the Practice main effect # is the same as what we would get if we simply did a one-way # ANOVA on Practice: > anova(lm(Time~Practice)) Analysis of Variance Table Response: Time Df Sum Sq Mean Sq F value Pr(>F) Practice 2 18150 9075.0 6.9363 0.002367 ** Residuals 45 58875 1308.3 --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 # But what about the interaction? Here, I set up a "Junk" # variable that simply identifies the 6 cells of the design: > Junk <- c(rep('a',8),rep('b',8),rep('c',8),rep('d',8),rep('e',8),rep('f',8)) > cbind(Practice,Reward,Junk) Practice Reward Junk [1,] "2" "1" "a" [2,] "2" "1" "a" [3,] "2" "1" "a" [4,] "2" "1" "a" [5,] "2" "1" "a" [6,] "2" "1" "a" [7,] "2" "1" "a" [8,] "2" "1" "a" [9,] "3" "1" "b" [10,] "3" "1" "b" [11,] "3" "1" "b" [12,] "3" "1" "b" [13,] "3" "1" "b" [14,] "3" "1" "b" [15,] "3" "1" "b" [16,] "3" "1" "b" [17,] "1" "1" "c" [18,] "1" "1" "c" [19,] "1" "1" "c" [20,] "1" "1" "c" [21,] "1" "1" "c" [22,] "1" "1" "c" [23,] "1" "1" "c" [24,] "1" "1" "c" [25,] "2" "2" "d" [26,] "2" "2" "d" [27,] "2" "2" "d" [28,] "2" "2" "d" [29,] "2" "2" "d" [30,] "2" "2" "d" [31,] "2" "2" "d" [32,] "2" "2" "d" [33,] "3" "2" "e" [34,] "3" "2" "e" [35,] "3" "2" "e" [36,] "3" "2" "e" [37,] "3" "2" "e" [38,] "3" "2" "e" [39,] "3" "2" "e" [40,] "3" "2" "e" [41,] "1" "2" "f" [42,] "1" "2" "f" [43,] "1" "2" "f" [44,] "1" "2" "f" [45,] "1" "2" "f" [46,] "1" "2" "f" [47,] "1" "2" "f" [48,] "1" "2" "f" # Here's the ANOVA again: > anova(lm(Time~Practice+Reward+Practice*Reward)) Analysis of Variance Table Response: Time Df Sum Sq Mean Sq F value Pr(>F) Practice 2 18150 9075.0 8.9331 0.0005854 *** Reward 1 14876 14875.5 14.6428 0.0004254 *** Practice:Reward 2 1332 666.0 0.6556 0.5243595 Residuals 42 42667 1015.9 --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 # If I do a one-way ANOVA on "Junk" I get the same residual # sum of squares, but nothing else in the ANOVA table matches: > anova(lm(Time~Junk)) Analysis of Variance Table Response: Time Df Sum Sq Mean Sq F value Pr(>F) Junk 5 34358 6871.5 6.764 0.0001048 *** Residuals 42 42667 1015.9 --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 # That's because variation across the groups is partly attributable # to the main effecs of Practice and Reward. If we take the model # sum of squares from the Junk ANOVA and subtract the sums of squares # for the two main effect, we are left with the interaction sum of squares: > 34358 - 18150 - 14876 [1] 1332 # Here, we set up a dummy coding system for the Practice variable: > d1 <- c(rep(1,8),rep(0,16),rep(1,8),rep(0,16)) > d2 <- c(rep(0,8),rep(1,8),rep(0,8),rep(0,8),rep(1,8),rep(0,8)) # Here's a dummy for the Reward variable: > rdummy <- c(rep(1,24),rep(0,24)) # The interaction is captured by the products of those two coding # systems: > i1 <- d1*rdummy > i2 <- d2*rdummy # Note that we get the same residual sum of squares we had before, # so we must be doing the same ANOVA. Next time, we'll see how to # separate this into the three tests that interest us: > anova(lm(Time~d1+d2+rdummy+i1+i2)) Analysis of Variance Table Response: Time Df Sum Sq Mean Sq F value Pr(>F) d1 1 14726 14726.3 14.4959 0.0004507 *** d2 1 3424 3423.8 3.3702 0.0734709 . rdummy 1 14876 14875.5 14.6428 0.0004254 *** i1 1 1298 1298.0 1.2777 0.2647396 i2 1 34 34.0 0.0335 0.8556574 Residuals 42 42667 1015.9 --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 >