# We illustrated analysis of variance (ANOVA) using simulated data from
# an experiment by Eysenck. The idea here is that memory is affected by
# level of processing: The more you process something in your brain, the
# more likely you are to remember it.
#
# In this data set, "Score" represents the count of words remembered by
# participants in each of five groups. The first group (counting) were
# told to count the letters in each word they saw. This represents a minimal
# level of processing, so the expectation is that when the participants
# are later asked to write down the words they remember, they won't recall
# very many. The next group (rhyming) were told to think of a word that
# rhymes with each word they see. That represents a somewhat higher level
# of processing, so the expectation is that recall should be a bit better.
# The next group (adjective) were instructed to think of an adjective that
# would appropriately modify the word they see. The highest level of processing
# is the imagery group; they were told to construct a mental image of each
# word. The people in those four groups weren't told until after they had
# seen the words that they would be asked to remember them. The fifth group
# (intent) had no instructions except that they were told to try to remember
# the words.
# Here are the data:

> Eysenck <- read.csv("http://faculty.ucmerced.edu/jvevea/classes/105/data/Eysenck.csv")
> Eysenck
   Score     Group
1      9  counting
2      8  counting
3      6  counting
4      8  counting
5     10  counting
6      4  counting
7      6  counting
8      5  counting
9      7  counting
10     7  counting
11     7   rhyming
12     9   rhyming
13     6   rhyming
14     6   rhyming
15     6   rhyming
16    11   rhyming
17     6   rhyming
18     3   rhyming
19     8   rhyming
20     7   rhyming
21    11 adjective
22    13 adjective
23     8 adjective
24     6 adjective
25    14 adjective
26    11 adjective
27    13 adjective
28    13 adjective
29    10 adjective
30    11 adjective
31    12   imagery
32    11   imagery
33    16   imagery
34    11   imagery
35     9   imagery
36    23   imagery
37    12   imagery
38    10   imagery
39    19   imagery
40    11   imagery
41    10    intent
42    19    intent
43    14    intent
44     5    intent
45    10    intent
46    11    intent
47    14    intent
48    15    intent
49    11    intent
50    11    intent
> attach(Eysenck)

# The Powerpoint for today mentions two different ways we can
# use this information to estimate the population variance. The
# first way, analogous to the pooled variance in the t test,
# simply calculates a weighted average of the variances in the
# five groups. Here are the variances:

> tapply(Score, Group, var)
adjective  counting   imagery    intent   rhyming 
 6.222222  3.333333 20.266667 14.000000  4.544444 

# In this example, n=10 in all five groups, so the weighted
# average is the same as the simple average:

> mean(tapply(Score, Group, var))
[1] 9.673333

# That variance estimate, known as the within-groups mean square,
# makes sense even if the null hypothesis (that all five population
# means are equal) is false.

> mean(tapply(Score, Group, var)) -> MSw
> MSw
[1] 9.673333

# The second way to estimate variability is to use an algebraic
# manipulation of what the central limit theorem says about the
# variability of sample means. (See Powerpoint slides 5 and 6 from
# today.)
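As a check on the arithmetic, the within-groups mean square can be rebuilt from scratch. This is a minimal self-contained sketch using the scores transcribed above (stored under fresh names, `score` and `group`, so it doesn't mask the attached Eysenck columns): it computes MSw once as the average of the five group variances and once as the pooled sum of squared deviations over the pooled degrees of freedom.

```r
# The 50 scores from the listing above, in group order
score <- c(9, 8, 6, 8, 10, 4, 6, 5, 7, 7,          # counting
           7, 9, 6, 6, 6, 11, 6, 3, 8, 7,          # rhyming
           11, 13, 8, 6, 14, 11, 13, 13, 10, 11,   # adjective
           12, 11, 16, 11, 9, 23, 12, 10, 19, 11,  # imagery
           10, 19, 14, 5, 10, 11, 14, 15, 11, 11)  # intent
group <- rep(c("counting", "rhyming", "adjective", "imagery", "intent"),
             each = 10)

# Way 1: average of the five group variances (equal n, so no weighting needed)
MSw1 <- mean(tapply(score, group, var))

# Way 2: pooled sum of squared deviations over pooled df, 50 - 5 = 45
ss   <- tapply(score, group, function(x) sum((x - mean(x))^2))
MSw2 <- sum(ss) / (length(score) - 5)

round(c(MSw1, MSw2), 6)   # both 9.673333, matching the transcript
```

The second route also exposes the pieces of the ANOVA table: `sum(ss)` is the Residuals sum of squares (435.30) and 45 is the Residuals df.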
# This way of estimating the variance is a reasonable approach
# if the null hypothesis is true, but will be too big if the population
# means are NOT all the same:

> 10 * var(tapply(Score, Group, mean)) -> MSb
> MSb
[1] 87.88

# The estimate that depends on the null hypothesis is more than nine
# times as big as the pooled estimate:

> MSb/MSw
[1] 9.084769

# Given that some assumptions are met and the null hypothesis is
# true, that statistic follows an F distribution. The degrees of
# freedom for the denominator equal (n-1) for each group times
# the number of groups:

> 5*9
[1] 45

# The df for the numerator is the number of groups minus 1 (so 4
# in this example, because there are five groups).

# We calculate the p-value for this F statistic and find that there
# is significant evidence that the null hypothesis is false: it is
# not the case that all five population means are equal.

> F <- MSb/MSw
> F
[1] 9.084769
> 1 - pf(F, 4, 45)
[1] 1.81549e-05

# Here are the sample means from this analysis:

> tapply(Score, Group, mean)
adjective  counting   imagery    intent   rhyming 
     11.0       7.0      13.4      12.0       6.9 

# Here's one way to visually compare the groups:

> boxplot(Score~Group)

# And here's the easy way to get the ANOVA in R. Note
# that the mean squares, F statistic, and p-value match
# what we calculated the hard way:

> anova(lm(Score~Group))
Analysis of Variance Table

Response: Score
          Df Sum Sq Mean Sq F value    Pr(>F)    
Group      4 351.52  87.880  9.0848 1.815e-05 ***
Residuals 45 435.30   9.673                      
---
Signif.
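The whole F test can likewise be rebuilt from its pieces. This sketch starts from the group means and MSw reported above, forms MSb as n times the variance of the means, takes the ratio, and gets the p-value from the upper tail of F(4, 45). (Using `lower.tail = FALSE` is numerically equivalent to `1 - pf(...)` here, and safer when p-values get very small.)

```r
# Group means and MSw as printed in the transcript
means <- c(adjective = 11.0, counting = 7.0, imagery = 13.4,
           intent = 12.0, rhyming = 6.9)
MSw <- 9.673333

# Between-groups mean square: n * variance of the sample means
MSb <- 10 * var(means)                               # 87.88

# The test statistic and its p-value on 4 and 45 df
F <- MSb / MSw                                       # 9.084769
p <- pf(F, df1 = 4, df2 = 45, lower.tail = FALSE)    # 1.81549e-05

round(c(MSb = MSb, F = F, p = p), 6)
```

Multiplying `var(means)` by one less than the number of groups (4) instead of by n gives the between-groups Sum Sq entry divided by 10; equivalently, `10 * 4 * var(means)` reproduces the 351.52 in the table.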
codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

# We noted that an ANOVA can be calculated with only two
# groups, in which case it is exactly equivalent to the
# t test:

> attach(JackStatlab)
> t.test(CTPEA~CBSEX, var.equal=TRUE)

	Two Sample t-test

data:  CTPEA by CBSEX
t = -2.8494, df = 48, p-value = 0.006434
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -17.373374  -2.998421
sample estimates:
mean in group 0 mean in group 1 
       76.73077        86.91667 

> anova(lm(CTPEA~CBSEX))
Analysis of Variance Table

Response: CTPEA
          Df Sum Sq Mean Sq F value   Pr(>F)   
CBSEX      1 1294.8 1294.83  8.1192 0.006434 **
Residuals 48 7654.9  159.48                    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

# Note that the p-values for those two analyses are
# identical. In fact, the F with 1 df in the numerator
# is just the square of the t (the discrepancy is just
# rounding error). Note the parentheses below: without
# them, R's precedence rules would compute -(2.8494^2)
# and give a negative answer.

> (-2.8494)^2
[1] 8.11908

# Returning to the Eysenck experiment, we consider the
# assumptions of the ANOVA. (See slide 7 from today for
# a list of the assumptions.)

# We can assess the assumption of equal variances...

> tapply(Score, Group, var)
adjective  counting   imagery    intent   rhyming 
 6.222222  3.333333 20.266667 14.000000  4.544444 

# ...by comparing the standard deviations. (Remember, the
# squared metric of the variances exaggerates the differences.)
# The sd for the imagery group appears large compared to the
# others:

> tapply(Score, Group, sd)
adjective  counting   imagery    intent   rhyming 
 2.494438  1.825742  4.501851  3.741657  2.131770 

# We can assess whether these are differences we should take
# seriously by simulating draws of n=10 from a normal distribution
# with sd equal to the square root of the MSw. Here's the ANOVA
# again:

> anova(lm(Score~Group))
Analysis of Variance Table

Response: Score
          Df Sum Sq Mean Sq F value    Pr(>F)    
Group      4 351.52  87.880  9.0848 1.815e-05 ***
Residuals 45 435.30   9.673                      
---
Signif.
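The t² = F identity holds for any two-group data set, not just this one. Here is a self-contained check using made-up data (an arbitrary seed, not the JackStatlab file): the pooled-variance t statistic, squared, equals the one-way ANOVA F exactly.

```r
# Two groups of n = 10 from normals with different means; any data would do
set.seed(1)
y <- c(rnorm(10, mean = 0), rnorm(10, mean = 1))
g <- factor(rep(0:1, each = 10))

# Pooled t (var.equal = TRUE matches the equal-variance ANOVA model)
t.stat <- t.test(y ~ g, var.equal = TRUE)$statistic

# F from the equivalent one-way ANOVA
F.stat <- anova(lm(y ~ g))[1, "F value"]

c(t.squared = unname(t.stat)^2, F = F.stat)   # the two values agree
```

This is why the transcript's two p-values (0.006434 and 0.006434) match to every printed digit: they are the same test.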
codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

# So a pooled estimate of the sd would be

> sqrt(9.673)
[1] 3.110145

# If I simulate several draws from a normal distribution
# with that standard deviation, I do get estimates that
# are higher than 4.5 (the largest standard deviation in
# the experiment), but I don't get any as small as the
# smallest, so I might be mildly concerned about the
# assumption of equal variances in the population:

> sd( rnorm(10, 0, 3.11) )
[1] 3.464444
> sd( rnorm(10, 0, 3.11) )
[1] 2.201169
> sd( rnorm(10, 0, 3.11) )
[1] 2.257251
> sd( rnorm(10, 0, 3.11) )
[1] 2.993864
> sd( rnorm(10, 0, 3.11) )
[1] 2.564385
> sd( rnorm(10, 0, 3.11) )
[1] 4.378201
> sd( rnorm(10, 0, 3.11) )
[1] 3.046242
> sd( rnorm(10, 0, 3.11) )
[1] 3.342231
> sd( rnorm(10, 0, 3.11) )
[1] 5.678103
> sd( rnorm(10, 0, 3.11) )
[1] 3.417077
> sd( rnorm(10, 0, 3.11) )
[1] 2.866137
> sd( rnorm(10, 0, 3.11) )
[1] 3.097997
> sd( rnorm(10, 0, 3.11) )
[1] 2.743865

# Here, we assess the normality of each group (inasmuch as
# it is possible to do so with only 10 observations).
# Things look pretty good.

> qqnorm(Score[1:10]); qqline(Score[1:10])
> qqnorm(Score[11:20]); qqline(Score[11:20])
> qqnorm(Score[21:30]); qqline(Score[21:30])
> qqnorm(Score[31:40]); qqline(Score[31:40])
> qqnorm(Score[41:50]); qqline(Score[41:50])
>
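Repeating `sd(rnorm(10, 0, 3.11))` by hand gets tedious; `replicate()` does the same spot-check in bulk. This sketch (with an arbitrary seed for reproducibility; the simulated values are random, so no fixed output is shown) draws 1000 such standard deviations and asks how often they fall outside the range of the observed group sds, roughly 1.83 to 4.50.

```r
set.seed(105)   # arbitrary seed so the run is reproducible

# 1000 sds of samples of n = 10 from Normal(0, 3.11)
sim.sd <- replicate(1000, sd(rnorm(10, mean = 0, sd = 3.11)))

# Sampling distribution of the sd under equal population variances
quantile(sim.sd, c(0.025, 0.975))

# Proportions more extreme than the smallest and largest observed sds
mean(sim.sd < 1.825742)   # below the counting group's sd
mean(sim.sd > 4.501851)   # above the imagery group's sd
```

If simulated sds below 1.83 are rare, that echoes the mild concern above: the smallest observed sd is harder to reconcile with a common population sd of 3.11 than the largest one is.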