# We illustrated analysis of variance (ANOVA) using simulated data from
# an experiment by Eysenck. The idea here is that memory is affected by
# level of processing: The more you process something in your brain, the
# more likely you are to remember it.
#
# In this data set, "Score" represents the count of words remembered by
# participants in each of five groups. The first group (counting) were
# told to count the letters in each word they saw. This represents a minimal
# level of processing, so the expectation is that when the participants
# are later asked to write down the words they remember, they won't recall
# very many. The next group (rhyming) were told to think of a word that
# rhymes with each word they see. That represents a somewhat higher level
# of processing, so the expectation is that recall should be a bit better.
# The next group (adjective) were instructed to think of an adjective that
# would appropriately modify the word they see. The highest level of processing
# is the imagery group; they were told to construct a mental image of each
# word. The people in those four groups weren't told until after they had
# seen the words that they would be asked to remember them. The fifth group
# (intent) had no instructions except that they were told to try to remember
# the words.
# Here are the data:

> Eysenck <- read.csv("http://faculty.ucmerced.edu/jvevea/classes/105/data/Eysenck.csv")
> Eysenck
   Score     Group
1      9  counting
2      8  counting
3      6  counting
4      8  counting
5     10  counting
6      4  counting
7      6  counting
8      5  counting
9      7  counting
10     7  counting
11     7   rhyming
12     9   rhyming
13     6   rhyming
14     6   rhyming
15     6   rhyming
16    11   rhyming
17     6   rhyming
18     3   rhyming
19     8   rhyming
20     7   rhyming
21    11 adjective
22    13 adjective
23     8 adjective
24     6 adjective
25    14 adjective
26    11 adjective
27    13 adjective
28    13 adjective
29    10 adjective
30    11 adjective
31    12   imagery
32    11   imagery
33    16   imagery
34    11   imagery
35     9   imagery
36    23   imagery
37    12   imagery
38    10   imagery
39    19   imagery
40    11   imagery
41    10    intent
42    19    intent
43    14    intent
44     5    intent
45    10    intent
46    11    intent
47    14    intent
48    15    intent
49    11    intent
50    11    intent
> attach(Eysenck)

# The Powerpoint for today mentions two different ways we can
# use this information to estimate the population variance. The
# first way, analogous to the pooled variance in the t test,
# simply calculates a weighted average of the variances in the
# five groups. Here are the variances:

> tapply(Score, Group, var)
adjective  counting   imagery    intent   rhyming 
 6.222222  3.333333 20.266667 14.000000  4.544444 

# In this example, n=10 in all five groups, so the weighted
# average is the same as the simple average:

> mean(tapply(Score, Group, var))
[1] 9.673333

# That variance estimate, known as the within-groups mean square,
# makes sense even if the null hypothesis (that all five population
# means are equal) is false.

> mean(tapply(Score, Group, var)) -> MSw
> MSw
[1] 9.673333

# The second way to estimate variability is to use an algebraic
# manipulation of what the central limit theorem says about the
# variability of sample means. (See Powerpoint slides 5 and 6 from
# today.)
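As a check on the arithmetic, the within-groups mean square can be rebuilt from scratch. This is a minimal self-contained sketch using the scores transcribed above (stored under fresh names, `score` and `group`, so it doesn't mask the attached Eysenck columns): it computes MSw once as the average of the five group variances and once as the pooled sum of squared deviations over the pooled degrees of freedom.

```r
# The 50 scores from the listing above, in group order
score <- c(9, 8, 6, 8, 10, 4, 6, 5, 7, 7,          # counting
           7, 9, 6, 6, 6, 11, 6, 3, 8, 7,          # rhyming
           11, 13, 8, 6, 14, 11, 13, 13, 10, 11,   # adjective
           12, 11, 16, 11, 9, 23, 12, 10, 19, 11,  # imagery
           10, 19, 14, 5, 10, 11, 14, 15, 11, 11)  # intent
group <- rep(c("counting", "rhyming", "adjective", "imagery", "intent"),
             each = 10)

# Way 1: average of the five group variances (equal n, so no weighting needed)
MSw1 <- mean(tapply(score, group, var))

# Way 2: pooled sum of squared deviations over pooled df, 50 - 5 = 45
ss   <- tapply(score, group, function(x) sum((x - mean(x))^2))
MSw2 <- sum(ss) / (length(score) - 5)

round(c(MSw1, MSw2), 6)   # both 9.673333, matching the transcript
```

The second route also exposes the pieces of the ANOVA table: `sum(ss)` is the Residuals sum of squares (435.30) and 45 is the Residuals df.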
# This way of estimating the variance is a reasonable approach
# if the null hypothesis is true, but will be too big if the population
# means are NOT all the same:

> 10 * var(tapply(Score, Group, mean)) -> MSb
> MSb
[1] 87.88

# The estimate that depends on the null hypothesis is more than nine
# times as big as the pooled estimate:

> MSb/MSw
[1] 9.084769

# Given that some assumptions are met and the null hypothesis is
# true, that statistic follows an F distribution. The degrees of
# freedom for the denominator equal (n-1) for each group times
# the number of groups:

> 5*9
[1] 45

# The df for the numerator is the number of groups minus 1 (so 4
# in this example, because there are five groups).

# We calculate the p-value for this F statistic and find that there
# is significant evidence that the null hypothesis is false: it is
# not the case that all five population means are equal.

> F <- MSb/MSw
> F
[1] 9.084769
> 1 - pf(F, 4, 45)
[1] 1.81549e-05

# Here are the sample means from this analysis:

> tapply(Score, Group, mean)
adjective  counting   imagery    intent   rhyming 
     11.0       7.0      13.4      12.0       6.9 

# Here's one way to visually compare the groups:

> boxplot(Score~Group)

# And here's the easy way to get the ANOVA in R. Note
# that the mean squares, F statistic, and p-value match
# what we calculated the hard way:

> anova(lm(Score~Group))
Analysis of Variance Table

Response: Score
          Df Sum Sq Mean Sq F value    Pr(>F)    
Group      4 351.52  87.880  9.0848 1.815e-05 ***
Residuals 45 435.30   9.673                      
---
Signif.
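The whole F test can likewise be rebuilt from its pieces. This sketch starts from the group means and MSw reported above, forms MSb as n times the variance of the means, takes the ratio, and gets the p-value from the upper tail of F(4, 45). (Using `lower.tail = FALSE` is numerically equivalent to `1 - pf(...)` here, and safer when p-values get very small.)

```r
# Group means and MSw as printed in the transcript
means <- c(adjective = 11.0, counting = 7.0, imagery = 13.4,
           intent = 12.0, rhyming = 6.9)
MSw <- 9.673333

# Between-groups mean square: n * variance of the sample means
MSb <- 10 * var(means)                               # 87.88

# The test statistic and its p-value on 4 and 45 df
F <- MSb / MSw                                       # 9.084769
p <- pf(F, df1 = 4, df2 = 45, lower.tail = FALSE)    # 1.81549e-05

round(c(MSb = MSb, F = F, p = p), 6)
```

Multiplying `var(means)` by one less than the number of groups (4) instead of by n gives the between-groups Sum Sq entry divided by 10; equivalently, `10 * 4 * var(means)` reproduces the 351.52 in the table.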
codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

# We noted that an ANOVA can be calculated with only two
# groups, in which case it is exactly equivalent to the
# t test:

> attach(JackStatlab)
> t.test(CTPEA~CBSEX, var.equal=TRUE)

	Two Sample t-test

data:  CTPEA by CBSEX
t = -2.8494, df = 48, p-value = 0.006434
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -17.373374  -2.998421
sample estimates:
mean in group 0 mean in group 1 
       76.73077        86.91667 

> anova(lm(CTPEA~CBSEX))
Analysis of Variance Table

Response: CTPEA
          Df Sum Sq Mean Sq F value   Pr(>F)   
CBSEX      1 1294.8 1294.83  8.1192 0.006434 **
Residuals 48 7654.9  159.48                    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

# Note that the p-values for those two analyses are
# identical. In fact, the F with 1 df in the numerator
# is just the square of the t (the discrepancy is just
# rounding error). Note the parentheses below: without
# them, R's precedence rules would compute -(2.8494^2)
# and give a negative answer.

> (-2.8494)^2
[1] 8.11908

# Returning to the Eysenck experiment, we consider the
# assumptions of the ANOVA. (See slide 7 from today for
# a list of the assumptions.)

# We can assess the assumption of equal variances...

> tapply(Score, Group, var)
adjective  counting   imagery    intent   rhyming 
 6.222222  3.333333 20.266667 14.000000  4.544444 

# ...by comparing the standard deviations. (Remember, the
# squared metric of the variances exaggerates the differences.)
# The sd for the imagery group appears large compared to the
# others:

> tapply(Score, Group, sd)
adjective  counting   imagery    intent   rhyming 
 2.494438  1.825742  4.501851  3.741657  2.131770 

# We can assess whether these are differences we should take
# seriously by simulating draws of n=10 from a normal distribution
# with sd equal to the square root of the MSw. Here's the ANOVA
# again:

> anova(lm(Score~Group))
Analysis of Variance Table

Response: Score
          Df Sum Sq Mean Sq F value    Pr(>F)    
Group      4 351.52  87.880  9.0848 1.815e-05 ***
Residuals 45 435.30   9.673                      
---
Signif.
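The t² = F identity holds for any two-group data set, not just this one. Here is a self-contained check using made-up data (an arbitrary seed, not the JackStatlab file): the pooled-variance t statistic, squared, equals the one-way ANOVA F exactly.

```r
# Two groups of n = 10 from normals with different means; any data would do
set.seed(1)
y <- c(rnorm(10, mean = 0), rnorm(10, mean = 1))
g <- factor(rep(0:1, each = 10))

# Pooled t (var.equal = TRUE matches the equal-variance ANOVA model)
t.stat <- t.test(y ~ g, var.equal = TRUE)$statistic

# F from the equivalent one-way ANOVA
F.stat <- anova(lm(y ~ g))[1, "F value"]

c(t.squared = unname(t.stat)^2, F = F.stat)   # the two values agree
```

This is why the transcript's two p-values (0.006434 and 0.006434) match to every printed digit: they are the same test.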
codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

# So a pooled estimate of the sd would be

> sqrt(9.673)
[1] 3.110145

# If I simulate several draws from a normal distribution
# with that standard deviation, I do get estimates that
# are higher than 4.5 (the largest standard deviation in
# the experiment), but I don't get any as small as the
# smallest, so I might be mildly concerned about the
# assumption of equal variances in the population:

> sd( rnorm(10, 0, 3.11) )
[1] 3.464444
> sd( rnorm(10, 0, 3.11) )
[1] 2.201169
> sd( rnorm(10, 0, 3.11) )
[1] 2.257251
> sd( rnorm(10, 0, 3.11) )
[1] 2.993864
> sd( rnorm(10, 0, 3.11) )
[1] 2.564385
> sd( rnorm(10, 0, 3.11) )
[1] 4.378201
> sd( rnorm(10, 0, 3.11) )
[1] 3.046242
> sd( rnorm(10, 0, 3.11) )
[1] 3.342231
> sd( rnorm(10, 0, 3.11) )
[1] 5.678103
> sd( rnorm(10, 0, 3.11) )
[1] 3.417077
> sd( rnorm(10, 0, 3.11) )
[1] 2.866137
> sd( rnorm(10, 0, 3.11) )
[1] 3.097997
> sd( rnorm(10, 0, 3.11) )
[1] 2.743865

# Here, we assess the normality of each group (inasmuch as
# it is possible to do so with only 10 observations).
# Things look pretty good.

> qqnorm(Score[1:10]); qqline(Score[1:10])
> qqnorm(Score[11:20]); qqline(Score[11:20])
> qqnorm(Score[21:30]); qqline(Score[21:30])
> qqnorm(Score[31:40]); qqline(Score[31:40])
> qqnorm(Score[41:50]); qqline(Score[41:50])
>
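Repeating `sd(rnorm(10, 0, 3.11))` by hand gets tedious; `replicate()` does the same spot-check in bulk. This sketch (with an arbitrary seed for reproducibility; the simulated values are random, so no fixed output is shown) draws 1000 such standard deviations and asks how often they fall outside the range of the observed group sds, roughly 1.83 to 4.50.

```r
set.seed(105)   # arbitrary seed so the run is reproducible

# 1000 sds of samples of n = 10 from Normal(0, 3.11)
sim.sd <- replicate(1000, sd(rnorm(10, mean = 0, sd = 3.11)))

# Sampling distribution of the sd under equal population variances
quantile(sim.sd, c(0.025, 0.975))

# Proportions more extreme than the smallest and largest observed sds
mean(sim.sd < 1.825742)   # below the counting group's sd
mean(sim.sd > 4.501851)   # above the imagery group's sd
```

If simulated sds below 1.83 are rare, that echoes the mild concern above: the smallest observed sd is harder to reconcile with a common population sd of 3.11 than the largest one is.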