# We reviewed the dummy coding system we had created for
# the math practice ANOVA:

> MassedDummy
 [1] 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> SpacedDummy
 [1] 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0
> attach(math)

# Here are the ANOVA results letting R choose a way to
# code Practice. Note that R just gives us an overall,
# 2 df test of the whole model:

> anova(lm(Time~Practice))
Analysis of Variance Table

Response: Time
          Df Sum Sq Mean Sq F value   Pr(>F)   
Practice   2  13772  6885.9  5.8504 0.009559 **
Residuals 21  24717  1177.0                    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

# When we do the same thing using our dummy variables,
# R doesn't know that this is an ANOVA as opposed to an
# ordinary regression, so it tries to give us an F test
# for each predictor rather than an overall F test:

> anova(lm(Time~MassedDummy+SpacedDummy))
Analysis of Variance Table

Response: Time
            Df  Sum Sq Mean Sq F value   Pr(>F)   
MassedDummy  1 12384.2 12384.2 10.5219 0.003888 **
SpacedDummy  1  1387.6  1387.6  1.1789 0.289885   
Residuals   21 24716.8  1177.0                    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

# If we apply the summary() wrapper to the model in which R
# chose the parameterization for Practice, we DO see separate
# coefficients; but we need to remember that under ANOVA rules,
# we can't interpret the tests, and we don't really know what
# the coefficients mean, because we don't know how R parameterized
# the problem:

> summary(lm(Time~Practice))

Call:
lm(formula = Time ~ Practice)

Residuals:
   Min     1Q Median     3Q    Max 
-70.62 -22.25   4.00  28.34  43.38 

Coefficients:
               Estimate Std. Error t value Pr(>|t|)    
(Intercept)       55.12      12.13   4.545 0.000177 ***
Practicenone      57.50      17.15   3.352 0.003019 ** 
Practicespaced    38.88      17.15   2.266 0.034122 *  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 34.31 on 21 degrees of freedom
Multiple R-squared:  0.3578,    Adjusted R-squared:  0.2967 
F-statistic:  5.85 on 2 and 21 DF,  p-value: 0.009559

# We can get the overall F test from our dummy coded
# version by looking at the bottom of the summary()
# output, but again, we should not interpret the tests
# on the slopes.

> summary(lm(Time~MassedDummy+SpacedDummy))

Call:
lm(formula = Time ~ MassedDummy + SpacedDummy)

Residuals:
   Min     1Q Median     3Q    Max 
-70.62 -22.25   4.00  28.34  43.38 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   112.62      12.13   9.285 6.98e-09 ***
MassedDummy   -57.50      17.15  -3.352  0.00302 ** 
SpacedDummy   -18.62      17.15  -1.086  0.28989    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 34.31 on 21 degrees of freedom
Multiple R-squared:  0.3578,    Adjusted R-squared:  0.2967 
F-statistic:  5.85 on 2 and 21 DF,  p-value: 0.009559

# Previously, we dummy coded Massed and Spaced practice,
# and let the None group be identified by 0, 0 on those
# two variables. We could just as well have coded the None
# group, and ignored one of the other dummy variables:

> NoneDummy <- c(rep(0,16),rep(1,8))
> NoneDummy
 [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1

# We can get the same ANOVA with any two of our three dummy
# variables:

> summary(lm(Time~MassedDummy+NoneDummy))

Call:
lm(formula = Time ~ MassedDummy + NoneDummy)

Residuals:
   Min     1Q Median     3Q    Max 
-70.62 -22.25   4.00  28.34  43.38 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)    94.00      12.13   7.750 1.37e-07 ***
MassedDummy   -38.87      17.15  -2.266   0.0341 *  
NoneDummy      18.62      17.15   1.086   0.2899    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 34.31 on 21 degrees of freedom
Multiple R-squared:  0.3578,    Adjusted R-squared:  0.2967 
F-statistic:  5.85 on 2 and 21 DF,  p-value: 0.009559
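# (A sketch, not run in the session above: we could also ask R directly
# how it parameterizes a factor. contrasts() prints the coding matrix R
# will use, and model.matrix() prints the predictor columns it actually
# builds. For Practice, R's default "treatment" coding makes the first
# alphabetical level, massed, the 0, 0 baseline group.)

contrasts(Practice)      # default contr.treatment coding for the factor
model.matrix(~Practice)  # the dummy columns R constructs behind the scenes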
# Here's another way to do it. Note that this matches the
# output from the model where we let R do the coding, so now
# we know how R parameterized the problem:

> summary(lm(Time~SpacedDummy+NoneDummy))

Call:
lm(formula = Time ~ SpacedDummy + NoneDummy)

Residuals:
   Min     1Q Median     3Q    Max 
-70.62 -22.25   4.00  28.34  43.38 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)    55.12      12.13   4.545 0.000177 ***
SpacedDummy    38.87      17.15   2.266 0.034122 *  
NoneDummy      57.50      17.15   3.352 0.003019 ** 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 34.31 on 21 degrees of freedom
Multiple R-squared:  0.3578,    Adjusted R-squared:  0.2967 
F-statistic:  5.85 on 2 and 21 DF,  p-value: 0.009559

# Regardless of which coding system we use, we can recover
# the group means (with a little rounding error) by calculating
# the conditional means predicted by the regression model:

> tapply(Time, Practice, mean)
 massed    none  spaced 
 55.125 112.625  94.000 
> 55.125 + 38.87
[1] 93.995
> 55.125 + 57.5
[1] 112.625

# Another coding system that can be interesting is called
# "effects coding." We can create an effects coding system
# by replacing the 0's with -1's in the group that has zeros
# for both dummy variables (here, the None group):

> e1 <- c(rep(1,8),rep(0,8),rep(-1,8))
> e1
 [1]  1  1  1  1  1  1  1  1  0  0  0  0  0  0  0  0 -1 -1 -1 -1 -1 -1 -1 -1
> e2 <- c(rep(0,8),rep(1,8),rep(-1,8))
> e2
 [1]  0  0  0  0  0  0  0  0  1  1  1  1  1  1  1  1 -1 -1 -1 -1 -1 -1 -1 -1

# If we regress on e1 and e2, we get the same ANOVA, but this
# time, the intercept is the grand mean (i.e., the mean of all
# 24 observations without regard to group membership), and the
# slopes represent the "effect" of being in a particular group:

> summary(lm(Time~e1+e2))

Call:
lm(formula = Time ~ e1 + e2)

Residuals:
   Min     1Q Median     3Q    Max 
-70.62 -22.25   4.00  28.34  43.38 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   87.250      7.003  12.459 3.63e-11 ***
e1           -32.125      9.904  -3.244  0.00389 ** 
e2             6.750      9.904   0.682  0.50296    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 34.31 on 21 degrees of freedom
Multiple R-squared:  0.3578,    Adjusted R-squared:  0.2967 
F-statistic:  5.85 on 2 and 21 DF,  p-value: 0.009559
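# (A sketch, not run above: R has effects coding built in as contr.sum,
# so we did not actually need to build e1 and e2 by hand. One caution if
# we tried it: contr.sum assigns the -1's to the LAST level in
# alphabetical order, which here is Spaced rather than None, so the two
# slopes would be the Massed and None effects instead.)

summary(lm(Time ~ C(Practice, contr.sum)))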
# As stated above, the intercept is the grand mean:

> mean(Time)
[1] 87.25

# The "effect" of being in the Massed group is -32.125:

> 87.25 - 32.125
[1] 55.125

# And the effect of being in the Spaced group is 6.75:

> 87.25 + 6.75
[1] 94

# There isn't a slope that represents the effect of being
# in the None group, but if we wanted to know that, we could
# simply multiply the sum of the other effects by -1:

> -(-32.125+6.75)
[1] 25.375
> 87.25 + 25.375
[1] 112.625

# Just as was true for the t test, ANY coding system that uses
# as many variables as the number of groups minus 1 to identify
# each group by a unique pattern will accomplish the same ANOVA
# (although there is no particular reason we would want to use
# an arbitrary coding like this one):

> nonsense1 <- c(rep(32,8),rep(2389,8),rep(-6,8))
> nonsense2 <- c(rep(pi,8),rep(0,8),rep(11,8))
> nonsense1
 [1]   32   32   32   32   32   32   32   32 2389 2389 2389 2389 2389 2389 2389
[16] 2389   -6   -6   -6   -6   -6   -6   -6   -6
> nonsense2
 [1]  3.141593  3.141593  3.141593  3.141593  3.141593  3.141593  3.141593
 [8]  3.141593  0.000000  0.000000  0.000000  0.000000  0.000000  0.000000
[15]  0.000000  0.000000 11.000000 11.000000 11.000000 11.000000 11.000000
[22] 11.000000 11.000000 11.000000
> summary(lm(Time~nonsense1+nonsense2))

Call:
lm(formula = Time ~ nonsense1 + nonsense2)

Residuals:
   Min     1Q Median     3Q    Max 
-70.62 -22.25   4.00  28.34  43.38 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)   
(Intercept) 30.891327  18.014850   1.715  0.10111   
nonsense1    0.026416   0.009148   2.888  0.00881 **
nonsense2    7.444743   2.214921   3.361  0.00296 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 34.31 on 21 degrees of freedom
Multiple R-squared:  0.3578,    Adjusted R-squared:  0.2967 
F-statistic:  5.85 on 2 and 21 DF,  p-value: 0.009559
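# (A sketch, not run above, of one way to verify that all of these
# codings fit the same model: any full-rank coding predicts the same
# three group means, so the fitted values should agree across models.)

all.equal(fitted(lm(Time ~ MassedDummy + SpacedDummy)),
          fitted(lm(Time ~ nonsense1 + nonsense2)))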
# In response to a question from class, I showed a table of the
# dummy and nonsense coding systems so that we could see that
# each group is identified by a unique pattern. The Massed
# group has the values 1, 0 in the dummy coding system; the
# Spaced group has the values 0, 1; and the None group has the
# values 0, 0. Because each group has a unique pattern, the
# regression will result in the ANOVA. For the nonsense coding
# system, Massed is identified by the values 32, pi; Spaced is
# identified by 2389, 0; and None is identified by -6, 11. Again,
# each pattern is unique, so the regression accomplishes the ANOVA.

> cbind(as.character(Practice),MassedDummy,SpacedDummy,nonsense1,nonsense2)
               MassedDummy SpacedDummy nonsense1 nonsense2          
 [1,] "massed" "1"         "0"         "32"      "3.14159265358979" 
 [2,] "massed" "1"         "0"         "32"      "3.14159265358979" 
 [3,] "massed" "1"         "0"         "32"      "3.14159265358979" 
 [4,] "massed" "1"         "0"         "32"      "3.14159265358979" 
 [5,] "massed" "1"         "0"         "32"      "3.14159265358979" 
 [6,] "massed" "1"         "0"         "32"      "3.14159265358979" 
 [7,] "massed" "1"         "0"         "32"      "3.14159265358979" 
 [8,] "massed" "1"         "0"         "32"      "3.14159265358979" 
 [9,] "spaced" "0"         "1"         "2389"    "0"                
[10,] "spaced" "0"         "1"         "2389"    "0"                
[11,] "spaced" "0"         "1"         "2389"    "0"                
[12,] "spaced" "0"         "1"         "2389"    "0"                
[13,] "spaced" "0"         "1"         "2389"    "0"                
[14,] "spaced" "0"         "1"         "2389"    "0"                
[15,] "spaced" "0"         "1"         "2389"    "0"                
[16,] "spaced" "0"         "1"         "2389"    "0"                
[17,] "none"   "0"         "0"         "-6"      "11"               
[18,] "none"   "0"         "0"         "-6"      "11"               
[19,] "none"   "0"         "0"         "-6"      "11"               
[20,] "none"   "0"         "0"         "-6"      "11"               
[21,] "none"   "0"         "0"         "-6"      "11"               
[22,] "none"   "0"         "0"         "-6"      "11"               
[23,] "none"   "0"         "0"         "-6"      "11"               
[24,] "none"   "0"         "0"         "-6"      "11"               

# We calculated two contrasts using the defining formula from
# the PowerPoint (slide three). The first contrast compares the
# combined Massed and Spaced practice groups with the None group:

> SSc1 <- (1*mean(Time[1:8]) + 1*mean(Time[9:16]) - 2*mean(Time[17:24]))^2
> SSc1 <- SSc1 / (1^2/8 + 1^2/8 + (-2)^2/8)
> SSc1
[1] 7726.688

# Any contrast has 1 df, so this is already a mean square. We
# can calculate an F statistic to test the contrast by dividing
# by the ANOVA error mean square. (The 1177.0 here comes from the
# ANOVA output that we have already seen in this transcript.)

> Fc1 <- SSc1/1177.0
> Fc1
[1] 6.56473

# The contrast is significant, so we can reject the null hypothesis
# and conclude that practice differs from no practice:

> 1-pf(Fc1, 1, 21)
[1] 0.01815602

# Here's the second contrast, which compares the Massed and Spaced
# groups:

> SSc2 <- (1*mean(Time[1:8]) + (-1)*mean(Time[9:16]) + 0*mean(Time[17:24]))^2
> SSc2 <- SSc2 / (1^2/8 + 1^2/8 + 0^2/8)
> SSc2
[1] 6045.062
> Fc2 <- SSc2 / 1177
> Fc2
[1] 5.135992

# It, too, is significant:

> 1-pf(Fc2, 1, 21)
[1] 0.03412287

# If these contrasts were planned, but not orthogonal, we would need
# to divide our .05 alpha by the number of contrasts, and the second
# one would not be significant. But because these contrasts ARE
# orthogonal (see slide six for the check of orthogonality), and were
# specified in advance, we can interpret the tests without such an
# adjustment.
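# (A sketch of the defining formula wrapped in a reusable function.
# contrastF is a hypothetical helper, not something we wrote in class:
# 'w' holds the contrast weights, 'means' the group means, 'n' the
# per-group sample sizes, and 'MSe' the ANOVA error mean square.)

contrastF <- function(w, means, n, MSe) {
  SSc <- sum(w * means)^2 / sum(w^2 / n)  # 1-df contrast SS (= its MS)
  SSc / MSe                               # F statistic on 1 and error df
}
# Reordering the tapply() means to match the weights (massed, spaced,
# none), this reproduces Fc1 = 6.56473:
contrastF(c(1, 1, -2),
          tapply(Time, Practice, mean)[c("massed", "spaced", "none")],
          8, 1177.0)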
# Here, we created a "contrast coding system" to help us understand
# the importance of orthogonality. The variable c1 has the values 1
# for the Massed and Spaced groups and -2 for the None group (the
# same as our first contrast coefficients). The variable c2 has the
# values 1 for Massed, -1 for Spaced, and 0 for None, just like our
# second set of contrast coefficients. Because these two variables
# identify each group with a unique pattern of values (1, 1 for
# Massed, 1, -1 for Spaced, and -2, 0 for None), the regression will
# produce the same ANOVA.

> c1 <- c(rep(1,16),rep(-2,8))
> c2 <- c(rep(1,8),rep(-1,8),rep(0,8))
> cbind(as.character(Practice),c1,c2)
               c1   c2  
 [1,] "massed" "1"  "1" 
 [2,] "massed" "1"  "1" 
 [3,] "massed" "1"  "1" 
 [4,] "massed" "1"  "1" 
 [5,] "massed" "1"  "1" 
 [6,] "massed" "1"  "1" 
 [7,] "massed" "1"  "1" 
 [8,] "massed" "1"  "1" 
 [9,] "spaced" "1"  "-1"
[10,] "spaced" "1"  "-1"
[11,] "spaced" "1"  "-1"
[12,] "spaced" "1"  "-1"
[13,] "spaced" "1"  "-1"
[14,] "spaced" "1"  "-1"
[15,] "spaced" "1"  "-1"
[16,] "spaced" "1"  "-1"
[17,] "none"   "-2" "0" 
[18,] "none"   "-2" "0" 
[19,] "none"   "-2" "0" 
[20,] "none"   "-2" "0" 
[21,] "none"   "-2" "0" 
[22,] "none"   "-2" "0" 
[23,] "none"   "-2" "0" 
[24,] "none"   "-2" "0" 
> summary(lm(Time~c1+c2))

Call:
lm(formula = Time ~ c1 + c2)

Residuals:
   Min     1Q Median     3Q    Max 
-70.62 -22.25   4.00  28.34  43.38 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   87.250      7.003  12.459 3.63e-11 ***
c1           -12.688      4.952  -2.562   0.0182 *  
c2           -19.438      8.577  -2.266   0.0341 *  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 34.31 on 21 degrees of freedom
Multiple R-squared:  0.3578,    Adjusted R-squared:  0.2967 
F-statistic:  5.85 on 2 and 21 DF,  p-value: 0.009559

# But this time, the t tests on the slopes have exactly the
# same p values as the F tests for the contrasts that we
# calculated above. That's because these t tests are just the
# square roots of the F statistics (subject to a bit of rounding
# error here; also note that ^ binds more tightly than the unary
# minus in R, so -2.562^2 means -(2.562^2) and prints as negative):

> -2.562^2
[1] -6.563844
> Fc1
[1] 6.56473

# When we checked our contrast coefficients for orthogonality, what
# we were actually doing was confirming that the variables in the
# corresponding contrast coding system are uncorrelated:

> cor(c1,c2)
[1] 0

# Remember the artificial data set we used when we first introduced
# multiple regression, where we saw that if the predictors (here, x1
# and x2) are perfectly uncorrelated, the slopes in simple regressions
# were the same as the slopes in the multiple regression?

> uncor
    y x1 x2
1  15  1  1
2 -30  1 10
3  30  3  2
4  -5  3  9
5  45  5  3
6  20  5  8
7  60  7  4
8  45  7  7
9  75  9  5
10 70  9  6
> cor(uncor$x1,uncor$x2)
[1] 0

# That was true because the two variables were trying to help
# us understand completely separate parts of the variation in y.
# The same idea holds for orthogonal contrasts: c1 and c2 are
# helping us understand completely different parts of the variation
# in Time, so it's almost as if we had a different sample of people
# for each question.

# If we have a set of orthogonal contrasts, we can make R calculate
# them by forcing it to use a contrast coding parameterization. Here,
# we create a matrix that contains the contrasts we are interested
# in (keeping in mind that R thinks about levels of the factor in
# alphabetical order):

> #  Massed None Spaced
> mycontrasts <- matrix(c(
+   1, -2,  1,
+   1,  0, -1), 2, 3, byrow=TRUE)
> mycontrasts
     [,1] [,2] [,3]
[1,]    1   -2    1
[2,]    1    0   -1

# If we "transpose" these (i.e., rotate them into the same orientation
# they had when we created c1 and c2), we can use a "contrasts" command
# to force R to use that parameterization for the Practice factor:

> t(mycontrasts)
     [,1] [,2]
[1,]    1    1
[2,]   -2    0
[3,]    1   -1
> contrasts(Practice) <- t(mycontrasts)
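# (A sketch, not run above: the same parameterization can also be
# requested for a single model, without modifying the Practice factor
# itself, by passing the coding matrix through lm()'s contrasts
# argument.)

lm(Time ~ Practice, contrasts = list(Practice = t(mycontrasts)))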
# Note that the results for "Practice1" and "Practice2"
# here are the same as the results for c1 and c2 in our
# earlier analysis:

> summary(lm(Time~Practice))

Call:
lm(formula = Time ~ Practice)

Residuals:
   Min     1Q Median     3Q    Max 
-70.62 -22.25   4.00  28.34  43.38 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   87.250      7.003  12.459 3.63e-11 ***
Practice1    -12.688      4.952  -2.562   0.0182 *  
Practice2    -19.438      8.577  -2.266   0.0341 *  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 34.31 on 21 degrees of freedom
Multiple R-squared:  0.3578,    Adjusted R-squared:  0.2967 
F-statistic:  5.85 on 2 and 21 DF,  p-value: 0.009559
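# (A closing sketch, assuming the equal group sizes of 8 that we have
# here and orthogonal weights: with equal n, each contrast-coded slope
# equals the contrast value divided by the sum of the squared weights,
# which is why the t test on the slope is a test of the contrast.)

psi1 <- sum(c(1, 1, -2) *
            tapply(Time, Practice, mean)[c("massed", "spaced", "none")])
psi1 / sum(c(1, 1, -2)^2)  # -76.125 / 6 = -12.6875, the Practice1 slope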