# A student asked a question about what a nonlinear relationship # would look like in a scatterplot. We started by addressing a # common misconception: a non-relationship is NOT a nonlinear # relationship. For example... > plot(rnorm(30),rnorm(30)) # ...looks like a random blob (in fact, this would have been a # really nice residuals plot), and people are tempted to say # "that's not linear." But in fact, it's a linear relationship # with a slope of zero. # Here's a linear relationship with a slope of 10: > x <- rnorm(50, 50, 10) > y <- 10 + 10*x > plot(x,y) # If we add some noise, it still looks linear in the scatterplot: > plot(x,y+rnorm(50,0,30)) # Now we create a relationship that is curvilinear: > y <- 10 + .3*x^2.5 > plot(x,y) # If I add some noise, I can still vaguely see a curve in # the scatterplot, but it's obscure: > noise <- rnorm(50,0,500) > plot(x,y+noise) # However, the curve is much more evident in the residuals plot: > temp <- lm((y+noise)~x) > plot(temp$fit,temp$res) # We returned to our Eysenck ANOVA example. 
Last time, we # created a dummy coding system to identify the five groups: > attach(Eysenck) > cbind(Group,d1,d2,d3,d4) Group d1 d2 d3 d4 [1,] "counting" "1" "0" "0" "0" [2,] "counting" "1" "0" "0" "0" [3,] "counting" "1" "0" "0" "0" [4,] "counting" "1" "0" "0" "0" [5,] "counting" "1" "0" "0" "0" [6,] "counting" "1" "0" "0" "0" [7,] "counting" "1" "0" "0" "0" [8,] "counting" "1" "0" "0" "0" [9,] "counting" "1" "0" "0" "0" [10,] "counting" "1" "0" "0" "0" [11,] "rhyming" "0" "1" "0" "0" [12,] "rhyming" "0" "1" "0" "0" [13,] "rhyming" "0" "1" "0" "0" [14,] "rhyming" "0" "1" "0" "0" [15,] "rhyming" "0" "1" "0" "0" [16,] "rhyming" "0" "1" "0" "0" [17,] "rhyming" "0" "1" "0" "0" [18,] "rhyming" "0" "1" "0" "0" [19,] "rhyming" "0" "1" "0" "0" [20,] "rhyming" "0" "1" "0" "0" [21,] "adjective" "0" "0" "1" "0" [22,] "adjective" "0" "0" "1" "0" [23,] "adjective" "0" "0" "1" "0" [24,] "adjective" "0" "0" "1" "0" [25,] "adjective" "0" "0" "1" "0" [26,] "adjective" "0" "0" "1" "0" [27,] "adjective" "0" "0" "1" "0" [28,] "adjective" "0" "0" "1" "0" [29,] "adjective" "0" "0" "1" "0" [30,] "adjective" "0" "0" "1" "0" [31,] "imagery" "0" "0" "0" "1" [32,] "imagery" "0" "0" "0" "1" [33,] "imagery" "0" "0" "0" "1" [34,] "imagery" "0" "0" "0" "1" [35,] "imagery" "0" "0" "0" "1" [36,] "imagery" "0" "0" "0" "1" [37,] "imagery" "0" "0" "0" "1" [38,] "imagery" "0" "0" "0" "1" [39,] "imagery" "0" "0" "0" "1" [40,] "imagery" "0" "0" "0" "1" [41,] "intent" "0" "0" "0" "0" [42,] "intent" "0" "0" "0" "0" [43,] "intent" "0" "0" "0" "0" [44,] "intent" "0" "0" "0" "0" [45,] "intent" "0" "0" "0" "0" [46,] "intent" "0" "0" "0" "0" [47,] "intent" "0" "0" "0" "0" [48,] "intent" "0" "0" "0" "0" [49,] "intent" "0" "0" "0" "0" [50,] "intent" "0" "0" "0" "0" > d1 [1] 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 [39] 0 0 0 0 0 0 0 0 0 0 0 0 # When we regress on a factor like Group, R knows that we want # it to create a coding system like that for us. 
If we look at # the output for the individual variables R has created, we can # figure out what coding it used: > summary(lm(Score~Group)) Call: lm(formula = Score ~ Group) Residuals: Min 1Q Median 3Q Max -7.00 -1.85 -0.45 2.00 9.60 Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) 11.0000 0.9835 11.184 1.39e-14 *** Groupcounting -4.0000 1.3909 -2.876 0.00614 ** Groupimagery 2.4000 1.3909 1.725 0.09130 . Groupintent 1.0000 1.3909 0.719 0.47589 Grouprhyming -4.1000 1.3909 -2.948 0.00506 ** --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 Residual standard error: 3.11 on 45 degrees of freedom Multiple R-squared: 0.4468, Adjusted R-squared: 0.3976 F-statistic: 9.085 on 4 and 45 DF, p-value: 1.815e-05 # The intercept is 11, and that will match the group that # was coded 0, 0, 0, 0. That's the adjective group: > tapply(Score, Group, mean) adjective counting imagery intent rhyming 11.0 7.0 13.4 12.0 6.9 # The first slope is -4, and 11 - 4 gets us the mean of the # counting group; so R has created a dummy code for the Counting # group. In fact, R has used a dummy coding system just like # the one we created, except that it chose the group that was # alphabetically first to be the one that didn't get its own # dummy code (where we chose the Intentional group). # We can also ask R directly how it is coding the factor: > contrasts(factor(Group)) counting imagery intent rhyming adjective 0 0 0 0 counting 1 0 0 0 imagery 0 1 0 0 intent 0 0 1 0 rhyming 0 0 0 1 # In the Powerpoint, we created a set of four linear contrasts # that represented reasonable questions about the five group # means. The first one compared the Counting and Rhyming groups # with the Adjective and Imagery groups. Slide Five in today's # Powerpoint gives the formula for calculating a contrast sum # of squares (which is also the mean square, because contrasts # always have one degree of freedom). 
# We calculate that here: > (1/2*13.4 + 1/2*11 - 1/2*7 - 1/2*6.9)^2 / ((-1/2)^2/10 + (-1/2)^2/10 + (1/2)^2/10 + (1/2)^2/10) -> c1MS > c1MS [1] 275.625 # If we divide that mean square by the error mean square from # the ANOVA, we get a test of the contrast: > Fc1 <- c1MS/ 9.673 > Fc1 [1] 28.49426 # The test is highly significant; there is clear evidence that # the two highest processing groups differ from the two lowest: > 1-pf(Fc1,1,45) [1] 2.962611e-06 # We demonstrated in the Powerpoint that the particular set # of contrasts we defined are mutually orthogonal. What that # means, in effect, is that we can use them to create a coding # system for the ANOVA in which the four predictors are perfectly # uncorrelated with each other, dividing the explained variability # of Score into completely separate partitions. The coding system # is created by rotating the contrasts 90 degrees. For example, # the first contrast (-1/2, -1/2, 1/2, 1/2, 0) becomes a variable # with the values -1/2 for the Counting group, -1/2 for the Rhyming # group, 1/2 for the Adjective group, 1/2 for the Imagery group, # and 0 for the Intentional group: > c1 <- c(rep(-1/2, 20), rep(1/2, 20), rep(0,10) ) # We create the other three contrasts the same way: > c2 <- c(rep(0, 20), rep(-1,10), rep(1,10), rep(0,10)) > c3 <- c(rep(-1, 10), rep(1, 10), rep(0,30)) > c4 <- c(rep(-1, 40), rep(4, 10)) # We now have a system of four variables that identifies each # group with a unique pattern of values, so we know that this # will estimate the same ANOVA: > cbind(Group,c1,c2,c3,c4) Group c1 c2 c3 c4 [1,] "counting" "-0.5" "0" "-1" "-1" [2,] "counting" "-0.5" "0" "-1" "-1" [3,] "counting" "-0.5" "0" "-1" "-1" [4,] "counting" "-0.5" "0" "-1" "-1" [5,] "counting" "-0.5" "0" "-1" "-1" [6,] "counting" "-0.5" "0" "-1" "-1" [7,] "counting" "-0.5" "0" "-1" "-1" [8,] "counting" "-0.5" "0" "-1" "-1" [9,] "counting" "-0.5" "0" "-1" "-1" [10,] "counting" "-0.5" "0" "-1" "-1" [11,] "rhyming" "-0.5" "0" "1" "-1" [12,] "rhyming" 
"-0.5" "0" "1" "-1" [13,] "rhyming" "-0.5" "0" "1" "-1" [14,] "rhyming" "-0.5" "0" "1" "-1" [15,] "rhyming" "-0.5" "0" "1" "-1" [16,] "rhyming" "-0.5" "0" "1" "-1" [17,] "rhyming" "-0.5" "0" "1" "-1" [18,] "rhyming" "-0.5" "0" "1" "-1" [19,] "rhyming" "-0.5" "0" "1" "-1" [20,] "rhyming" "-0.5" "0" "1" "-1" [21,] "adjective" "0.5" "-1" "0" "-1" [22,] "adjective" "0.5" "-1" "0" "-1" [23,] "adjective" "0.5" "-1" "0" "-1" [24,] "adjective" "0.5" "-1" "0" "-1" [25,] "adjective" "0.5" "-1" "0" "-1" [26,] "adjective" "0.5" "-1" "0" "-1" [27,] "adjective" "0.5" "-1" "0" "-1" [28,] "adjective" "0.5" "-1" "0" "-1" [29,] "adjective" "0.5" "-1" "0" "-1" [30,] "adjective" "0.5" "-1" "0" "-1" [31,] "imagery" "0.5" "1" "0" "-1" [32,] "imagery" "0.5" "1" "0" "-1" [33,] "imagery" "0.5" "1" "0" "-1" [34,] "imagery" "0.5" "1" "0" "-1" [35,] "imagery" "0.5" "1" "0" "-1" [36,] "imagery" "0.5" "1" "0" "-1" [37,] "imagery" "0.5" "1" "0" "-1" [38,] "imagery" "0.5" "1" "0" "-1" [39,] "imagery" "0.5" "1" "0" "-1" [40,] "imagery" "0.5" "1" "0" "-1" [41,] "intent" "0" "0" "0" "4" [42,] "intent" "0" "0" "0" "4" [43,] "intent" "0" "0" "0" "4" [44,] "intent" "0" "0" "0" "4" [45,] "intent" "0" "0" "0" "4" [46,] "intent" "0" "0" "0" "4" [47,] "intent" "0" "0" "0" "4" [48,] "intent" "0" "0" "0" "4" [49,] "intent" "0" "0" "0" "4" [50,] "intent" "0" "0" "0" "4" # When we confirmed that the contrasts were orthogonal, we # were actually assuring ourselves that these variables would # be perfectly uncorrelated: > cor(c1,c2) [1] 0 > cor(c1,c3) [1] 0 > cor(c1,c4) [1] 0 > cor(c2,c3) [1] 0 > cor(c2,c4) [1] 0 > cor(c3,c4) [1] 0 # Because each variable is uncorrelated with the others, # each one works with a completely separate partition of # the variability in Score, and we agree that we can interpret # the tests. (Note that the p value for c1 matches what we # got earlier when we calculated the contrast from the # formula for the contrast sum of squares.) 
> summary(lm(Score~c1+c2+c3+c4)) Call: lm(formula = Score ~ c1 + c2 + c3 + c4) Residuals: Min 1Q Median 3Q Max -7.00 -1.85 -0.45 2.00 9.60 Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) 10.0600 0.4399 22.872 < 2e-16 *** c1 5.2500 0.9835 5.338 2.96e-06 *** c2 1.2000 0.6955 1.725 0.0913 . c3 -0.0500 0.6955 -0.072 0.9430 c4 0.4850 0.2199 2.205 0.0326 * --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 Residual standard error: 3.11 on 45 degrees of freedom Multiple R-squared: 0.4468, Adjusted R-squared: 0.3976 F-statistic: 9.085 on 4 and 45 DF, p-value: 1.815e-05 # There's actually a much easier way to do orthogonal # contrasts in R. We've seen before that we can use the # contrasts() function to see how R has parameterized things: > contrasts(factor(Group)) counting imagery intent rhyming adjective 0 0 0 0 counting 1 0 0 0 imagery 0 1 0 0 intent 0 0 1 0 rhyming 0 0 0 1 # But we can also use that function to force R to use a # particular coding system. First, we create a matrix of # our contrasts. 
It's important to remember, though, that # R thinks about the levels of the factor in alphabetical # order, so it helps to have a comment that reminds us of # that order when we set up the matrix: > # adjective counting imagery intent rhyming > temp <- matrix(c( + 1/2, -1/2, 1/2, 0, -1/2, + -1, 0, 1, 0, 0, + 0, -1, 0, 0, 1, + -1, -1, -1, 4, -1), 4, 5, byrow=TRUE) # 4 rows, 5 columns > temp [,1] [,2] [,3] [,4] [,5] [1,] 0.5 -0.5 0.5 0 -0.5 [2,] -1.0 0.0 1.0 0 0.0 [3,] 0.0 -1.0 0.0 0 1.0 [4,] -1.0 -1.0 -1.0 4 -1.0 # So those are the contrasts we defined, but remember that the # coding system has to rotate them 90 degrees: > t(temp) [,1] [,2] [,3] [,4] [1,] 0.5 -1 0 -1 [2,] -0.5 0 -1 -1 [3,] 0.5 1 0 -1 [4,] 0.0 0 0 4 [5,] -0.5 0 1 -1 # Here's the current coding R is using: > contrasts(factor(Group)) counting imagery intent rhyming adjective 0 0 0 0 counting 1 0 0 0 imagery 0 1 0 0 intent 0 0 1 0 rhyming 0 0 0 1 # Here, we tell R that we would rather use our orthogonal # contrast coding: > Group <- factor(Group) > contrasts(Group) <- t(temp) # And that's the coding system that is now attached to # the factor Group: > contrasts(Group) [,1] [,2] [,3] [,4] adjective 0.5 -1 0 -1 counting -0.5 0 -1 -1 imagery 0.5 1 0 -1 intent 0.0 0 0 4 rhyming -0.5 0 1 -1 # We'll still get the same ANOVA: > anova(lm(Score~Group)) Analysis of Variance Table Response: Score Df Sum Sq Mean Sq F value Pr(>F) Group 4 351.52 87.880 9.0848 1.815e-05 *** Residuals 45 435.30 9.673 --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 # But, because we have created a set of orthogonal contrasts, # we know that we can now go ahead and test the specific slopes. # Note that the tests here exactly match what we got when we # created c1, c2, c3, and c4 and did the multiple regression: > summary(lm(Score~Group)) Call: lm(formula = Score ~ Group) Residuals: Min 1Q Median 3Q Max -7.00 -1.85 -0.45 2.00 9.60 Coefficients: Estimate Std. 
Error t value Pr(>|t|) (Intercept) 10.0600 0.4399 22.872 < 2e-16 *** Group1 5.2500 0.9835 5.338 2.96e-06 *** Group2 1.2000 0.6955 1.725 0.0913 . Group3 -0.0500 0.6955 -0.072 0.9430 Group4 0.4850 0.2199 2.205 0.0326 * --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 Residual standard error: 3.11 on 45 degrees of freedom Multiple R-squared: 0.4468, Adjusted R-squared: 0.3976 F-statistic: 9.085 on 4 and 45 DF, p-value: 1.815e-05 # A student asked for an example of contrasts that aren't # orthogonal. Here's our original first contrast, comparing # the two lowest groups with the two highest (but using # 1's and -1's instead of 1/2 and -1/2; it still asks the # same question): > contrast1 <- c(1,1,-1,-1,0) # Here's another contrast that might interest us (comparing # the Rhyming and Adjective groups): > contrast2 <- c(0,1,-1,0,0) # But the two contrasts are not orthogonal, so we could not # include both of them in a mutually orthogonal set: > cor(contrast1,contrast2) [1] 0.7071068 >