Psychology 202a

Advanced Psychological Statistics


Study Guide for Final Exam, Fall 2019

As you prepare for an exam, it is useful to review the syllabus and ask yourself, "What was really important in each section of the class? What did we spend a lot of time on (either in class, or on homework assignments)?" With that in mind, let's review where we've been. After each major section, I have listed a number of questions that you can ask yourself to check your understanding. Please understand that these questions do not represent an exhaustive list of material that might be covered on the exam.

At the time of the midterm exam, we had just reviewed inference in the form of the two-sample independent-groups t-test and were about to begin the subject of simple and multiple linear regression. We introduced correlation as a method for addressing the question of how strong a linear relationship is (stressing the importance of the linearity of the relationship if correlation is to make sense). We discussed regression as a method for describing what the linear relationship is (as opposed to how strong). We investigated the rationale for least squares regression estimates, and defined the concept of the residual (as an estimate of error). We talked about inference in regression, and discussed the assumptions that are essential for inference to be valid. We stressed the idea of regression as a model for the conditional mean.
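
As a concrete refresher, here is a minimal sketch of these ideas in R (the data frame dat and the variables x and y are hypothetical):

    # Simple linear regression: a model for the conditional mean of y given x.
    fit <- lm(y ~ x, data = dat)
    summary(fit)         # least squares intercept and slope, with t-tests
    resid(fit)           # residuals: estimates of the errors
    cor(dat$x, dat$y)    # correlation: how strong the linear relationship is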

Next, we moved on to developing a rudimentary knowledge of regression with multiple predictors (sufficient to have that as a tool for understanding ANOVA). We introduced the added variable plot, partly as a useful device in its own right, but more importantly as a way of understanding what is going on when we estimate a partial relationship in multiple regression. We saw that the total sum of squares for the dependent variable can be broken down into a part associated with the model and a part associated with error (the decomposition of the sum of squares). We introduced inference in multiple regression, focusing primarily on the F statistic (although we did deal briefly with inference about specific regression parameters through t-tests). We saw that the decomposed sums of squares could be transformed into something like variance estimates by dividing by degrees of freedom; the resulting numbers are called "mean squares." We saw that the ratio of the MS model over MS error has an F distribution under the null hypothesis that all slopes in the model are simultaneously zero, and that inference about the model as a whole can thus be performed.
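
A sketch of these ideas in R, again with hypothetical variables. The added variable plot for x1 plots the residuals of y on x2 against the residuals of x1 on x2, and its slope matches the partial slope from the multiple regression:

    # Multiple regression with two predictors.
    fit <- lm(y ~ x1 + x2, data = dat)
    summary(fit)    # overall F test: are all slopes simultaneously zero?
    anova(fit)      # sums of squares, degrees of freedom, mean squares

    # Added variable plot for x1: residualize y and x1 on x2.
    ry  <- resid(lm(y  ~ x2, data = dat))
    rx1 <- resid(lm(x1 ~ x2, data = dat))
    plot(rx1, ry)                  # the added variable plot
    coef(lm(ry ~ rx1))["rx1"]      # equals the x1 slope in fit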

Whenever we introduce a new inference approach, we must understand the assumptions that are needed in order for the method to be valid. We discussed the regression assumptions (linearity, independent errors, homoscedasticity of errors, normality of errors), and discussed methods for assessing them. Residual plots and normal Q-Q plots were introduced as devices to help check assumptions.
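
In R, both diagnostic plots can be produced directly from a fitted model (fit as in the sketches above):

    plot(fitted(fit), resid(fit))    # residuals vs. predicted values:
    abline(h = 0)                    # look for curvature (nonlinearity)
                                     # and fanning (heteroscedasticity)
    qqnorm(resid(fit))               # normal Q-Q plot of the residuals
    qqline(resid(fit))               # points should hug this line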

Some questions that you should consider as you review multiple regression include:

  • Can I explain how to produce an added variable plot?
  • What do I hope to see in an added variable plot?
  • Why might I want to look at an added variable plot in addition to a plot of the raw relationship?
  • Do I understand the relationship between the slope in an added variable plot and the slope in a multiple regression?
  • Do I understand the ANOVA table in multiple regression? (Where do the degrees of freedom come from? What sort of additive relationships exist in the table? Could I complete the table if it had some gaps in it?)
  • How could I do inference using the information in the ANOVA table?
  • What should I look for in a plot of residuals versus predicted values?
  • What are the assumptions needed for inference in regression?
  • Why are they important?
  • How can I assess them?
We then moved on to the subject of ANOVA. We began with a conceptual approach to the logic of ANOVA. We saw that within-cell variation provides a way to estimate population variance in a manner that does not depend on the truth or falsehood of the null hypothesis about equality of the means. We then observed that, by algebraically isolating variance in the central limit theorem's statement about variability of the means, we could show that variation of the sample means could also lead to an estimate of population variance. However, this new estimate is reasonable only if the population means are equal; if they are unequal, the estimate will be too large. Hence, if we consider the ratio of the two estimates, the result (an F statistic) will be large if the null hypothesis is false, and will tend to be near 1.0 if the null hypothesis is true.
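
A sketch of that logic in R, for a hypothetical balanced design with response vector y, grouping factor g, and n observations per group:

    # Within-group variance estimate: does not depend on the null hypothesis.
    ms_within <- mean(tapply(y, g, var))     # pooled across equal-n groups
    # The CLT says var(group means) estimates sigma^2 / n, so
    # n * var(means) estimates sigma^2 -- but only if the means are equal.
    ms_between <- n * var(tapply(y, g, mean))
    ms_between / ms_within                   # the F statistic: near 1 under the null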

We considered practical ways to estimate ANOVA models, focusing on R's lm() function and letting R figure out how to parameterize the problem. Then we considered ANOVA as a special case of the linear model. We developed several coding systems for making R estimate the ANOVA as a special case of regression, and ultimately observed that any system of k-1 predictors that uniquely identifies the k groups will allow us to estimate the ANOVA model. We illustrated this idea with two sensible systems (dummy coding and effects coding), but also noted that we could get the same results with a completely nonsensical system of variables.
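
In R (with the same hypothetical y and factor g in a data frame dat), letting R choose the coding and then imposing effects coding ourselves:

    fit <- lm(y ~ g, data = dat)    # R builds the k-1 dummy variables itself
    anova(fit)                      # the one-way ANOVA table

    # Effects (sum-to-zero) coding gives the identical F test:
    contrasts(dat$g) <- contr.sum(nlevels(dat$g))
    anova(lm(y ~ g, data = dat))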

Motivated by the idea that the overall ANOVA F test is not all that interesting without the additional ability to ask which means differ, we turned our attention to one particular coding method, orthogonal contrast coding. We saw that if we can create a system in which the k-1 predictors are uncorrelated with one another, then the model sum of squares is broken down into completely independent chunks. We discussed this phenomenon as a justification for testing these independent chunks of information without worrying further about the accumulating chance of committing a Type I error. We also saw how to estimate such contrasts in R by attaching a contrast matrix to the factor.
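
A sketch for a hypothetical three-level factor g, with two orthogonal contrasts (group 1 versus the average of groups 2 and 3, and group 2 versus group 3):

    cmat <- cbind(c1 = c(2, -1, -1),
                  c2 = c(0,  1, -1))
    sum(cmat[, "c1"] * cmat[, "c2"])   # 0, so the contrasts are orthogonal
    contrasts(dat$g) <- cmat           # attach the contrast matrix to the factor
    summary(lm(y ~ g, data = dat))     # one t-test per contrast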

We noted that the real world does not always conveniently present itself to us in the form of questions that happen to be neatly orthogonal. We discussed alternatives to orthogonal contrasts, including adjusted non-orthogonal contrasts and post hoc methods.
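
One familiar post hoc method is Tukey's HSD, available in base R from an aov() fit (same hypothetical data):

    fit <- aov(y ~ g, data = dat)
    TukeyHSD(fit)    # all pairwise differences, with familywise-adjusted
                     # confidence intervals and p-values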

We discussed assumptions needed for inference in ANOVA. These include independence within and between groups, homoscedasticity across the populations defined by the groups, and normality within each population. We noted the similarity to regression assumptions and t-test assumptions, and discussed methods for assessing the assumptions.

Some questions that you might ask to check your understanding are:

  • Can I describe and write the overall null hypothesis for the ANOVA?
  • Can I correctly interpret R ANOVA output?
  • Can I describe the conclusion of a hypothesis test in a way that relates the result to the meaning of the particular variables that interest me?
  • Can I develop a coding system that will do the ANOVA without making R parameterize the problem?
  • Can I define a set of orthogonal contrasts, or check to see if an existing set of contrasts is orthogonal?
  • Can I explain why orthogonality is important?
  • Can I interpret R output for Tukey's HSD?
  • Do I understand the structure of the ANOVA table sufficiently well that I can complete one, given sparse information?
  • How would I go about checking ANOVA assumptions?
We moved next to a digression on power analysis. First, we defined Type I and Type II errors; that served as a basis for defining power (as one minus beta, where beta is the probability of a Type II error). We discussed various factors that have an impact on power: magnitude of the true effect, error variability, sample size, alpha level, and tailedness of the test (one- or two-tailed).
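
Base R includes simple power calculators; a sketch (the effect sizes and variances here are invented for illustration):

    # Two-sample t-test: true difference of 0.5 SD, 30 per group, alpha = .05.
    power.t.test(n = 30, delta = 0.5, sd = 1, sig.level = 0.05)

    # One-way ANOVA with four groups of 20:
    power.anova.test(groups = 4, n = 20, between.var = 1, within.var = 9)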

Some questions that you should consider include:

  • Can I explain what changes might increase or reduce power in a particular situation?
  • Can I interpret R or G*Power output that is relevant to power?
  • How might I go about choosing a particular way in which the null hypothesis is false if I want to do a power analysis?
  • What is a noncentrality parameter? How is it relevant to power analysis?
Finally, we considered two-way ANOVA. We noted that adding a second factor adds not only the possibility of a second main effect associated with that factor, but also the possibility that the two factors work together, or "interact." Hence there are three null hypotheses in two-way ANOVA: the two null hypotheses about main effects of factors A and B, and the null hypothesis that says the effect of A is not dependent on the level of B (or vice versa). We argued that usually (but not always) the interaction, if it is present, will be the most interesting thing going on. We discussed graphical methods for depicting the interaction, and illustrated the process of describing interactions verbally by telling different stories about the behavior of one factor for each level of a second factor.
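
A sketch in R, with hypothetical factors A and B:

    fit <- aov(y ~ A * B, data = dat)       # main effects plus interaction
    summary(fit)                            # three F tests, one per null hypothesis
    with(dat, interaction.plot(A, B, y))    # one line per level of B;
                                            # nonparallel lines suggest interaction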

We discussed different ways of understanding two-way ANOVA. We saw that the sums of squares for the main effects are identical to the sums of squares that we would calculate if we simply didn't know about the second factor and calculated a one-way ANOVA. The interaction sum of squares can be understood by creating a variable that identifies the cells of the crossed design and calculating a simple ANOVA sum of squares on that factor. The resulting sum of squares represents total between-cell variation; when we subtract the portion of that variation that has already been attributed to the main effects, the result is the interaction sum of squares. You should be sure that you understand the material in the reading related to the calculation of degrees of freedom for the various effects and for the error term, and that you understand the structure of the two-way ANOVA table.
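
That decomposition can be verified directly in R for a balanced design (a sketch with the same hypothetical factors):

    dat$cell <- interaction(dat$A, dat$B)   # one level per cell of the design
    ss_cell <- anova(lm(y ~ cell, data = dat))["cell", "Sum Sq"]  # between-cell SS
    ss_A    <- anova(lm(y ~ A, data = dat))["A", "Sum Sq"]
    ss_B    <- anova(lm(y ~ B, data = dat))["B", "Sum Sq"]
    ss_cell - ss_A - ss_B                   # the interaction sum of squares
    anova(lm(y ~ A * B, data = dat))        # agrees, when the design is balanced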

A second way to understand two-way ANOVA relates to regression coding systems for accomplishing ANOVA as a linear model. We saw that if we created a coding system for each main effect, and then created a new set of variables consisting of the cross products of those main-effects variables, the new set would capture the idea of the interaction. (This makes sense of the fact that the interaction df is equal to the product of the main-effects dfs.) Since, in this view, the interaction is represented by a group of variables that collectively constitute one idea, we argued that an approach known as the nested F test would be appropriate, and saw that such a test is indeed identical to the F test for the interaction in the standard two-way ANOVA approach. (A nested F test drops a set of predictors from a regression model, observes how the model sum of squares and degrees of freedom change, calculates a mean square that represents the loss of explanatory power associated with dropping the set of predictors, and calculates an F statistic using the error mean square in the original model.)
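
In R, a nested F test can be obtained by handing anova() two nested models (same hypothetical data):

    reduced <- lm(y ~ A + B, data = dat)    # main effects only
    full    <- lm(y ~ A * B, data = dat)    # adds the interaction terms
    anova(reduced, full)    # F for the dropped set, using the full model's
                            # error mean square; matches the interaction F
                            # in the two-way ANOVA table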

We discussed assumptions for inference in two-way ANOVA, and we mentioned that all of the a priori and post hoc tests that were possible in one-way ANOVA are also available for two-way ANOVA.

Some questions about two-way ANOVA that you might want to ponder include:

  • Can I produce and interpret an interaction plot, given a table of means?
  • Can I describe and write the null hypotheses in two-way ANOVA?
  • Can I interpret inferential results about main effects and the interaction in a way that sheds light on the particular variables I am studying?
  • Can I complete a partially vacant two-way ANOVA table?
  • Can I interpret R output relevant to two-way ANOVA?
  • Can I describe how I would check assumptions related to two-way ANOVA?
On the final day of class, we talked about the problem of unbalanced ANOVA, and noted that tests based on the Type III sums of squares provide a solution if we can convince ourselves that the lack of balance in the ANOVA does not itself contain a fundamental truth about how the main effects interact. The Type III sums of squares provide inference about what the ANOVA would have looked like if the design had been balanced; that is relevant only if it makes sense to be interested in what things would have looked like under balance.
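
Base R's anova() produces sequential (Type I) sums of squares; one common route to Type III tests is the Anova() function in the car package (a sketch, assuming car is installed):

    library(car)
    fit <- lm(y ~ A * B, data = dat,
              contrasts = list(A = "contr.sum", B = "contr.sum"))
    Anova(fit, type = "III")    # Type III tests; sum-to-zero contrasts
                                # are needed for these to be meaningful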

We introduced the concept of random-effects factors in ANOVA. We noted that a factor should be considered a random factor if the levels of the factor that we include in our design represent a sample of the larger universe of possible levels that might interest us. If, on the other hand, the levels we include represent exactly the levels that interest us, the factor is a fixed effect.

In one-way ANOVA, deciding that a factor represents a random effect changes the null hypothesis from one about the equality of means to one stating that a variance component, representing variation in the population means, is zero. In two-way ANOVA, there is the added complication that whether a factor is fixed or random changes how we perform inference about the other factor. Specifically, if factor B is a random effect, the F statistic for factor A should use the interaction mean square in the denominator, not the usual error mean square.
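
The corrected test is easy to compute by hand from the standard table (a sketch, with the same hypothetical data):

    tab <- anova(lm(y ~ A * B, data = dat))
    F_A <- tab["A", "Mean Sq"] / tab["A:B", "Mean Sq"]   # MS(A) / MS(A:B)
    pf(F_A, tab["A", "Df"], tab["A:B", "Df"],
       lower.tail = FALSE)                               # p-value for the corrected test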

Some questions that you should consider include:

  • Can I identify situations for which the Type III sums of squares do or do not provide a satisfactory solution to the problem of unbalanced ANOVA?
  • Can I correctly identify a factor that should be considered random?
  • Can I correct the F statistics in an ANOVA table that doesn't recognize the presence of random effects?