Overview
Proficiency with statistical software packages is indispensable today for serious research in the social
sciences. SPSS is one of the most widely used and powerful statistical software packages. It covers a
broad range of statistical procedures that allow one to summarize data (e.g., compute means and
standard deviations), determine whether there are significant differences between groups (e.g., t tests,
analysis of variance), examine relationships among variables (e.g., correlation, multiple regression),
and graph results (e.g., bar charts, line graphs). These tutorials show screenshots of SPSS 15, the
newest version at the time the tutorials were written. If you are using a different version of SPSS, your
screens may not look exactly like those presented in the tutorials, but the basic functionality should be
the same or very similar. These pages are based on a series of SPSS tutorials originally written by Dr.
Gil Einstein and Dr. Ken Abernethy of Furman University.
The lessons presented here give you an introduction to SPSS. They are not designed to teach you
statistics, but are intended for individuals who already have some background in statistics and want to
learn SPSS, or to be used as a supplement to a statistics course and text. When you are comfortable
with SPSS, we encourage you to explore the SPSS menus and options, because the package is very
powerful and there are usually multiple ways to accomplish your statistical goals. Unless you are
already familiar with SPSS, you should start with Lesson 1, which presents a brief overview of the
different types of windows and files available with SPSS. Lesson 2 describes how to enter and label
your data, transform data, select cases, and sort cases. Lesson 3 shows you how to generate various
descriptive statistics and some simple graphical representations of your data.
Once you understand how to enter and manipulate data and to generate and report statistical results,
you can then go on to any of the other lessons. Each of these lessons includes a research problem with
a hypothetical set of data, and step-by-step directions for how to perform the specified analyses. An
additional example for further practice is also included for many of the lessons. Lessons 4-9 describe
specific statistical procedures used to compare the means of two or more groups (t tests and analysis
of variance). Lesson 10 covers correlation and Lesson 11 covers linear regression for the bivariate
case (one independent variable and one dependent variable). Lesson 12 covers multiple regression
(one dependent variable and two or more independent variables). In Lesson 13 you will learn how to
conduct and interpret chi-square analyses for categorical data arranged in one-way tables
(goodness-of-fit tests) and two-way tables (tests of independence). Lesson 14 introduces analysis of
covariance (ANCOVA), a technique combining regression and analysis of variance.
An Important Point to Remember
Please note that these tutorials cover only a few of the most basic statistical procedures available with
SPSS. After you have worked through these tutorials, you will have familiarity with SPSS. With this
familiarity and an understanding of the statistical test that you wish to use, we are confident that you
will be able to figure out other procedures on your own. You may want to bookmark this site, as new
material is being added on a regular basis. Your feedback is also very welcome and appreciated. You
may provide feedback, make suggestions for additional tutorials, or report errors by clicking on the
Provide Feedback link in the navigation menu, or by e-mailing the site's webmaster.
SPSS TUTORIALS
Page 1
reference. Unlike earlier versions of SPSS, version 15, the version illustrated in these tutorials,
automatically presents in the SPSS Viewer the syntax version of the commands you give it when you
point and click in the Data Editor or the SPSS Viewer (examine Figure 1-3 for an example).
Now that you know the kinds of windows and files involved in an SPSS session, you are ready to
learn how to enter, structure, and manipulate data. Those are the subjects of Lesson 2.
Overview
Data can be entered directly into the SPSS Data Editor or imported from a variety of file types. It is
always important to check data entries carefully and ensure that the data are accurate. In this lesson
you will learn how to build an SPSS data file from scratch, how to calculate a new variable, how to
select and sort cases, and how to split a file into separate layers.
Creating a Data File
A common first step in working with SPSS is to create or open a data file. We will assume in this
lesson that you will type data directly into the SPSS Data Editor to create a new data file. You should
realize that you can also read data from many other programs, or copy and paste data from worksheets
and tables to create new data files.
Launch SPSS. You will be given various options, as we discussed in Lesson 1. Select Type in Data
or Cancel. You should now see a screen similar to the following, which is a blank dataset in the Data
View of the SPSS Data Editor (see Figure 2-1):
should also appear in a separate column. So if you were looking at the scores for five quizzes for each
of 20 students, the data for each student would occupy a single row (line) in the data table, and the
score for each quiz would occupy a separate column.
Although SPSS automatically numbers the rows of the data table, it is a very good habit to provide a
separate participant (or subject) number column so that records can be easily sorted, filtered, or
selected. It is also good practice to set up the data structure before entering the data. For this purpose,
we will switch to the Variable View of the Data Editor by clicking on the Variable View tab at the
bottom of the Data Editor window. See Figure 2-2.
Example Data
Let us establish the data structure for our example of five quizzes and 20 students. We will assume
that we also know the age and the sex of each student. Although we could enter "F" for female and
"M" for male, most statistical procedures are easier to perform if a number is used to code such
categorical variables. Let us assign the number "1" to females and the number "0" to males. The
hypothetical data are shown below:
Student  Sex  Age  Quiz1  Quiz2  Quiz3  Quiz4  Quiz5
1             18   83     87     81     80     69
2             19   76     89     61     85     75
3             17   85     86     65     64     81
4             20   92     73     76     88     64
5             23   82     75     96     87     78
6             18   88     73     76     91     81
7             21   89     71     61     70     75
8             20   89     70     87     76     88
9             23   92     85     95     89     62
10            21   86     83     77     64     63
11            23   90     71     91     86     87
12            18   84     71     67     62     70
13            21   83     80     89     60     60
14            17   79     77     82     63     74
15            19   89     80     64     94     78
16            20   76     85     65     92     82
17            19   92     76     76     74     91
18            22   75     90     78     70     76
19            22   87     87     63     73     64
20            20   75     74     63     91     87
No decimals appear in our raw data, so we will set the number of decimals to zero. After we enter the
desired information, the completed data structure might appear as follows:
Notice that we provided value labels for Sex, so we won't confuse our 1's and 0's later. To do this,
click on Values in the Sex variable row and enter the appropriate labels for males and females (see
Figure 2-4).
When you click OK, the new variable appears in both the data and variable views (see below). As
discussed earlier, you can change the number of decimals (numerical variables default to two
decimals) and add a descriptive label for the new variable.
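SPSS's Transform, Compute Variable dialog performs this arithmetic for every case in the file. As a rough illustration (a Python sketch, not SPSS syntax), computing an average-quiz variable for a single row amounts to:

```python
# Illustration only (Python, not SPSS): what Compute Variable does for one case.
# The scores below are student 1's five quiz scores from the example data.
from statistics import mean

quizzes = [83, 87, 81, 80, 69]
quizavg = round(mean(quizzes), 2)  # two decimals, matching the SPSS default
print(quizavg)
```

SPSS repeats the same calculation down the column, producing one value per row.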
Move Sex and Age to the "Sort by" window (see Figure 2-15) and then click OK.
Figure 2-18 Split file results in separate analyses for each group
Overview
In this lesson, you will learn how to produce various descriptive statistics, simple frequency
distribution tables, and frequency histograms. You will also learn how to explore your data and create
boxplots.
Example
Let us return to our example of 20 students and five quizzes. We would like to calculate the average
score (mean) and standard deviation for each quiz. We will also look at the mean scores for men and
women on each quiz. Open the SPSS data file you saved in Lesson 2, or click here for lesson_3.sav.
Remember that we previously calculated the average quiz score for each person and included that as a
new variable in our data file.
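As a cross-check on what Descriptives will report, the mean and sample standard deviation (SPSS uses the n - 1 denominator) can be computed by hand. Here is a sketch in Python, using the Age column from the example data:

```python
# Sketch (Python, not SPSS): mean and sample standard deviation for Age.
from statistics import mean, stdev  # stdev uses the n - 1 denominator, like SPSS

ages = [18, 19, 17, 20, 23, 18, 21, 20, 23, 21,
        23, 18, 21, 17, 19, 20, 19, 22, 22, 20]

print(round(mean(ages), 2))   # mean age
print(round(stdev(ages), 2))  # sample standard deviation
```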
To calculate the means and standard deviations for age, all quizzes, and the average quiz score, select
Analyze, then Descriptive Statistics, and then Descriptives as shown in the following screenshot
(see Figure 3-1).
Figure 3-2 Move the desired variables into the variables window.
In the resulting dialog box, make sure you check (at a minimum) the boxes in front of Mean and Std.
deviation:
histogram for a visual check of how the data are distributed. Let us examine the distribution of ages
of our 20 hypothetical students. Select Analyze, Descriptive Statistics, Frequencies (see Figure 3-8).
In the Frequencies dialog, move Age to the variables window, and then click on Charts. Select
Histograms and check the box in front of With normal curve (see Figure 3-9).
Student  Involve  Test1
10       69.5     100.0
11       73.8     83.7
12       66.7     94.0
13       54.8     78.2
14       69.3     76.9
15       73.5     82.0
16       79.4
Alone  Others
12     10
12     10
11     10
12     11
12
The following figure shows the variable view of the structure of the dataset:
Figure 6-3 One-Way ANOVA dialog with Tukey HSD test selected
The ANOVA summary table and the post hoc test results appear in the SPSS Viewer (see Figure 6-4).
Note that the overall (omnibus) F ratio is significant, indicating that the means differ by a larger
amount than would be expected by chance alone if the null hypothesis were true. The post hoc test
results indicate that the mean for Method 1 is significantly lower than the means for Methods 2 and 3,
but that the means for Methods 2 and 3 are not significantly different.
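The omnibus F ratio itself is just a ratio of mean squares. A minimal sketch with made-up scores (not this lesson's data) shows the computation SPSS performs behind the scenes:

```python
# Sketch (hypothetical data): the one-way ANOVA omnibus F ratio,
# F = MS_between / MS_within.
from statistics import mean

groups = {
    "Method 1": [3, 4, 5],
    "Method 2": [7, 8, 9],
    "Method 3": [8, 9, 10],
}

all_scores = [x for g in groups.values() for x in g]
grand_mean = mean(all_scores)
k = len(groups)            # number of groups
N = len(all_scores)        # total number of observations

ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
ss_within = sum((x - mean(g)) ** 2 for g in groups.values() for x in g)

F = (ss_between / (k - 1)) / (ss_within / (N - k))
print(round(F, 2))
```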
Figure 6-4 ANOVA summary table and post hoc test results
As an aid to understanding the post hoc test results, SPSS also provides a table of homogeneous
subsets (see Figure 6-5). Note that it is not strictly necessary that the sample sizes be equal in the
one-way ANOVA; when they are unequal, the Tukey HSD procedure uses the harmonic mean of the
sample sizes for post hoc comparisons.
Figure 6-6 ANOVA procedure and effect size index available from Means procedure
The ANOVA summary table from the Means procedure appears in Figure 6-7 below. Eta squared is
directly interpretable as an effect size index: 58 percent of the variance in recall can be explained by
the method used for remembering the word list.
Figure 6-7 ANOVA table and effect size from Means procedure
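Eta squared can also be computed directly as SS-between divided by SS-total, the proportion of total variance attributable to group membership. A small sketch with hypothetical scores (not this lesson's data):

```python
# Sketch (hypothetical data): eta squared = SS_between / SS_total.
from statistics import mean

groups = [[5, 6, 7], [6, 7, 8]]
all_scores = [x for g in groups for x in g]
grand_mean = mean(all_scores)

ss_total = sum((x - grand_mean) ** 2 for x in all_scores)
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)

eta_squared = ss_between / ss_total  # proportion of variance explained
print(round(eta_squared, 3))
```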
Before  After  SixMo
13      15     17
12      15     14
12      17     16
19      20     20
10      15     14
10      13     15
12      11     14
15      13     10
11      16
Coding Considerations
Data coding considerations in the repeated-measures ANOVA are similar to those in the paired-samples t test. Each participant or subject takes up a single row in the data file, and each observation
requires a separate column. The properly coded SPSS data file with the data entered correctly should
appear as follows (see Figure 7-1). You may also retrieve a copy of the data file if you like.
After naming the factor and specifying the number of levels, you must add the factor and then define
it. Click on Add and then click on Define. See Figure 7-4.
Now click on Options and specify descriptive statistics, effect size, and contrasts (see Figure 7-7).
You must move Time to the Display Means window as well as specify a confidence level adjustment
for the main effects contrasts. A Bonferroni correction will adjust the alpha level in the post hoc
comparisons, while the default LSD (Fisher's least significant difference test) will not adjust the alpha
level. We will select the more conservative Bonferroni correction.
Figure 7-7 Specifying descriptive statistics, effect size, and mean contrasts
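The Bonferroni adjustment itself is simple arithmetic: the nominal alpha level is divided by the number of pairwise comparisons (three here, for the Before, After, and SixMo levels of the Time factor):

```python
# The Bonferroni correction: divide alpha by the number of comparisons.
alpha = 0.05
n_comparisons = 3  # Before-After, Before-SixMo, After-SixMo

adjusted_alpha = alpha / n_comparisons
print(round(adjusted_alpha, 4))
```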
Click on Continue, then OK to run the repeated-measures ANOVA. The SPSS output provides
several tests. When there are multiple dependent variables, the multivariate test is used to determine
whether there is an overall within-subjects effect for the combined dependent variables. As there is
only one within-subjects factor, we can ignore this test in the present case. Sphericity is the assumption
that the variances of the differences between the pairs of measures are equal. The nonsignificant test of
sphericity indicates that this assumption is not violated in the present case, and adjustments to the
degrees of freedom (and thus to the p level) are not required. The test of interest is the Test of
Within-Subjects Effects. We can assume sphericity and report the F ratio as 8.149 with 2 and 18 degrees
of freedom and the p level as .003 (see Figure 7-8). Partial eta-squared has an interpretation similar to
that of eta-squared in the one-way ANOVA and is directly interpretable as an effect-size index: about
48 percent of the within-subjects variation in algebra test performance can be explained by knowledge
of when the test was administered.
Figure 8-3 SPSS data file data view for two-way ANOVA (partial data)
For ease of interpretation, the variables can be labelled and the values of each specified in the variable
view (see Figure 8-4).
Design layout: four Younger and four Older participants, each measured under the Closed Eyes,
Simple Distraction, and Complex Distraction conditions.
The columns correspond to the repeated measures, which are the distraction conditions. As always, it
is helpful to include a column for participant (or case) number.
The data appropriately entered in SPSS should look something like the following (see Figure 9-1).
You may optionally download a copy of the data file.
Move the Closed, Simple, and Complex variables to levels 1, 2, and 3, respectively, and then move
Age to the Between-Subjects Factor(s) window (see Figure 9-5). You can optionally specify one or
more covariates for analysis of covariance.
Figure 9-5 The complete design specification for the mixed factorial ANOVA
To display a plot of the cell means, click on Plots, and then move Age to the Horizontal axis, and
distraction to Separate Lines. Next click on Add to specify the plot (see Figure 9-6) and then click
Continue.
8). Specifically, you will want to determine whether there is a main effect for age, a main effect for
distraction condition, and a possible interaction of the two. The tables of interest from the SPSS
Viewer are shown in Figures 9-8 and 9-9.
Overview
In correlational research, there is no experimental manipulation. Rather, we measure variables in their
natural state. Instead of independent and dependent variables, it is useful to think of predictors and
criteria. In bivariate (two-variable) correlation, we are assessing the degree of linear relationship
between a predictor, X, and a criterion, Y. In multiple regression, we are assessing the degree of
relationship between a linear combination of two or more predictors, X1, X2, ...Xk, and a criterion, Y.
We will address correlation in the bivariate case in Lesson 10, linear regression in the bivariate case in
Lesson 11, and multiple regression and correlation in Lesson 12.
The Pearson product moment correlation coefficient summarizes and quantifies the relationship
between two variables in a single number. This number can range from -1 representing a perfect
negative or inverse relationship to 0 representing no relationship or complete independence to +1
representing a perfect positive or direct relationship. When we calculate a correlation coefficient from
sample data, we will need to determine whether the obtained correlation is significantly different from
zero. We will also want to produce a scatterplot or scatter diagram to examine the nature of the
relationship. Sometimes the correlation is low not because of a lack of relationship, but because of a
lack of linear relationship. In such cases, examining the scatterplot will assist in determining whether
a relationship may be nonlinear.
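The computation behind the coefficient is straightforward. The following Python sketch (with made-up values, not this lesson's data) applies the definition directly: r is the sum of cross-products of deviations divided by the product of the deviation sums of squares' square roots.

```python
# Sketch (hypothetical data): Pearson product-moment correlation from its definition.
from math import sqrt

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mx, my = sum(x) / n, sum(y) / n
cross = sum((a - mx) * (b - my) for a, b in zip(x, y))   # sum of cross-products
sx = sqrt(sum((a - mx) ** 2 for a in x))
sy = sqrt(sum((b - my) ** 2 for b in y))

r = cross / (sx * sy)   # always falls in the interval [-1, +1]
print(round(r, 3))
```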
Example Data
Suppose that you have collected questionnaire responses to five questions concerning dormitory
conditions from 10 college freshmen. (Normally you would like to have a larger sample, but the small
sample in this case is useful for illustration.) The questionnaire assesses the students' level of
satisfaction with noise, furniture, study area, safety, and privacy. Assume that you have also assessed
the students' family income level, and you would like to test the hypothesis that satisfaction with the
college living environment is related to wealth (family income).
The questionnaire contains five questions about satisfaction with various aspects of the dormitory:
"noise," "furniture," "study area," "safety," and "privacy." These are answered on a 5-point
Likert-type scale (very dissatisfied to very satisfied), coded 1 to 5. The data sheet for this study
is shown below.
Student  Income  Noise  Furniture  Study Area  Safety  Privacy
1        39
2        59
3        75
4        45
5        95
6        115
7        67
8        48
9        140
10       55
Under the Options menu, let us select means and standard deviations and then click Continue. The
output contains a table of descriptive statistics (see Figure 10-4) and a table of correlations and related
significance tests (see Figure 10-5).
Constructing a Scatterplot
For purposes of illustration, let us produce a scatterplot of the relationship between satisfaction with
noise level in the dormitory and family income. We see from the correlation matrix that this is a
significant negative correlation. As family income increases, satisfaction with the dormitory noise
level decreases. To build the scatterplot, select Graphs, Interactive, Scatterplot (see Figure 10-6).
Please note that there are several different ways to construct the scatterplot in SPSS, and that we are
illustrating only one here.
predicting how many people will be present on a given day, based on the outside temperature. The
data you collect are the following:
Temp  Attendance
50    87
77    60
67    73
53    86
75    59
70    65
83    65
85    62
80    58
64    89
The correlation and scatterplot indicate a strong, though by no means perfect, relationship between the
two variables. Let us now turn our attention to regression. We will "regress" the attendance (Y) on the
temperature (X). In linear regression, we are seeking the equation of a straight line that best fits the
observations. The usefulness of such a line may not be immediately apparent, but if we can model the
relationship by a straight line, we can use that line to predict a value of Y for any value of X, even
those that have not yet been observed. For example, looking at the scatterplot in Figure 11-3, what
attendance would you predict for a temperature of 60 degrees? The regression line can answer that
question. This line will have an intercept term and a slope coefficient and will be of the general form Ŷ = a + bX.
The intercept and slope (regression) coefficient are derived in such a way that the sums of the squared
deviations of the actual data points from the line are minimized. This is called "ordinary least squares"
estimation or OLS. Note that the predicted value of Y (read "Y-hat") is a linear combination of two
constants, the intercept term and the slope term, and the value of X, so that the only thing that varies is
the value of X. Therefore, the correlation between the predicted Ys and the observed Ys will be the
same as the correlation between the observed Ys and the observed Xs. If we subtract the predicted
value of Y from the observed value of Y, the difference is called a "residual." A residual represents the
part of the Y variable that cannot be explained by the X variable. Visually, the distance between the
observed data points and the line of best fit represents the residual.
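To make the least-squares idea concrete, the following Python sketch (not SPSS) estimates the intercept and slope from the temperature (X) and attendance (Y) values of this lesson, pairing the data-sheet values in order (the pairing is our reading of the data sheet). It then answers the question posed above: the predicted attendance at 60 degrees.

```python
# Sketch (Python, not SPSS): ordinary least squares for Y-hat = a + b*X,
# using the temperature (X) and attendance (Y) pairs from the data sheet.
x = [50, 77, 67, 53, 75, 70, 83, 85, 80, 64]   # temperature
y = [87, 60, 73, 86, 59, 65, 65, 62, 58, 89]   # attendance

n = len(x)
mx, my = sum(x) / n, sum(y) / n

# Slope: sum of cross-products of deviations over sum of squared X deviations.
b = sum((v - mx) * (w - my) for v, w in zip(x, y)) / sum((v - mx) ** 2 for v in x)
a_int = my - b * mx   # intercept

# Residuals: observed minus predicted; they sum to zero for an OLS fit.
residuals = [w - (a_int + b * v) for v, w in zip(x, y)]

print(round(b, 3), round(a_int, 2))   # slope and intercept
print(round(a_int + b * 60, 1))       # predicted attendance at 60 degrees
```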
SPSS's Regression procedure allows us to determine the equation of the line of best fit, to calculate
predicted values of Y, and to calculate and interpret residuals. Optionally, you can save the predicted
values of Y and the residuals as either standard scores or raw-score equivalents.
Running the Regression Procedure
Open the data file in SPSS. Select Analyze, Regression, and then Linear (see Figure 11-4).
Figure 11-10 Normal p-p plot of observed and expected cumulative probabilities of residuals
When there are significant departures from normality, homoscedasticity, and linearity, data
transformations or the introduction of polynomial terms such as quadratic or cubic values of the
original independent or dependent variables can often be of help (Edwards, 1976).
References
Edwards, A. L. (1976). An introduction to linear regression and correlation. San Francisco: Freeman.
Hair, J. F., Black, W. C., Babin, B. J., Anderson, R. E., and Tatham, R. L. (2006). Multivariate data
analysis (6th ed.). Upper Saddle River, NJ: Pearson Prentice Hall.
Instead of using a to represent the Y intercept, it is common practice in multiple regression to call the
intercept term b0. The significance of Multiple R, and thus of the entire regression, must be tested. The
significance of the individual regression coefficients must also be examined to verify that each
independent variable adds significantly to the prediction.
As in simple linear regression, residual plots are helpful in diagnosing the degree to which the
linearity, normality, and homoscedasticity assumptions have been met. Various data transformations
can be attempted to accommodate situations of curvilinearity, non-normality, and heteroscedasticity.
In multiple regression we must also consider the potential impact of multicollinearity, which is the
degree of linear relationship among the predictors. When there is a high degree of collinearity in the
predictors, the regression equation will tend to be distorted, and may lead to inappropriate conclusions
regarding which predictors are statistically significant (Lind, Marchal, and Wathen, 2006). For this
reason, we will ask for collinearity diagnostics when we run our regression. As a rule of thumb, if the
variance inflation factor (VIF) for a given predictor is very high or if the absolute value of the
correlation between two predictors is greater than .70, one or more of the predictors should be
dropped from the analysis, and the regression equation should be recomputed.
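In the two-predictor case, the variance inflation factor reduces to a simple function of the correlation between the predictors, since R-squared for predicting one predictor from the other is just r squared. A sketch with an assumed correlation of .75 between the predictors:

```python
# Sketch (assumed correlation, not this lesson's data): with exactly two
# predictors, VIF = 1 / (1 - r^2), where r is the predictor intercorrelation.
r_predictors = 0.75   # hypothetical correlation between X1 and X2

vif = 1 / (1 - r_predictors ** 2)
print(round(vif, 2))
```

Larger intercorrelations inflate the VIF rapidly; as r approaches 1, the VIF grows without bound.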
Multiple regression is in actuality a general family of techniques, and the mathematical and statistical
underpinnings of multiple regression make it an extremely powerful and flexible tool. By using group
membership or treatment level qualitative coding variables as predictors, one can easily use multiple
regression in place of t tests and analyses of variance. In this tutorial we will concentrate on the
simplest kind of multiple regression, a forced or simultaneous regression in which all predictor
variables are entered into the regression equation at one time. Other approaches include stepwise
regression in which variables are entered according to their predictive ability and hierarchical
regression in which variables are entered according to theory or hypothesis. We will examine
hierarchical regression more closely in Lesson 14 on analysis of covariance.
Example Data
The following data (see Figure 12-1) represent statistics course grades, GRE Quantitative scores, and
cumulative GPAs for 32 graduate students at a large public university in the southern U.S. (source:
data collected by the webmaster). You may click here to retrieve a copy of the entire dataset.
Figure 12-1 Statistics course grades, GREQ, and GPA (partial data)
Preparing for the Regression Analysis
We will determine whether quantitative ability (GREQ) and cumulative GPA can be used to predict
performance in the statistics course. A very useful first step is to calculate the zero-order correlations
among the predictors and the criterion. We will use the Correlate procedure for that purpose. Select
Analyze, Correlate, Bivariate (see Figure 12-2).
Click OK to run the regression analysis. The results are excerpted in Figure 12-8.
The normal p-p plot indicates some departure from normality and may suggest a curvilinear
relationship between the predictors and the criterion (see Figure 12-10).
The chi-square statistic is computed as the sum over cells of (O - E)^2 / E, where O represents the
observed frequency in a given cell of the table and E represents the corresponding expected frequency
under the null hypothesis.
We will illustrate both the goodness-of-fit test and the test of independence using the same dataset.
You will find the goodness-of-fit test for equal or unequal expected frequencies as an option under
Nonparametric Tests in the Analyze menu. For the chi-square test of independence, you will use the
Crosstabs procedure under the Descriptive Statistics menu in SPSS. The cross-tabulation procedure
can make use of numeric or text entries, while the Nonparametric Tests procedure requires numeric
entries. For that reason, you will need to recode any text entries into numerical values for
goodness-of-fit tests.
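A minimal sketch of the statistic itself, with made-up counts (not this lesson's data), for a goodness-of-fit test against equal expected frequencies:

```python
# Sketch (hypothetical counts): chi-square = sum of (O - E)^2 / E over the cells.
observed = [18, 12]    # e.g., counts in two categories
expected = [15, 15]    # equal frequencies under the null hypothesis

chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(round(chi_square, 2))
```

SPSS computes the same quantity and compares it against the chi-square distribution with the appropriate degrees of freedom.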
Example Data
Assume that you are interested in the effects of peer mentoring on student academic success in a
competitive private liberal arts college. A group of 30 students is randomly selected during their
freshman orientation. These students are assigned to a team of seniors who have been trained as tutors
in various academic subjects, listening skills, and team-building skills. The 30 selected students meet
in small group sessions with their peer tutors once each week during their entire freshman year, are
encouraged to work with their small group for study sessions, and are encouraged to schedule private
sessions with their peer mentors whenever they desire. You identify an additional 30 students at
orientation as a control group. The control group members receive no formal peer mentoring. You
determine that there are no significant differences between the high school grades and SAT scores of
the two groups. At the end of four years, you compare the two groups on academic retention and
academic performance. You code mentoring as 1 = present and 0 = absent to identify the two groups.
Because GPAs differ by academic major, you generate a binary code for grades. If the student's
cumulative GPA is at the median or higher for his or her academic major, you assign a 1. Students
whose grades are below the median for their major receive a zero. If the student is no longer enrolled
(i.e., has transferred, dropped out, or flunked out), you code a zero for retention. If he or she is still
enrolled, but has not yet graduated after four years, you code a 1. If he or she has graduated, you code
a 2.
You collect the following (hypothetical) data:
Properly entered in SPSS, the data should look like the following (see Figure 13-1). For your
convenience, you may also download a copy of the dataset.
As a second precursor to the ANCOVA, let us determine the degree of correlation between
quantitative ability and exam scores. As correlation is the subject of Lesson 10, the details are omitted
here, and only the results are shown in Figure 14-2.
Click Continue. If you like, you can click on Plots to add profile plots for the estimated marginal
means of the posttest scores of the two groups after adjusting for pretest scores. Click on OK to run
the analysis. The results are shown in Figure 14-6. The results indicate that after controlling for initial
quantitative ability, the differences in posttest scores are statistically significantly different between
the two groups, F(1,27)=16.64, p < .001, partial eta-squared = .381.
Click on Statistics, and check the box in front of R squared change (see Figure 14-11).