Terminology:
Y: the variable being predicted (also called the response variable or dependent variable).
X: the predictor variable (also called the explanatory variable or independent variable).
Chapter 9: Summary notes: Multiple Regression
Simple linear model (Chapter 8):
ŷ = a + bx
Here we predict y using just one x variable.
Multiple linear model:
ŷ = a + b1x1 + b2x2 + b3x3 + ... + bkxk
With this model we predict y with several x variables, each
weighted differently (much as we weigh up several factors when
buying a used car, deciding whom to marry, or choosing a
university). We prefer to have fewer predictors in our model.
We use R² or R²adj, as well as the p-value for each
of the predictor coefficients, to help us decide
which combination of predictors will make up the
best model.
[There are also polynomial models (section 9.4, p227), such as quadratic and
cubic; models that use log, exponential or reciprocal functions; models
with interactions between factors; and more.]
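A polynomial model is still fitted by ordinary least squares because it is linear in its coefficients. A minimal sketch on made-up data (not a textbook example), using numpy.polyfit to fit a quadratic:

```python
import numpy as np

# A quadratic model y = a + b1*x + b2*x^2 is still "linear" in its
# coefficients a, b1, b2, so least squares can fit it directly.
x = np.linspace(0, 10, 50)
y = 2 + 3 * x + 0.5 * x**2   # exact quadratic, no noise, for illustration

b2, b1, a = np.polyfit(x, y, deg=2)   # polyfit returns highest power first
print(a, b1, b2)   # recovers approximately 2, 3 and 0.5
```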
Output to be considered now:

              Coefficients                               p-value
Intercept     a, the y-intercept: just an "anchor       (never consider this
              point" for our line                        one for throwing out)
X Variable 1  b1, coefficient of x1 ("slope", "gradient")   0.18
X Variable 2  b2, coefficient of x2 ("slope", "gradient")   0.006
X Variable 3  b3, coefficient of x3 ("slope", "gradient")   0.03
...

A predictor coefficient is significant if p < 0.05.
The first x-variable you would consider dropping is the one with the
highest p-value (here X Variable 1, with p = 0.18).
What would be the regression equation for the
following output?
Which variable would you consider dropping first?
SUMMARY OUTPUT

Regression Statistics
Multiple R           0.974382534
R Square             0.949421322
Adjusted R Square    0.93993782
Standard Error       0.567418114
Observations         20

            Coefficients    P-value
Intercept   -0.120567911    0.8830618
Customers    0.538637889    0.120238213
Load         0.061258883    0.271669653
Distance     0.022670421    0.007370326
What happens with r and R² in a multiple linear model?
When there is a single predictor (x) in a linear model:
R2 represents how much of the variability in y (the
predicted or response variable) is explained by the
regression on x (the predictor or explanatory
variable).
When there are multiple predictors (x1 , x2 , x3 ....) in a
linear model:
R2 represents how much of the variability in y (the
predicted or response variable) is explained by the
regression on x1, x2, x3 ... (the predictor or
explanatory variables).
R² for the multiple linear model is often much greater
than (R² for y,x1) + (R² for y,x2) + (R² for y,x3) + ...,
since there is often a synergy between the predictor
variables that makes the whole better than the sum of its
parts.
Example (p220 to 225):

Predictors of salary (x)         Correlation coefficient for y,x    R²
Age only                         0.6013                             0.3616
Qualification score (QS) only    0.0728                             0.0053
Age and QS                                                          0.5711

Here 0.5711 is well above 0.3616 + 0.0053 = 0.3669: the synergy described above.
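The synergy can be demonstrated on made-up data (a sketch using a hypothetical "suppressor" setup, not the textbook's salary data): x2 carries no information about y on its own, yet adding it to x1 lifts R² far above the sum of the two individual R² values.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
signal = rng.normal(size=n)   # the part of x1 that actually drives y
noise = rng.normal(size=n)    # shared nuisance term
x1 = signal + noise
x2 = noise                    # useless alone, but it lets us clean up x1
y = signal

def r_squared(X, y):
    """R^2 from an ordinary least-squares fit with an intercept column."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))

r2_x1 = r_squared(x1.reshape(-1, 1), y)            # roughly 0.5
r2_x2 = r_squared(x2.reshape(-1, 1), y)            # close to 0
r2_both = r_squared(np.column_stack([x1, x2]), y)  # 1 here, since y = x1 - x2
print(r2_x1, r2_x2, r2_both)
```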
Finding the best model:
Be aware that the best model is not simply the one with the
highest value of R², since adding a variable can never reduce
R². We have to find a good compromise between
explaining as much as possible of the variability (i.e. a high
R²) and not having too complicated a model involving
unnecessary variables. In general we should only
include an additional variable in a model if it leads to a
worthwhile increase in R².
R²adj
Adjusted R² is always less than R². It takes into account
the number of explanatory variables in the model, to
avoid overestimating the effect of adding a new variable
on the explained variability (p234).
It is used in the "backwards elimination" procedure
described on p234 for finding the combination of
predictor variables that results in a reasonable model.
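The usual adjustment formula (the standard one, assumed here to be the one on p234) is R²adj = 1 - (1 - R²)(n - 1)/(n - k - 1), where n is the number of observations and k is the number of predictors. Applied to the SUMMARY OUTPUT above, it reproduces the printed Adjusted R Square:

```python
r2 = 0.949421322   # R Square from the SUMMARY OUTPUT above
n = 20             # Observations
k = 3              # predictors: Customers, Load, Distance

# Penalise R^2 for the number of predictors in the model.
r2_adj = 1 - (1 - r2) * (n - 1) / (n - k - 1)
print(round(r2_adj, 8))  # 0.93993782, matching the output above
```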
Multiple linear model incorporating a “dummy” variable to
represent some attribute of the data.
Here, if there are two attributes (say urban or rural; married
or single) we set up one of the x predictor variables as a
“dummy variable” and assign either zero or one as its value
according to which attribute applies (it does not matter
which gets zero and which gets one).
e.g. ŷ = a + b1x1 + b2x2
(Here x2 is the dummy variable, taking only the values 0 or 1.)
This is only suitable if the relationship can be shown to
consist of two distinct parallel lines – otherwise it is better to
use two separate models. Remember, parallel lines have the
same slope or gradient, and in regression, this gradient is the
coefficient of x1, the non‐dummy variable.
Example for using a dummy variable.
(p238, Chapter 9 Exercises, Q1)
Look at the scatterplot ‐ this dummy variable method is
suitable as there are two distinct parallel lines, one for
males, and one for females.
Salary = 35.1 + 0.3 x months – 14.8 x gender
Since here we have used gender = 0 for males and gender = 1
for females (it does not matter which you choose) we can
write two separate equations:
Salary for males = 35.1 + 0.3 x months
Salary for females = 35.1 + 0.3 x months – 14.8
Interpretation:
35.1: on average, this is the starting salary for males.
0.3: for both males and females, salary increases by 0.3
for every month of experience, on average. Since
the units are thousands of dollars, this is 0.3 x
$1000 = $300.
14.8: on average, female salaries are 14.8 (i.e. $14,800)
lower than for a male with the same length of
service.
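The two equations above can be checked numerically (a small sketch; predict_salary is an illustrative name, not from the notes):

```python
def predict_salary(months, gender):
    """Fitted model from the notes: salary in thousands of dollars,
    with gender = 0 for male and gender = 1 for female."""
    return 35.1 + 0.3 * months - 14.8 * gender

# The two gender lines are parallel (same coefficient of months),
# so the male-female gap is 14.8 (i.e. $14,800) at every length of service.
for months in (0, 12, 60):
    gap = predict_salary(months, 0) - predict_salary(months, 1)
    print(months, gap)   # gap is 14.8 each time (up to float rounding)
```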