POLS 7014
Objectives
$\hat{\beta}_2 = \dfrac{\sum X_i Y_i}{\sum X_i^2}$
Our estimators for the standard error of the regression, the standard error of $\hat{\beta}_2$,
and raw $r^2$ also change.
Notice the lack of centering in the estimator. We are now finding the
best fit line that also goes through the origin.
Many of the great mathematical properties we derived from the
model with the intercept are now gone. Under regression through the
origin, the sum of the residuals need not be zero.
This model is only used in special applications, and you have to be
100% sure that it is appropriate before using it.
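A minimal sketch of the contrast described above, using made-up data (the numbers and function names are illustrative assumptions, not course material). It fits one line through the origin and one with an intercept, then compares the sums of the residuals.

```python
# Illustrative data (invented for this sketch).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.1, 3.9, 5.2, 5.8, 7.1]

def slope_through_origin(x, y):
    """Slope of the best-fit line forced through (0, 0): sum(x*y) / sum(x^2)."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi ** 2 for xi in x)

def ols_with_intercept(x, y):
    """Intercept and slope from the usual centered OLS formulas."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

b_origin = slope_through_origin(x, y)
b1, b2 = ols_with_intercept(x, y)

resid_origin = [yi - b_origin * xi for xi, yi in zip(x, y)]
resid_ols = [yi - (b1 + b2 * xi) for xi, yi in zip(x, y)]

print("sum of residuals, through origin:", sum(resid_origin))
print("sum of residuals, with intercept:", sum(resid_ols))
```

With these data the residuals from the intercept model sum to zero up to rounding error, while the residuals from the through-the-origin fit do not.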
Jamie Monogan (UGA)
Rescaling Variables
In general, it is probably best to rescale your variables into sensible
units before you even estimate a model.
Before estimating a model with income as a variable, you should think
about what unit you want to use. $1? $1,000?
If you rescale the variable beforehand, then you can sensibly interpret
the results afterward. Just remember, the true $\beta_2$ tells us how many
units the outcome changes on average for a one-unit change in the
input variable. You have to know what a unit is, though.
For example, suppose Xi is years of education and Yi is dollars of
income. How would you interpret the slope coefficient for this
estimated model?
$Y_i = 20000 + 1000 X_i + u_i$
So get the scaling right beforehand and know your data in order to
interpret a model well.
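With $Y_i^* = w_1 Y_i$ and $X_i^* = w_2 X_i$, the rescaled estimates are:
$\hat{\beta}_2^* = \dfrac{w_1}{w_2}\,\hat{\beta}_2$
$\hat{\beta}_1^* = w_1 \hat{\beta}_1$
And formulas for the error variance and $r^2$ can also be derived.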
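The scaling rule can be checked numerically. The sketch below assumes the outcome is multiplied by a factor `w1` and the input by a factor `w2` (here, dollars to thousands of dollars and years to months). The data, the helper `ols`, and the chosen factors are invented for illustration.

```python
def ols(x, y):
    """Return (intercept, slope) for a bivariate OLS fit."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Hypothetical data: years of education and income in dollars.
educ_years = [10, 12, 12, 14, 16, 16, 18, 20]
income_dollars = [28000, 33000, 31000, 36000, 35000, 39000, 37000, 42000]

w1 = 1 / 1000   # assumed outcome scale factor: dollars -> thousands of dollars
w2 = 12         # assumed input scale factor: years -> months

b1, b2 = ols(educ_years, income_dollars)

income_thousands = [w1 * v for v in income_dollars]
educ_months = [w2 * v for v in educ_years]
b1_star, b2_star = ols(educ_months, income_thousands)

print("original:", b1, b2)
print("rescaled:", b1_star, b2_star)
print("predicted by rule:", w1 * b1, (w1 / w2) * b2)
```

The rescaled slope is read as thousands of dollars per additional month of education, which is why the unit has to be settled before interpreting the number.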
Standardizing Variables
A common rescaling we might like to do is standardizing our variables.
The formula is simple:
$Y_i^* = \dfrac{Y_i - \bar{Y}}{S_Y}$
$X_i^* = \dfrac{X_i - \bar{X}}{S_X}$
Each rescaled variable has a 0 mean and a standard deviation of 1.
Consequently the standardized regression model simplifies:
$$\begin{aligned} Y_i^* &= \beta_1^* + \beta_2^* X_i^* + u_i^* \\ &= \beta_2^* X_i^* + u_i^* \end{aligned}$$
In terminology that is awful but path dependent, we call the
coefficients from a standardized regression beta coefficients or beta
weights.
I recommend just referring to these coefficients as standardized
coefficients.
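As a rough illustration of standardization, the sketch below rescales two made-up variables to mean 0 and standard deviation 1 and refits the line. The data and helper names are assumptions for the example. In the bivariate case the standardized slope equals the Pearson correlation, which the code checks.

```python
import math
import statistics

def standardize(values):
    """Subtract the mean and divide by the sample standard deviation."""
    m = statistics.mean(values)
    s = statistics.stdev(values)
    return [(v - m) / s for v in values]

def ols(x, y):
    """Return (intercept, slope) for a bivariate OLS fit."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Hypothetical data.
x = [2.0, 4.0, 5.0, 7.0, 9.0, 12.0]
y = [10.0, 14.0, 13.0, 19.0, 22.0, 25.0]

x_star, y_star = standardize(x), standardize(y)
b1_star, b2_star = ols(x_star, y_star)

# Pearson correlation computed from the raw data for comparison.
x_bar, y_bar = statistics.mean(x), statistics.mean(y)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)
r = sxy / math.sqrt(sxx * syy)

print("standardized intercept:", b1_star)
print("standardized slope:", b2_star)
print("correlation r:", r)
```

The standardized intercept comes out as zero up to rounding error, which is why it drops out of the standardized model.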
$b^{\log_b x} = x$
$\log_b(xy) = \log_b(x) + \log_b(y)$
$\log_b\left(\frac{x}{y}\right) = \log_b(x) - \log_b(y)$
$\log_b(y^x) = x \log_b y$
$\log_b\left(\frac{1}{x}\right) = -\log_b(x)$, and $b^{-x} = \frac{1}{b^x}$
$\log_b(1) = 0$, and $b^0 = 1$
$\log_b(x) = \dfrac{\log_a(x)}{\log_a(b)}$
Let $\log_b(x) = y$. Then $b^y = x$.
$\log_a(b^y) = \log_a(x)$
$y \log_a(b) = \log_a(x)$
$y = \dfrac{\log_a(x)}{\log_a(b)}$
$\log_b(x) = \dfrac{\log_a(x)}{\log_a(b)}$
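The identities above can be spot-checked numerically. This is only a sketch with arbitrarily chosen bases and arguments (`b`, `a`, `x`, and `y` are illustrative values), using the two-argument form of `math.log`.

```python
import math

b, a = 2.0, 10.0      # two arbitrary bases
x, y = 8.0, 32.0      # two arbitrary positive arguments

checks = {
    "b^(log_b x) = x": math.isclose(b ** math.log(x, b), x),
    "log of a product": math.isclose(math.log(x * y, b),
                                     math.log(x, b) + math.log(y, b)),
    "log of a quotient": math.isclose(math.log(x / y, b),
                                      math.log(x, b) - math.log(y, b)),
    "log of a power": math.isclose(math.log(y ** x, b), x * math.log(y, b)),
    "log of a reciprocal": math.isclose(math.log(1 / x, b), -math.log(x, b)),
    "log of 1": math.isclose(math.log(1.0, b), 0.0, abs_tol=1e-12),
    "change of base": math.isclose(math.log(x, b),
                                   math.log(x, a) / math.log(b, a)),
}

for name, ok in checks.items():
    print(f"{name}: {ok}")
```

A numerical check at one point is not a proof, but it is a quick way to catch a misremembered rule.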
Logarithms
Base 10
Natural Log
Base e
[Figure: graphs of $e^x$ and $\ln(x)$]
Examples:
$\log(\sqrt{10}) = \frac{1}{2}$
log(1) = 0
log(10) = 1
log(100) = 2
ln(1) = 0
ln(e) = 1
$= e^{.910 + .065(X_i + 1) + u_i}$
$= e^{.910 + .065 X_i + u_i}\, e^{.065}$
Since $e^{.065} \approx 1.067$, we can say that $Y_i$ gets 1.067 times bigger for a
one-unit increase in $X_i$ on average.
For clearer interpretation, we can say: For each percentage point
more liberal a state's population becomes, the ratio of welcoming to
hostile laws increases by 6.7% on average.
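The arithmetic behind this interpretation can be reproduced in a few lines. The coefficients .910 and .065 come from the example above. The function name, the evaluation point `x0`, and setting the disturbance to zero are assumptions made for the sketch.

```python
import math

b1, b2 = 0.910, 0.065   # coefficients from the example above

def predicted_ratio(x, u=0.0):
    """Predicted outcome on the original scale, with the disturbance set to u."""
    return math.exp(b1 + b2 * x + u)

factor = math.exp(b2)                 # multiplicative change per one-unit increase
percent_change = 100 * (factor - 1)   # the same change expressed in percent

x0 = 30.0  # arbitrary starting value of X, chosen only for illustration
ratio_of_predictions = predicted_ratio(x0 + 1) / predicted_ratio(x0)

print(round(factor, 3))          # 1.067
print(round(percent_change, 1))  # 6.7
print(math.isclose(ratio_of_predictions, factor))
```

The ratio of predictions does not depend on the starting value `x0`, which is what makes the percentage interpretation possible.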
Reciprocal Models
$D_i = \dfrac{\hat{u}_i^2}{\hat{\sigma}^2 (k+1)} \cdot \dfrac{h_i}{(1-h_i)^2}.$
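One way to make an influence measure of this kind concrete is to compute Cook's distance by hand for a bivariate regression. The sketch below assumes $k = 1$ predictor plus an intercept, so $k + 1 = 2$ coefficients, and uses the closed-form hat values for simple regression. The data and the function name `cooks_distance` are invented for illustration.

```python
def cooks_distance(x, y):
    """Return (distances, hat_values) for a bivariate OLS fit with an intercept."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    n_coef = 2  # assumption: k + 1 with k = 1 predictor plus an intercept
    sigma2 = sum(e ** 2 for e in resid) / (n - n_coef)  # error variance estimate

    # Hat values (leverage) for simple regression.
    hat = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]

    dist = [(e ** 2 / (sigma2 * n_coef)) * (h / (1 - h) ** 2)
            for e, h in zip(resid, hat)]
    return dist, hat

# Hypothetical data: the last point sits far from the others on x
# and off the trend of the first six points.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 20.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 15.0]

dist, hat = cooks_distance(x, y)
for xi, h, d in zip(x, hat, dist):
    print(f"x={xi:5.1f}  leverage={h:.3f}  cooks_d={d:.3f}")
```

In this example the last observation has by far the largest leverage and distance even though its residual is modest, because it pulls the fitted line toward itself.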
Solutions
One option is to remove problematic observations and re-estimate the
model. Be very cautious if you take this approach, though.
Draper & Smith 1998: Outliers may convey information that other
data cannot. Outliers may warrant careful investigation. Only when
traced to recording error should outliers be rejected automatically.
Can you identify why an observation is an outlier, a high-leverage point,
or an influential point? Such information can usually guide your decision.
View the scatterplot with crime on the vertical axis and poverty on
the horizontal axis. What stands out? What is the best functional
form the model could take?
Do one of two things: (A) Estimate a model that is not linear in the
variables. (B) Estimate a linear-in-variables model, reporting both
unstandardized and standardized coefficients.
Diagnose your data for outliers, leverage, and influential data points.
Do you think it is fair to remove any observations? Why or why not?
If so, re-estimate your model without the influential observation(s).
Present the results of your model in a journal-worthy table.
Graph the functional form of your model with the real data on a
scatterplot.