Outline
• Where do demand functions come from?
• Sources of information for demand estimation
• Cross-sectional versus time series data
• Estimating a demand specification using the
ordinary least squares (OLS) method.
• Goodness of fit statistics.
The goal of forecasting
Regression analysis is a
statistical technique that allows
us to quantify the relationship
between a dependent variable
and one or more independent or
“explanatory” variables.
Regression theory
[Figure: Y plotted against X, with X1 and X2 marked on the horizontal axis]
We assume that
expected conditional values
of Y associated with
alternative values of X
fall on a line.
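$E(Y \mid X_i) = \beta_0 + \beta_1 X_i$
[Figure: the line E(Y | X) plotted against X; at X1 the observed value Y1 lies off the line by the disturbance ε1]
$\varepsilon_1 = Y_1 - E(Y \mid X_1)$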
Specifying a single
variable model
Q is the dependent
variable—that is, we think
that variations in Q can be
explained by variations in
P, the “explanatory”
variable.
Estimating the single variable model
$Q_i = \beta_0 + \beta_1 P_i$   [1]
Since the data points are unlikely to fall exactly on a line, [1] must be modified to include a disturbance term ($\varepsilon_i$):
$Q_i = \beta_0 + \beta_1 P_i + \varepsilon_i$   [2]
Estimated Regression Equation
$\hat{y} = b_0 + b_1 x$
The sample statistics $b_0$ and $b_1$ provide estimates of $\beta_0$ and $\beta_1$.
Least Squares Method
Least Squares Criterion
$\min \sum_i (y_i - \hat{y}_i)^2$
where:
$y_i$ = observed value of the dependent variable for the $i$th observation
$\hat{y}_i$ = estimated value of the dependent variable for the $i$th observation
Slope for the Estimated Regression Equation
$b_1 = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2}$
Intercept for the Estimated Regression Equation
$b_0 = \bar{y} - b_1 \bar{x}$
where:
$x_i$ = value of independent variable for $i$th observation
$y_i$ = value of dependent variable for $i$th observation
$\bar{x}$ = mean value for independent variable
$\bar{y}$ = mean value for dependent variable
$n$ = total number of observations
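The slope and intercept formulas translate directly into a few lines of code. The following is a minimal sketch, not a definitive implementation: the function name `ols_fit` and the toy data are illustrative assumptions, chosen so that the points lie exactly on y = 1 + 2x.

```python
from statistics import mean


def ols_fit(x, y):
    """Return (b0, b1) for the least squares line y_hat = b0 + b1 * x."""
    x_bar, y_bar = mean(x), mean(y)
    # Numerator and denominator of the slope formula
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Hypothetical toy data lying exactly on y = 1 + 2x
x_demo = [1.0, 2.0, 3.0, 4.0]
y_demo = [3.0, 5.0, 7.0, 9.0]
b0_demo, b1_demo = ols_fit(x_demo, y_demo)
print(b0_demo, b1_demo)
```

Because the toy points have no scatter, the fitted line should recover the intercept 1 and slope 2 used to generate them.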
Line of best fit
The line of best fit is the one that minimizes the sum of the squared vertical distances of the sample points from the line.
The 4 steps of demand
estimation using regression
1. Specification
2. Estimation
3. Evaluation
4. Forecasting
Table 4-2: Ticket Prices and Ticket Sales along an Air Route

Year and Quarter   Average Number of Coach Seats   Average Fare ($)
97-1               64.8                            250
97-2               33.6                            265
97-3               37.8                            265
97-4               83.3                            240
98-1               111.7                           230
98-2               137.5                           225
98-3               109.6                           225
98-4               96.8                            220
99-1               59.5                            230
99-2               83.2                            235
99-3               90.5                            245
99-4               105.5                           240
00-1               75.7                            250
00-2               91.6                            240
00-3               112.7                           240
00-4               102.2                           235
Mean               87.3                            239.7
Std. Dev.          27.9                            13.1
Simple linear regression
begins by plotting Q-P
values on a scatter
diagram to determine if
there exists an
approximate linear
relationship:
Scatter plot diagram
[Figure: scatter plot of Fare (vertical axis, 210 to 290) against Passengers (horizontal axis, 20 to 160)]
Scatter plot diagram with possible line of best fit
[Figure: Fare in $ (vertical axis) against Number of Seats Sold per Flight (horizontal axis), with a candidate line of best fit]
Note that we use X to denote the explanatory variable and Y to denote the dependent variable.
So in our example Sales (Q) is the “Y” variable
and Fares (P) is the “X” variable.
Q=Y
P=X
Computing the OLS
estimators
Coefficients (a)

Model          Unstandardized B   Std. Error   Standardized Beta   t        Sig.
1 (Constant)   478.690            88.036                           5.437    .000
  FARE         -1.633             .367         -.766               -4.453   .001

a. Dependent Variable: PASS
Reading the SPSS Output
From this table we see that our estimate of $\beta_0$ is 478.7 and our estimate of $\beta_1$ is $-1.63$.
$\hat{Q}_i = 478.7 - 1.63 P_i$
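The reported coefficients can be checked by applying the least squares formulas to the Table 4-2 data. This is a sketch under the assumption that the SPSS variables PASS and FARE are the two data columns of that table. The variable names are illustrative.

```python
# Table 4-2: average coach seats sold per flight (Q) and average fare (P)
passengers = [64.8, 33.6, 37.8, 83.3, 111.7, 137.5, 109.6, 96.8,
              59.5, 83.2, 90.5, 105.5, 75.7, 91.6, 112.7, 102.2]
fare = [250, 265, 265, 240, 230, 225, 225, 220,
        230, 235, 245, 240, 250, 240, 240, 235]

n = len(fare)
p_bar = sum(fare) / n
q_bar = sum(passengers) / n

# Slope: sum of cross-deviations over sum of squared deviations of P
sxy = sum((p - p_bar) * (q - q_bar) for p, q in zip(fare, passengers))
sxx = sum((p - p_bar) ** 2 for p in fare)
b1 = sxy / sxx
b0 = q_bar - b1 * p_bar

print(f"b0 = {b0:.3f}, b1 = {b1:.3f}")
```

The printed values should agree with the (Constant) and FARE entries in the B column of the SPSS output, up to rounding.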
Step 3: Evaluation
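$s_{\hat{\beta}_1} = \sqrt{\frac{\sum e_i^2}{(n - k) \sum x_i^2}}$
where $e_i$ is the residual for the $i$th observation, $x_i = X_i - \bar{X}$, and k is the number of estimated coefficients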
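As a rough check on the evaluation step, the standard error of the slope and its t-statistic can be computed from the Table 4-2 data. The sketch below assumes k = 2 (intercept and slope) and takes x_i to be the deviation of each fare from the mean fare. Variable names are illustrative.

```python
import math

# Table 4-2: average coach seats sold per flight (Q) and average fare (P)
passengers = [64.8, 33.6, 37.8, 83.3, 111.7, 137.5, 109.6, 96.8,
              59.5, 83.2, 90.5, 105.5, 75.7, 91.6, 112.7, 102.2]
fare = [250, 265, 265, 240, 230, 225, 225, 220,
        230, 235, 245, 240, 250, 240, 240, 235]

n = len(fare)
k = 2  # estimated coefficients: intercept and slope (assumption stated above)
p_bar = sum(fare) / n
q_bar = sum(passengers) / n

sxx = sum((p - p_bar) ** 2 for p in fare)
sxy = sum((p - p_bar) * (q - q_bar) for p, q in zip(fare, passengers))
b1 = sxy / sxx
b0 = q_bar - b1 * p_bar

# Residuals e_i = Q_i - Q_hat_i and their sum of squares
residuals = [q - (b0 + b1 * p) for p, q in zip(fare, passengers)]
sum_e2 = sum(e ** 2 for e in residuals)

se_b1 = math.sqrt(sum_e2 / ((n - k) * sxx))
t_b1 = b1 / se_b1  # t-statistic for the null hypothesis that the slope is zero
print(f"se(b1) = {se_b1:.3f}, t = {t_b1:.3f}")
```

The results should line up with the Std. Error and t entries for FARE in the SPSS coefficients table.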
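The total sum of squares (TSS) measures the total variation in the dependent variable:
$TSS = \sum_{i=1}^{n} (Y_i - \bar{Y})^2 = \sum_{i=1}^{n} y_i^2$   Note: $y_i = Y_i - \bar{Y}$
The regression sum of squares (RSS) measures the variation explained by the regression:
$RSS = \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2 = \sum_{i=1}^{n} \hat{y}_i^2$
What remains is the unexplained variation in the dependent variable, or the error sum of squares (ESS):
$ESS = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 = \sum_{i=1}^{n} e_i^2$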
$R^2$ is defined as:
$R^2 = \frac{RSS}{TSS} = 1 - \frac{ESS}{TSS} = \frac{\sum_{i=1}^{n} \hat{y}_i^2}{\sum_{i=1}^{n} y_i^2} = 1 - \frac{\sum_{i=1}^{n} e_i^2}{\sum_{i=1}^{n} y_i^2}$
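The decomposition TSS = RSS + ESS and the two equivalent expressions for R² can be verified numerically on the Table 4-2 data. This is a sketch with illustrative variable names, using the same convention as the text (RSS for the regression sum of squares, ESS for the error sum of squares).

```python
# Table 4-2: average coach seats sold per flight (Q) and average fare (P)
passengers = [64.8, 33.6, 37.8, 83.3, 111.7, 137.5, 109.6, 96.8,
              59.5, 83.2, 90.5, 105.5, 75.7, 91.6, 112.7, 102.2]
fare = [250, 265, 265, 240, 230, 225, 225, 220,
        230, 235, 245, 240, 250, 240, 240, 235]

n = len(fare)
p_bar = sum(fare) / n
q_bar = sum(passengers) / n
b1 = (sum((p - p_bar) * (q - q_bar) for p, q in zip(fare, passengers))
      / sum((p - p_bar) ** 2 for p in fare))
b0 = q_bar - b1 * p_bar
fitted = [b0 + b1 * p for p in fare]

tss = sum((q - q_bar) ** 2 for q in passengers)               # total variation
rss = sum((f - q_bar) ** 2 for f in fitted)                   # explained variation
ess = sum((q - f) ** 2 for q, f in zip(passengers, fitted))   # unexplained variation

r2_explained = rss / tss
r2_residual = 1 - ess / tss
print(f"TSS = {tss:.3f}, RSS = {rss:.3f}, ESS = {ess:.3f}")
print(f"R^2 = {r2_explained:.3f}")
```

Both forms of R² should coincide, since RSS and ESS sum to TSS when the model includes an intercept.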
ANOVA (b)

Model          Sum of Squares   df   Mean Square   F        Sig.
1 Regression   6863.624         1    6863.624      19.826   .001 (a)
  Residual     4846.816         14   346.201
  Total        11710.440        15

a. Predictors: (Constant), FARE
b. Dependent Variable: PASS
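The Mean Square and F columns follow from the sums of squares and degrees of freedom in the ANOVA table. A small sketch, using the reported figures as inputs (variable names are illustrative):

```python
# Sums of squares and degrees of freedom as reported in the ANOVA table
ss_regression = 6863.624
ss_residual = 4846.816
df_regression = 1
df_residual = 14

# Mean square = sum of squares divided by its degrees of freedom
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

# F is the ratio of explained to unexplained variation per degree of freedom
f_stat = ms_regression / ms_residual
print(f"MS residual = {ms_residual:.3f}, F = {f_stat:.3f}")
```

With a single explanatory variable, F equals the square of the t-statistic on the slope, which is why both carry the same significance level in the output.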
Model Summary

Model   R          R Square   Adjusted R Square   Std. Error of the Estimate
1       .766 (a)   .586       .557                18.6065

a. Predictors: (Constant), FARE
$s = \sqrt{\frac{\sum_{i=1}^{n} e_i^2}{n - k}}$
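The Model Summary figures can be reproduced from the ANOVA sums of squares. The sketch below assumes n = 16 observations and k = 2 estimated coefficients. The adjusted R² formula used here, 1 - (ESS / (n - k)) / (TSS / (n - 1)), is the standard definition rather than something stated in the slides.

```python
import math

# Sums of squares as reported in the ANOVA table
ess = 4846.816    # residual (error) sum of squares
tss = 11710.440   # total sum of squares
n = 16            # observations in Table 4-2
k = 2             # estimated coefficients: intercept and slope

# Standard error of the estimate
s = math.sqrt(ess / (n - k))

# R squared and its degrees-of-freedom adjusted version (standard definition)
r2 = 1 - ess / tss
adj_r2 = 1 - (ess / (n - k)) / (tss / (n - 1))
print(f"s = {s:.4f}, R^2 = {r2:.3f}, adjusted R^2 = {adj_r2:.3f}")
```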
$\hat{Q}_i = 478.7 - 1.63 P_i$
[Figure: actual and fitted Passengers (vertical axis) by Year/Quarter (horizontal axis, 97.1 to 00.3)]
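Step 4, forecasting, amounts to plugging a fare into the estimated equation. The sketch below uses the rounded SPSS coefficients. The function names and the fare of 230 are hypothetical choices for illustration, and the range check simply flags fares outside the 220 to 265 interval observed in Table 4-2, where the fitted line has no data behind it.

```python
# Rounded coefficients as reported in the SPSS output
B0 = 478.690
B1 = -1.633

# Lowest and highest fares observed in the Table 4-2 sample
SAMPLE_MIN_FARE = 220
SAMPLE_MAX_FARE = 265


def forecast_passengers(fare):
    """Point forecast of average coach seats sold per flight at a given fare."""
    return B0 + B1 * fare


def within_sample_range(fare):
    """True if the fare lies inside the range used to estimate the model."""
    return SAMPLE_MIN_FARE <= fare <= SAMPLE_MAX_FARE


hypothetical_fare = 230  # illustrative value, not taken from the source
q_hat = forecast_passengers(hypothetical_fare)
print(f"Forecast at fare {hypothetical_fare}: {q_hat:.1f} seats per flight")
print(f"Inside sample range: {within_sample_range(hypothetical_fare)}")
```

A forecast for a fare well outside the sample range would be an extrapolation and deserves correspondingly less trust.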
Can we make a
good forecast?
• Does our model exhibit structural stability, i.e., will the causal relationship between Q and P expressed in our forecasting equation hold up over time? After all, the estimated coefficients are average values for a specific time interval (1997-2000, the sample period in Table 4-2). While the past may be a serviceable guide to the future in the case of purely physical phenomena, the same principle does not necessarily hold in the realm of social phenomena (to which economics belongs).
Single Variable Regression Using Excel
$HP = b_0 + b_1 Y$
Here HP denotes home prices and Y denotes income.
[Figure: Scatter Diagram: Income and Home Prices, with Home Prices on the vertical axis (80 to 200) and Income on the horizontal axis (50 to 110)]
Excel Output

Regression Statistics
Multiple R          0.906983447
R Square            0.822618973
Adjusted R Square   0.811532659
Standard Error      11.22878416
Observations        18

ANOVA
             df   SS
Regression   1    9355.715502
Residual     16   2017.369498
Total        17   11373.085

             Coefficients   Standard Error   t Stat