2014-15 Term 2
Assignment #1
Due: February 3rd, 2014 (Tuesday) at 5:30pm
This assignment covers material from Chapter 1 and Sections 2.1-2.3 of the lecture notes.
You need to show your calculations in detail in order to obtain full marks.
Problem 1 [35 points]: Suppose the following regression model is fitted to a data set with
observations $\{(x_i, y_i),\ i = 1, 2, \ldots, n\}$:
$$y = \beta x^3 + e, \quad e \sim N(0, \sigma^2)$$
(a) Based on the least squares method and the fact that $\mathrm{RSS} \sim \sigma^2 \chi^2_{n-1}$ (df = n-1 since df = n from
the data and df = 1 from $\hat\beta$), compute the least squares estimates $\hat\beta$ and $\hat\sigma^2$.
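(c) Are the points $(\bar{x}, \bar{y})$ and
$$\left( \left( \overline{x^6} \right)^{1/3},\ \overline{x^3 y} \right) = \left( \left( \sum_{i=1}^{n} x_i^6 / n \right)^{1/3},\ \sum_{i=1}^{n} x_i^3 y_i / n \right)$$
on the fitted regression line?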
(d) Derive the maximum likelihood estimates (MLE) $\tilde\beta$ and $\tilde\sigma^2$.
(e) Suppose (x1, x2, x3, x4, x5) = (1, 2, 3, 4, 5) and (y1, y2, y3, y4, y5) = (1, 3, 11, 26, 50).
Compute the values of the least squares estimates $\hat\beta$ and $\hat\sigma^2$. Is the sum of the residuals
equal to zero?
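For checking hand calculations in part (e), the following is a minimal Python sketch rather than a model answer. It assumes the model is $y = \beta x^3 + e$ with no intercept, so that minimizing $\sum_i (y_i - \beta x_i^3)^2$ gives $\hat\beta = \sum_i x_i^3 y_i / \sum_i x_i^6$, and it takes $\hat\sigma^2 = \mathrm{RSS}/(n-1)$, as suggested by the degrees of freedom quoted in part (a). The variable names are illustrative.

```python
# Data from Problem 1(e).
x = [1, 2, 3, 4, 5]
y = [1, 3, 11, 26, 50]
n = len(x)

# Assumed model: y = beta * x^3 + e (no intercept), so the regressor is z = x^3.
z = [xi ** 3 for xi in x]

# Least squares estimate for a no-intercept model: sum(z * y) / sum(z^2).
beta_hat = sum(zi * yi for zi, yi in zip(z, y)) / sum(zi ** 2 for zi in z)

# Residuals and residual sum of squares.
residuals = [yi - beta_hat * zi for zi, yi in zip(z, y)]
rss = sum(r ** 2 for r in residuals)

# Assumption: sigma^2 is estimated by RSS / (n - 1), matching df = n - 1 in part (a).
sigma2_hat = rss / (n - 1)

print("beta_hat         =", round(beta_hat, 5))
print("sigma2_hat       =", round(sigma2_hat, 5))
print("sum of residuals =", round(sum(residuals), 5))
```

In a model without an intercept, the single normal equation only forces $\sum_i x_i^3 e_i = 0$, which is worth keeping in mind when interpreting the last printed value.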
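Problem 2 [10 points]: Consider the residuals $\{e_i\}$ from the simple linear regression:
$$e_i = y_i - \hat{y}_i = y_i - \hat\beta_0 - \hat\beta_1 x_i, \quad i = 1, 2, \ldots, n.$$
Show that
$$\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(e_i - \bar{e}) = 0.$$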
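The identity in Problem 2 can be checked numerically before attempting a proof. The Python sketch below fits a simple linear regression with an intercept to a small made-up data set (the x and y values are hypothetical, chosen only for illustration) and evaluates the sample covariance between the $x_i$ and the residuals.

```python
# Hypothetical data, for illustration only.
x = [1.0, 2.0, 4.0, 7.0, 9.0]
y = [2.1, 2.9, 5.2, 7.8, 10.5]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# OLS estimates for simple linear regression with an intercept.
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
beta1_hat = sxy / sxx
beta0_hat = y_bar - beta1_hat * x_bar

# Residuals e_i = y_i - beta0_hat - beta1_hat * x_i and their mean.
e = [yi - beta0_hat - beta1_hat * xi for xi, yi in zip(x, y)]
e_bar = sum(e) / n

# Sample covariance between x and the residuals.
cov_xe = sum((xi - x_bar) * (ei - e_bar) for xi, ei in zip(x, e)) / (n - 1)

print("mean residual:", e_bar)
print("cov(x, e):    ", cov_xe)
```

Both printed values should be zero up to floating-point rounding. A numerical check is not a proof; the general argument rests on the two normal equations of the fit.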
Problem 3 (R problem) [20 points]: The R library alr3 contains the segreg data, which
records the electricity consumption (in KWH) and mean temperature (in °F) for one building
on the University of Minnesota's Twin Cities campus for 39 months in 1988-1992.
(http://www.stat.cmu.edu/~roeder/stat707/=data/=data/data/Rlibraries/alr3/html/segreg.html)
Suppose that we are interested in how the electricity consumption (y=segreg$C) is affected
by the monthly mean temperature (x=segreg$Temp), primarily driven by the use of air
conditioning.
(a) Based on R code similar to that on page 23 of Ch2, obtain the OLS estimates $\hat\beta_0$, $\hat\beta_1$ and $\hat\sigma^2$.
(b) Is there any outlier in the data set, if an outlier is defined as an observation $(x_i, y_i)$ with
$|e_i| > 2\hat\sigma$?
$$\sum_{i=1}^{n} x_i^2 = 8718.558, \quad \sum_{i=1}^{n} y_i^2 = 113.9961, \quad \sum_{i=1}^{n} x_i y_i = 929.8138.$$
(c) Compute $\widehat{\mathrm{Var}}(\hat\beta_0 \mid X)$ and $\widehat{\mathrm{Var}}(\hat\beta_1 \mid X)$.
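Part (c) can be worked from summary sums alone. The Python sketch below assumes the usual simple linear regression formulas with an intercept: $\hat\sigma^2 = \mathrm{RSS}/(n-2)$, $\widehat{\mathrm{Var}}(\hat\beta_1 \mid X) = \hat\sigma^2 / S_{xx}$ and $\widehat{\mathrm{Var}}(\hat\beta_0 \mid X) = \hat\sigma^2 \left( 1/n + \bar{x}^2 / S_{xx} \right)$. The function name `ols_from_sums` and the example inputs are hypothetical; they are not the values from this problem, whose sample size and means do not appear in the text above.

```python
def ols_from_sums(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy):
    """OLS estimates for y = b0 + b1 * x + e, computed from summary sums."""
    x_bar = sum_x / n
    y_bar = sum_y / n

    # Centered sums of squares and cross-products.
    sxx = sum_x2 - n * x_bar ** 2
    syy = sum_y2 - n * y_bar ** 2
    sxy = sum_xy - n * x_bar * y_bar

    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar

    # Residual sum of squares and the (assumed) n - 2 divisor.
    rss = syy - b1 * sxy
    sigma2 = rss / (n - 2)

    return {
        "b0": b0,
        "b1": b1,
        "sigma2": sigma2,
        "var_b0": sigma2 * (1.0 / n + x_bar ** 2 / sxx),
        "var_b1": sigma2 / sxx,
    }


# Hypothetical example: sums for x = (1, 2, 3, 4) and y = (2, 4, 5, 8).
fit = ols_from_sums(n=4, sum_x=10, sum_y=19, sum_x2=30, sum_y2=109, sum_xy=57)
for name, value in fit.items():
    print(name, "=", round(value, 6))
```

Working from sums keeps the arithmetic transparent, at the cost of some numerical cancellation when the means are large relative to the spread.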
(d) Suppose that (x, y)=(48.462, 2.000) is one of the observations in the data set. Based on
the definition of outlier as in Problem 3(b), do you think the point is an outlier? Explain.
Suppose that the point (48.462, 2.000) is removed from the data set, and denote the new OLS
estimates by $\hat\beta_0^*$, $\hat\beta_1^*$ and $\hat\sigma^{*2}$.
(e) Show that $\hat\beta_1^* = 0.12883$.
(f) What are the OLS estimates $\hat\beta_0^*$ and $\hat\sigma^{*2}$?
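Removing one observation changes each summary sum only by that observation's contribution, so a refit after deletion does not require the raw data. A minimal Python sketch of that update for the slope follows. The function name `slope_after_removal` and the numbers are hypothetical and are not the assignment's data.

```python
def slope_after_removal(n, sum_x, sum_y, sum_x2, sum_xy, x0, y0):
    """OLS slope (model with intercept) after deleting the observation (x0, y0)."""
    # Subtract the deleted point's contribution from each sum.
    m = n - 1
    sx = sum_x - x0
    sy = sum_y - y0
    sx2 = sum_x2 - x0 ** 2
    sxy = sum_xy - x0 * y0

    # Standard slope formula S_xy / S_xx on the updated sums.
    return (sxy - sx * sy / m) / (sx2 - sx ** 2 / m)


# Hypothetical data: five points, then delete the last one.
x = [1.0, 2.0, 3.0, 4.0, 10.0]
y = [2.0, 4.0, 5.0, 8.0, 3.0]

b1_star = slope_after_removal(
    n=len(x),
    sum_x=sum(x),
    sum_y=sum(y),
    sum_x2=sum(xi ** 2 for xi in x),
    sum_xy=sum(xi * yi for xi, yi in zip(x, y)),
    x0=x[-1],
    y0=y[-1],
)
print("slope after removal:", b1_star)
```

The intercept and the residual variance after deletion follow in the same way from the updated sums, including the updated $\sum y_i^2$.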