
AUTOCORRELATION

Assumption C.5 states that the values of the disturbance term in the observations in the
sample are generated independently of each other.
[Figure: plot of Y against X with an autocorrelated disturbance term]


AUTOCORRELATION
In the graph above, it is clear that this assumption is not valid. Positive values tend to be
followed by positive ones, and negative values by negative ones. Successive values tend
to have the same sign. This is described as positive autocorrelation.


AUTOCORRELATION
In this graph, positive values tend to be followed by negative ones, and negative values by
positive ones. This is an example of negative autocorrelation.
[Figure: plot of Y against X with a negatively autocorrelated disturbance term]



AUTOCORRELATION

First-order autoregressive autocorrelation: AR(1)

$u_t = \rho u_{t-1} + \varepsilon_t$

$Y_t = \beta_1 + \beta_2 X_t + u_t$
A particularly common type of autocorrelation, at least as an approximation, is first-order
autoregressive autocorrelation, usually denoted AR(1) autocorrelation.



AUTOCORRELATION
It is autoregressive, because u_t depends on lagged values of itself, and first-order, because it depends only on its previous value. u_t also depends on ε_t, an injection of fresh randomness at time t, often described as the innovation at time t.
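The recursion is easy to reproduce numerically. Below is a minimal sketch in Python (the function name simulate_ar1, the N(0, 1) innovations, and the seed are illustrative assumptions, not part of the original slides):

```python
import numpy as np

def simulate_ar1(rho, n=50, seed=0):
    """Generate u_t = rho * u_{t-1} + eps_t with standard normal innovations."""
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(n)      # the innovations: fresh randomness each period
    u = np.empty(n)
    u[0] = eps[0]                     # start the recursion from the first innovation
    for t in range(1, n):
        u[t] = rho * u[t - 1] + eps[t]
    return u

u = simulate_ar1(rho=0.7)
```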



AUTOCORRELATION

Fifth-order autoregressive autocorrelation: AR(5)

$u_t = \gamma_1 u_{t-1} + \gamma_2 u_{t-2} + \gamma_3 u_{t-3} + \gamma_4 u_{t-4} + \gamma_5 u_{t-5} + \varepsilon_t$
Here is a more complex example of autoregressive autocorrelation. It is described as fifth-order, and so denoted AR(5), because it depends on lagged values of u_t up to the fifth lag.



AUTOCORRELATION

Third-order moving average autocorrelation: MA(3)

$u_t = \lambda_0 \varepsilon_t + \lambda_1 \varepsilon_{t-1} + \lambda_2 \varepsilon_{t-2} + \lambda_3 \varepsilon_{t-3}$
The other main type of autocorrelation is moving average autocorrelation, where the
disturbance term is a linear combination of the current innovation and a finite number of
previous ones.
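A moving average disturbance can be sketched the same way (Python; the weights λ_0, ..., λ_3 below are arbitrary illustrative values, and three pre-sample innovations are drawn so that the first observation has its full set of lags):

```python
import numpy as np

def simulate_ma3(lams, n=50, seed=0):
    """Generate u_t = lam0*eps_t + lam1*eps_{t-1} + lam2*eps_{t-2} + lam3*eps_{t-3}."""
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(n + 3)  # 3 extra draws serve as pre-sample innovations
    lam0, lam1, lam2, lam3 = lams
    return np.array([lam0 * eps[t + 3] + lam1 * eps[t + 2]
                     + lam2 * eps[t + 1] + lam3 * eps[t] for t in range(n)])

u = simulate_ma3((1.0, 0.6, 0.3, 0.1))
```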



AUTOCORRELATION
This example is described as third-order moving average autocorrelation, denoted MA(3),
because it depends on the three previous innovations as well as the current one.
AUTOCORRELATION
We will now look at examples of the patterns that are generated when the disturbance term is subject to AR(1) autocorrelation. The object is to provide some benchmark images to help you assess plots of residuals in time series regressions.

$u_t = \rho u_{t-1} + \varepsilon_t$

[Figure: simulated series for u_t, vertical scale −3 to 3]
AUTOCORRELATION
We will use 50 independent values of ε_t, drawn from a normal distribution with zero mean, and generate series for u_t using different values of ρ.

$u_t = \rho u_{t-1} + \varepsilon_t$

[Figure: simulated series for u_t, vertical scale −3 to 3]
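A sketch of the experiment (Python; the seed is arbitrary). The point of reusing one common innovation vector for every value of ρ, as the slides do, is that the panels then differ only in the degree of autocorrelation:

```python
import numpy as np

rng = np.random.default_rng(1)
eps = rng.standard_normal(50)         # one common set of 50 innovations

def ar1_from(eps, rho):
    u = np.empty(len(eps))
    u[0] = eps[0]
    for t in range(1, len(eps)):
        u[t] = rho * u[t - 1] + eps[t]
    return u

# the same innovations filtered with progressively stronger autocorrelation
series = {rho: ar1_from(eps, rho) for rho in (0.0, 0.3, 0.6, 0.9)}
```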
AUTOCORRELATION
We have started with ρ equal to 0, so there is no autocorrelation. We will increase ρ progressively in steps of 0.1.

$u_t = \rho u_{t-1} + \varepsilon_t$,  ρ = 0.0

[Figure: simulated series for u_t, vertical scale −3 to 3]
AUTOCORRELATION
$u_t = \rho u_{t-1} + \varepsilon_t$,  ρ = 0.1

[Figure: simulated series for u_t, vertical scale −3 to 3]
AUTOCORRELATION
$u_t = \rho u_{t-1} + \varepsilon_t$,  ρ = 0.2

[Figure: simulated series for u_t, vertical scale −3 to 3]
AUTOCORRELATION
With ρ equal to 0.3, a pattern of positive autocorrelation is beginning to be apparent.

$u_t = \rho u_{t-1} + \varepsilon_t$,  ρ = 0.3

[Figure: simulated series for u_t, vertical scale −3 to 3]
AUTOCORRELATION
$u_t = \rho u_{t-1} + \varepsilon_t$,  ρ = 0.4

[Figure: simulated series for u_t, vertical scale −3 to 3]
AUTOCORRELATION
$u_t = \rho u_{t-1} + \varepsilon_t$,  ρ = 0.5

[Figure: simulated series for u_t, vertical scale −3 to 3]
AUTOCORRELATION
With ρ equal to 0.6, it is obvious that u_t is subject to positive autocorrelation. Positive values tend to be followed by positive ones and negative values by negative ones.

$u_t = \rho u_{t-1} + \varepsilon_t$,  ρ = 0.6

[Figure: simulated series for u_t, vertical scale −3 to 3]
AUTOCORRELATION
$u_t = \rho u_{t-1} + \varepsilon_t$,  ρ = 0.7

[Figure: simulated series for u_t, vertical scale −3 to 3]
AUTOCORRELATION
$u_t = \rho u_{t-1} + \varepsilon_t$,  ρ = 0.8

[Figure: simulated series for u_t, vertical scale −3 to 3]
AUTOCORRELATION
With ρ equal to 0.9, the sequences of values with the same sign have become long and the tendency to return to 0 has become weak.

$u_t = \rho u_{t-1} + \varepsilon_t$,  ρ = 0.9

[Figure: simulated series for u_t, vertical scale −3 to 3]
AUTOCORRELATION
The process is now approaching what is known as a random walk, where ρ is equal to 1 and the process becomes nonstationary. The terms random walk and nonstationary will be defined in the next chapter. For the time being we will assume |ρ| < 1.

$u_t = \rho u_{t-1} + \varepsilon_t$,  ρ = 0.95

[Figure: simulated series for u_t, vertical scale −3 to 3]
AUTOCORRELATION
Next we will look at negative autocorrelation, starting with the same set of 50 independently distributed values of ε_t.

$u_t = \rho u_{t-1} + \varepsilon_t$,  ρ = 0.0

[Figure: simulated series for u_t, vertical scale −3 to 3]
AUTOCORRELATION
We will take larger steps this time.

$u_t = \rho u_{t-1} + \varepsilon_t$,  ρ = −0.3

[Figure: simulated series for u_t, vertical scale −3 to 3]
AUTOCORRELATION
With ρ equal to −0.6, you can see that positive values tend to be followed by negative ones, and vice versa, more frequently than you would expect as a matter of chance.

$u_t = \rho u_{t-1} + \varepsilon_t$,  ρ = −0.6

[Figure: simulated series for u_t, vertical scale −3 to 3]
AUTOCORRELATION
Now the pattern of negative autocorrelation is very obvious.

$u_t = \rho u_{t-1} + \varepsilon_t$,  ρ = −0.9

[Figure: simulated series for u_t, vertical scale −3 to 3]

============================================================
Dependent Variable: LGHOUS
Method: Least Squares
Sample: 1959 2003
Included observations: 45
============================================================
Variable Coefficient Std. Error t-Statistic Prob.
============================================================
C 0.005625 0.167903 0.033501 0.9734
LGDPI 1.031918 0.006649 155.1976 0.0000
LGPRHOUS -0.483421 0.041780 -11.57056 0.0000
============================================================
R-squared 0.998583 Mean dependent var 6.359334
Adjusted R-squared 0.998515 S.D. dependent var 0.437527
S.E. of regression 0.016859 Akaike info criterion -5.263574
Sum squared resid 0.011937 Schwarz criterion -5.143130
Log likelihood 121.4304 F-statistic 14797.05
Durbin-Watson stat 0.633113 Prob(F-statistic) 0.000000
============================================================
AUTOCORRELATION
Next, we will look at a plot of the residuals of the logarithmic regression of expenditure on
housing services on income and relative price.


AUTOCORRELATION
This is the plot of the residuals of course, not the disturbance term. But if the disturbance term is subject to autocorrelation, then the residuals will be subject to a similar pattern of autocorrelation.

[Figure: residuals of the housing regression, 1959–2003, vertical scale −0.04 to 0.04]


AUTOCORRELATION
You can see that there is strong evidence of positive autocorrelation. Comparing the graph with the randomly generated patterns, one would say that ρ is about 0.7 or 0.8.

[Figure: residuals of the housing regression, 1959–2003, vertical scale −0.04 to 0.04]
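The Durbin-Watson statistic in the regression output above points the same way: for large samples d ≈ 2(1 − ρ̂), so d = 0.633 implies ρ̂ ≈ 0.68. A sketch of the computation from a residual series (Python; e stands for the residual vector, which is not reproduced here):

```python
import numpy as np

def durbin_watson(e):
    """d = sum_t (e_t - e_{t-1})^2 / sum_t e_t^2; d is near 2 when rho is near 0."""
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

# Implied first-order autocorrelation from d ~ 2(1 - rho_hat):
# rho_hat = 1 - d / 2, so d = 0.633 gives rho_hat of roughly 0.68.
```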
STATIONARY PROCESSES

In this slideshow we will define what is meant by a stationary time series process. We will begin with a very simple example, the AR(1) process $X_t = \beta_2 X_{t-1} + \varepsilon_t$, where |β_2| < 1 and ε_t is iid (independently and identically distributed) with zero mean and finite variance.

$X_t = \beta_2 X_{t-1} + \varepsilon_t$,  β_2 = 0.8,  ε_t ~ N(0, 1)

[Figure: one realization of the process, t = 0 to 50, vertical scale −10 to 10]
STATIONARY PROCESSES

As noted in Chapter 11, we make a distinction between the potential values {X_1, ..., X_T}, before the sample is generated, and a realization of actual values {x_1, ..., x_T}. Statisticians write the potential values in upper case, and the actual values of a particular realization in lower case, as we have done here, to emphasize the distinction.

[Figure: one realization of X_t = 0.8X_{t−1} + ε_t, ε_t ~ N(0, 1)]
STATIONARY PROCESSES

The figure shows an example of a realization starting with X_0 = 0, with β_2 = 0.8 and the innovation ε_t being drawn randomly for each time period from a normal distribution with zero mean and unit variance.

[Figure: one realization of X_t = 0.8X_{t−1} + ε_t, ε_t ~ N(0, 1)]
STATIONARY PROCESSES

Because history cannot repeat itself, we will only ever see one realization of a time series process. Nevertheless, it is meaningful to ask whether we can determine the potential distribution of X at time t, given information at some earlier period, for example, time 0.

[Figure: one realization of X_t = 0.8X_{t−1} + ε_t, ε_t ~ N(0, 1)]
STATIONARY PROCESSES

As usual, there are two approaches to answering this question: mathematical analysis and simulation. We shall do both for the time series process represented by the figure, starting with a simulation.

[Figure: one realization of X_t = 0.8X_{t−1} + ε_t, ε_t ~ N(0, 1)]
STATIONARY PROCESSES

The figure shows 50 realizations of the process.

[Figure: 50 realizations of X_t = 0.8X_{t−1} + ε_t, ε_t ~ N(0, 1), vertical scale −10 to 10]
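A sketch of this simulation (Python; 50 realizations of 50 periods, each starting from X_0 = 0 as in the figure):

```python
import numpy as np

rng = np.random.default_rng(0)
n_real, n_per, beta2 = 50, 50, 0.8

X = np.zeros((n_real, n_per + 1))          # column 0 holds the common start X_0 = 0
eps = rng.standard_normal((n_real, n_per))
for t in range(1, n_per + 1):
    X[:, t] = beta2 * X[:, t - 1] + eps[:, t - 1]

# the ensemble distribution at t = 20 is the cross-section of realizations
print(X[:, 20].mean(), X[:, 20].var())
```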
STATIONARY PROCESSES

For the first few periods, the distribution of the realizations at time t is affected by the fact that they have a common starting point of 0. However, the initial effect soon becomes unimportant and the distribution becomes stable from one period to the next.

[Figure: 50 realizations of X_t = 0.8X_{t−1} + ε_t]
STATIONARY PROCESSES

The figure presents a histogram of the values of X_20. Apart from the first few time points, histograms for other time points would look similar. If the number of realizations were increased, each histogram would converge to the normal distribution shown in Figure 13.3.

[Figure: histogram of the ensemble distribution of X_20, bins from −5 to 5]
STATIONARY PROCESSES

The AR(1) process $X_t = \beta_2 X_{t-1} + \varepsilon_t$, with |β_2| < 1, is said to be stationary, the adjective referring, not to X_t itself, but to the potential distribution of its realizations, ignoring transitory initial effects.

[Figure: 50 realizations, β_2 = 0.8, ε_t ~ N(0, 1)]
STATIONARY PROCESSES

X_t itself changes from period to period, but the potential distribution of its realizations at any given time point does not.

[Figure: 50 realizations, β_2 = 0.8, ε_t ~ N(0, 1)]
STATIONARY PROCESSES

The potential distribution at time t is described as the ensemble distribution at time t, to emphasize the fact that we are talking about the distribution of a cross-section of realizations, not the ordinary distribution of a random variable.

[Figure: 50 realizations, β_2 = 0.8, ε_t ~ N(0, 1)]
STATIONARY PROCESSES

In general, a time series process is said to be stationary if its ensemble distribution satisfies three conditions:
STATIONARY PROCESSES

1. The population mean of the distribution is independent of time.
2. The population variance of the distribution is independent of time.
3. The population covariance between its values at any two time points depends only on the distance between those points, and not on time.
STATIONARY PROCESSES

This definition of stationarity is known as weak stationarity or covariance stationarity. For the definition of strong stationarity, conditions (1) and (2) are replaced by the condition that the whole potential distribution is independent of time.
STATIONARY PROCESSES

Our analysis will be unaffected by using the weak definition, and in any case the distinction disappears when, as in the present example, the limiting distribution is normal.
STATIONARY PROCESSES

We will check that the process represented by $X_t = \beta_2 X_{t-1} + \varepsilon_t$, with −1 < β_2 < 1, satisfies the three conditions for stationarity. First, if the process is valid for time period t, it is also valid for time period t−1.

$X_{t-1} = \beta_2 X_{t-2} + \varepsilon_{t-1}$
STATIONARY PROCESSES

Substituting into the original model, one has X_t in terms of X_{t−2}, ε_t, and ε_{t−1}.

$X_t = \beta_2^2 X_{t-2} + \beta_2 \varepsilon_{t-1} + \varepsilon_t$
STATIONARY PROCESSES

Lagging and substituting t−1 times, one has X_t in terms of X_0 and all the innovations ε_1, ..., ε_t from period 1 to period t.

$X_t = \beta_2^t X_0 + \beta_2^{t-1} \varepsilon_1 + \beta_2^{t-2} \varepsilon_2 + \dots + \beta_2 \varepsilon_{t-1} + \varepsilon_t$
STATIONARY PROCESSES

Hence E(X_t) = β_2^t X_0, since the expectation of every innovation is zero. In the special case X_0 = 0, we then have E(X_t) = 0. Since the expectation is not a function of time, the first condition is satisfied.

$E(X_t) = \beta_2^t X_0$
STATIONARY PROCESSES

If X_0 is nonzero, β_2^t tends to zero as t becomes large since |β_2| < 1. Hence β_2^t X_0 will tend to zero and the first condition will still be satisfied, apart from initial effects.

$\mathrm{var}(X_t) = \mathrm{var}\left(\beta_2^t X_0 + \beta_2^{t-1}\varepsilon_1 + \beta_2^{t-2}\varepsilon_2 + \dots + \beta_2\varepsilon_{t-1} + \varepsilon_t\right)$
$\quad = \mathrm{var}(\beta_2^{t-1}\varepsilon_1) + \mathrm{var}(\beta_2^{t-2}\varepsilon_2) + \dots + \mathrm{var}(\beta_2\varepsilon_{t-1}) + \mathrm{var}(\varepsilon_t)$
$\quad = \beta_2^{2(t-1)}\sigma_\varepsilon^2 + \beta_2^{2(t-2)}\sigma_\varepsilon^2 + \dots + \beta_2^2\sigma_\varepsilon^2 + \sigma_\varepsilon^2$
$\quad = \left(1 + \beta_2^2 + \beta_2^4 + \dots + \beta_2^{2(t-1)}\right)\sigma_\varepsilon^2$
$\quad = \dfrac{1 - \beta_2^{2t}}{1 - \beta_2^2}\,\sigma_\varepsilon^2$
STATIONARY PROCESSES

Next, we have to show that the variance is also not a function of time. The first term in the variance expression, β_2^t X_0, can be dropped because it is a constant, using variance rule 4 from the Review chapter.
STATIONARY PROCESSES

The variance expression can be decomposed as the sum of the variances, using variance rule 1 from the Review chapter and the fact that the covariances are all zero. (The innovations are assumed to be generated independently.)
STATIONARY PROCESSES

In the third line, the constants are squared when taken out of the variance expressions, using variance rule 2.
STATIONARY PROCESSES

The final line involves the standard summation of a geometric progression. Writing Z_t for the sum,

$Z_t = 1 + \beta_2^2 + \beta_2^4 + \dots + \beta_2^{2(t-1)}$
$\beta_2^2 Z_t = \beta_2^2 + \beta_2^4 + \dots + \beta_2^{2t}$
$(1 - \beta_2^2) Z_t = 1 - \beta_2^{2t}$
STATIONARY PROCESSES

Given that |β_2| < 1, β_2^{2t} tends to zero as t becomes large. Thus, ignoring transitory initial effects, the variance tends to a limit that is independent of time:

$\mathrm{var}(X_t) = \dfrac{1 - \beta_2^{2t}}{1 - \beta_2^2}\,\sigma_\varepsilon^2 \;\rightarrow\; \dfrac{\sigma_\varepsilon^2}{1 - \beta_2^2}$
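With β_2 = 0.8 and σ_ε² = 1 the limit is 1/(1 − 0.64) ≈ 2.78. A sketch of a numerical check (Python; the large number of realizations is only an assumption to estimate the ensemble variance precisely):

```python
import numpy as np

rng = np.random.default_rng(0)
beta2, n_real, n_per = 0.8, 100_000, 60

X = np.zeros(n_real)                       # every realization starts at X_0 = 0
for _ in range(n_per):                     # run long enough for initial effects to fade
    X = beta2 * X + rng.standard_normal(n_real)

print(X.var())                             # close to 1 / (1 - 0.8**2) = 2.78
```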
STATIONARY PROCESSES

This is the variance of the ensemble distribution shown in the figures.

[Figure: 50 realizations and the histogram of the ensemble distribution of X_20]
STATIONARY PROCESSES

It remains for us to demonstrate that the covariance between X_t and X_{t+s} is independent of time. If the relationship is valid for time period t, it is also valid for time period t+s.

$X_{t+s} = \beta_2 X_{t+s-1} + \varepsilon_{t+s}$
STATIONARY PROCESSES

Lagging and substituting, we can express X_{t+s} in terms of X_{t+s−2} and the innovations ε_{t+s−1} and ε_{t+s}.

$X_{t+s} = \beta_2^2 X_{t+s-2} + \beta_2 \varepsilon_{t+s-1} + \varepsilon_{t+s}$
STATIONARY PROCESSES

Lagging and substituting s times, we can express X_{t+s} in terms of X_t and the innovations ε_{t+1}, ..., ε_{t+s}.

$X_{t+s} = \beta_2^s X_t + \beta_2^{s-1} \varepsilon_{t+1} + \dots + \beta_2 \varepsilon_{t+s-1} + \varepsilon_{t+s}$
STATIONARY PROCESSES

Then the covariance between X_t and X_{t+s} is given by the expression shown. The second term on the right side is zero because X_t is independent of the innovations after time t.

$\mathrm{cov}(X_t, X_{t+s}) = \mathrm{cov}(X_t, \beta_2^s X_t) + \mathrm{cov}(X_t,\; \beta_2^{s-1}\varepsilon_{t+1} + \dots + \varepsilon_{t+s}) = \beta_2^s \mathrm{var}(X_t)$
STATIONARY PROCESSES

The first term can be written β_2^s var(X_t). As we have just seen, var(X_t) is independent of t, apart from a transitory initial effect. Hence the third condition for stationarity is also satisfied.
STATIONARY PROCESSES

Suppose next that the process includes an intercept β_1. How does this affect its properties? Is it still stationary?

$X_t = \beta_1 + \beta_2 X_{t-1} + \varepsilon_t$
STATIONARY PROCESSES

Lagging and substituting, we can express X_t in terms of X_{t−2} and the innovations ε_t and ε_{t−1}.

$X_t = \beta_1(1 + \beta_2) + \beta_2^2 X_{t-2} + \beta_2 \varepsilon_{t-1} + \varepsilon_t$
STATIONARY PROCESSES

Lagging and substituting t times, we can express X_t in terms of X_0 and the innovations ε_1, ..., ε_t.

$X_t = \beta_1(1 + \beta_2 + \dots + \beta_2^{t-1}) + \beta_2^t X_0 + \beta_2^{t-1}\varepsilon_1 + \dots + \beta_2\varepsilon_{t-1} + \varepsilon_t$
$\quad = \dfrac{\beta_1(1 - \beta_2^t)}{1 - \beta_2} + \beta_2^t X_0 + \beta_2^{t-1}\varepsilon_1 + \dots + \beta_2\varepsilon_{t-1} + \varepsilon_t$
STATIONARY PROCESSES

Taking expectations, E(X_t) tends to β_1/(1 − β_2) since the term β_2^t tends to zero. Thus the expectation is now nonzero, but it remains independent of time.

$E(X_t) = \dfrac{\beta_1(1 - \beta_2^t)}{1 - \beta_2} + \beta_2^t X_0 \;\rightarrow\; \dfrac{\beta_1}{1 - \beta_2}$
STATIONARY PROCESSES

The variance is unaffected by the addition of a constant in the expression for X_t (variance rule 4). Thus it remains independent of time, apart from initial effects.

$\mathrm{var}(X_t) = \mathrm{var}\left(\beta_2^{t-1}\varepsilon_1 + \dots + \beta_2\varepsilon_{t-1} + \varepsilon_t\right) = \dfrac{1 - \beta_2^{2t}}{1 - \beta_2^2}\,\sigma_\varepsilon^2$
STATIONARY PROCESSES

Finally, we need to consider the covariance of X_t and X_{t+s}. If the relationship is valid for time period t, it is also valid for time period t+s.

$X_{t+s} = \beta_1 + \beta_2 X_{t+s-1} + \varepsilon_{t+s}$
STATIONARY PROCESSES

Lagging and substituting, we can express X_{t+s} in terms of X_{t+s−2}, the innovations ε_{t+s−1} and ε_{t+s}, and a term involving β_1.

$X_{t+s} = \beta_1(1 + \beta_2) + \beta_2^2 X_{t+s-2} + \beta_2\varepsilon_{t+s-1} + \varepsilon_{t+s}$
STATIONARY PROCESSES

Lagging and substituting s times, we can express X_{t+s} in terms of X_t, the innovations ε_{t+1}, ..., ε_{t+s}, and a term involving β_1.

$X_{t+s} = \beta_1(1 + \beta_2 + \dots + \beta_2^{s-1}) + \beta_2^s X_t + \beta_2^{s-1}\varepsilon_{t+1} + \dots + \varepsilon_{t+s}$
STATIONARY PROCESSES

The covariance of X_t and X_{t+s} is not affected by the inclusion of this term because it is a constant. Hence the covariance is the same as before and remains independent of t.

$\mathrm{cov}(X_t, X_{t+s}) = \beta_2^s \mathrm{var}(X_t)$
STATIONARY PROCESSES

We have seen that the process $X_t = \beta_1 + \beta_2 X_{t-1} + \varepsilon_t$ has a limiting ensemble distribution with mean β_1/(1 − β_2) and variance σ_ε²/(1 − β_2²). However, the process exhibits transient time-dependent initial effects associated with the starting point X_0.
STATIONARY PROCESSES

We can get rid of the transient effects by determining X_0 as a random draw from the ensemble distribution, where ε_0 is a random draw from the distribution of ε at time zero. (Checking that X_0 has the ensemble mean and variance is left as an exercise.)

$X_0 = \dfrac{\beta_1}{1 - \beta_2} + \dfrac{1}{\sqrt{1 - \beta_2^2}}\,\varepsilon_0$
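A sketch of this device (Python; β_1 = 1.0 and β_2 = 0.8 anticipate the figure discussed a few slides below):

```python
import numpy as np

rng = np.random.default_rng(0)
beta1, beta2, n_per = 1.0, 0.8, 50

mean = beta1 / (1 - beta2)                                  # ensemble mean, here 5
x = mean + rng.standard_normal() / np.sqrt(1 - beta2 ** 2)  # X_0 drawn from the ensemble
path = [x]
for _ in range(n_per):
    x = beta1 + beta2 * x + rng.standard_normal()
    path.append(x)                                          # no transient initial effects
```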
STATIONARY PROCESSES

If we determine X_0 in this way, the expectation and variance of the process both become strictly independent of time.
STATIONARY PROCESSES

Substituting for X_0, X_t is equal to β_1/(1 − β_2) plus a linear combination of the innovations ε_0, ..., ε_t.

$X_t = \dfrac{\beta_1}{1 - \beta_2} + \dfrac{\beta_2^t}{\sqrt{1 - \beta_2^2}}\,\varepsilon_0 + \beta_2^{t-1}\varepsilon_1 + \dots + \beta_2\varepsilon_{t-1} + \varepsilon_t$
STATIONARY PROCESSES

Hence E(X_t) is a constant and strictly independent of t for all t.

$E(X_t) = \dfrac{\beta_1}{1 - \beta_2}$
STATIONARY PROCESSES

The right side of the equation can be decomposed as the sum of the variances because all the covariances are zero, the innovations being generated independently. As always (variance rule 2), the multiplicative constants are squared in the decomposition.

$\mathrm{var}(X_t) = \dfrac{\beta_2^{2t}}{1 - \beta_2^2}\,\sigma_\varepsilon^2 + \dfrac{1 - \beta_2^{2t}}{1 - \beta_2^2}\,\sigma_\varepsilon^2 = \dfrac{\sigma_\varepsilon^2}{1 - \beta_2^2}$
STATIONARY PROCESSES

The sum of the variances attributable to the innovations ε_1, ..., ε_t has already been derived above. Taking account of the variance of ε_0, the total is now strictly independent of time.
STATIONARY PROCESSES

The figure shows 50 realizations with X_0 treated in this way. This is the counterpart of the ensemble distribution shown near the beginning of this sequence, with β_2 = 0.8 as in that figure. As can be seen, the initial effects have disappeared.

$X_t = 1.0 + 0.8 X_{t-1} + \varepsilon_t$

[Figure: 50 realizations, vertical scale −5 to 15]
STATIONARY PROCESSES

The other difference in the figures results from the inclusion of a nonzero intercept. In the earlier figure, β_1 = 0. In this figure, β_1 = 1.0 and the mean of the ensemble distribution is β_1/(1 − β_2) = 1/(1 − 0.8) = 5.
STATIONARY PROCESSES

Which is the more appropriate assumption: X_0 fixed, or X_0 a random draw from the ensemble distribution? If the process really has started at time 0, then X_0 = 0 is likely to be the obvious choice.
STATIONARY PROCESSES

However, if the sample of observations is a time slice from a series that had been established well before the time of the first observation, then it will usually make sense to treat X_0 as a random draw from the ensemble distribution.
STATIONARY PROCESSES

As will be seen in another sequence, evaluation of the power of tests for nonstationarity can be sensitive to the assumption regarding X_0, and typically the most appropriate way of characterizing a stationary process is to avoid transient initial effects by treating X_0 as a random draw from the ensemble distribution.

NONSTATIONARY PROCESSES

Stationary process

$X_t = \beta_2 X_{t-1} + \varepsilon_t$,  −1 < β_2 < 1

$E(X_t) = \beta_2^t X_0 \rightarrow 0$

$\sigma_{X_t}^2 = \dfrac{1 - \beta_2^{2t}}{1 - \beta_2^2}\,\sigma_\varepsilon^2$

$\mathrm{cov}(X_t, X_{t+s}) = \dfrac{\beta_2^s}{1 - \beta_2^2}\,\sigma_\varepsilon^2$

In the last sequence, the process shown at the top was shown to be stationary. The expected value and variance of X_t were shown to be (asymptotically) independent of time, and the covariance between X_t and X_{t+s} was also shown to be independent of time.
NONSTATIONARY PROCESSES

The condition −1 < β_2 < 1 was crucial for stationarity. Suppose β_2 = 1, as above. Then the value of X in one time period is equal to its value in the previous time period, plus a random adjustment. This is known as a random walk.

Random walk

$X_t = X_{t-1} + \varepsilon_t$

NONSTATIONARY PROCESSES

The figure shows an example realization of a random walk for the case where ε_t has a normal distribution with zero mean and unit variance.

$X_t = X_{t-1} + \varepsilon_t$,  ε_t ~ N(0, 1)

[Figure: one realization of a random walk, vertical scale −20 to 20]


NONSTATIONARY PROCESSES

This figure shows the results of a simulation with 50 realizations. It is obvious that the ensemble distribution is not stationary. The distribution changes as t increases, becoming increasingly spread out. We will confirm this mathematically.

[Figure: 50 realizations of a random walk, vertical scale −20 to 20]
NONSTATIONARY PROCESSES

If the process is valid for time t, it is valid for time t−1.

$X_t = X_{t-1} + \varepsilon_t$

$X_{t-1} = X_{t-2} + \varepsilon_{t-1}$
NONSTATIONARY PROCESSES

Hence X_t can be expressed in terms of X_{t−2} and the innovations ε_{t−1} and ε_t.

$X_t = X_{t-2} + \varepsilon_{t-1} + \varepsilon_t$
NONSTATIONARY PROCESSES

Thus, continuing to lag and substitute, X_t is equal to its value at time 0, X_0, plus the sum of the innovations in periods 1 to t.

$X_t = X_0 + \varepsilon_1 + \dots + \varepsilon_t$
NONSTATIONARY PROCESSES

If expectations are taken at time 0, the expected value at any future time t is fixed at X_0, because the expected values of the future innovations are all 0. Thus E(X_t) is independent of t and the first condition for stationarity remains satisfied.

$E(X_t) = X_0 + E(\varepsilon_1) + \dots + E(\varepsilon_t) = X_0$


NONSTATIONARY PROCESSES

This can be seen from the figure with 50 realizations. The distribution of the values of X_t spreads out as t increases, but there is no tendency for the mean of the distribution to change. (In this example X_0 = 0, but this is unimportant. It would be true for any value of X_0.)

[Figure: 50 realizations of a random walk]


NONSTATIONARY PROCESSES

However, it is also clear from the figure that the ensemble distribution is not constant over time, and therefore that the process is nonstationary. The distribution of the values of X_t spreads out as t increases, so the variance of the distribution is an increasing function of t.

[Figure: 50 realizations of a random walk]
NONSTATIONARY PROCESSES

We will demonstrate this mathematically. We have seen that X_t is equal to X_0 plus the sum of the innovations ε_1, ..., ε_t. X_0 is an additive constant, so it does not affect the variance.

$\sigma_{X_t}^2 = \mathrm{var}(X_0 + \varepsilon_1 + \dots + \varepsilon_t) = \mathrm{var}(\varepsilon_1 + \dots + \varepsilon_t) = \sigma_\varepsilon^2 + \dots + \sigma_\varepsilon^2 = t\,\sigma_\varepsilon^2$
NONSTATIONARY PROCESSES

The variance of the sum of the innovations is equal to the sum of their individual variances. The covariances are all zero because the innovations are assumed to be generated independently.
NONSTATIONARY PROCESSES

The variance of each innovation is equal to σ_ε², by assumption. Hence the population variance of X_t is directly proportional to t. As we have seen from the figure, its distribution spreads out as t increases.
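A sketch of a numerical check that the ensemble variance grows in proportion to t (Python; the number of realizations is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(0)
n_real, n_per = 100_000, 50

eps = rng.standard_normal((n_real, n_per))
X = np.cumsum(eps, axis=1)                 # random walks starting from X_0 = 0

for t in (10, 25, 50):
    print(t, X[:, t - 1].var())            # close to t * sigma_eps^2 = t
```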

NONSTATIONARY PROCESSES

Stationary process

$X_t = \beta_1 + \beta_2 X_{t-1} + \varepsilon_t$,  −1 < β_2 < 1

$E(X_t) = \dfrac{\beta_1(1 - \beta_2^t)}{1 - \beta_2} + \beta_2^t X_0$

$\sigma_{X_t}^2 = \dfrac{1 - \beta_2^{2t}}{1 - \beta_2^2}\,\sigma_\varepsilon^2$

$\mathrm{cov}(X_t, X_{t+s}) = \dfrac{\beta_2^s}{1 - \beta_2^2}\,\sigma_\varepsilon^2$

A second process considered in the last sequence is shown above. The presence of the intercept β_1 on the right side gave the series a nonzero mean but did not lead to a violation of the conditions for stationarity.
NONSTATIONARY PROCESSES

If β_2 = 1, however, the series becomes a nonstationary process known as a random walk with drift.

Random walk with drift

$X_t = \beta_1 + X_{t-1} + \varepsilon_t$
NONSTATIONARY PROCESSES

If the process is valid for time t, it is valid for time t−1.

$X_{t-1} = \beta_1 + X_{t-2} + \varepsilon_{t-1}$
NONSTATIONARY PROCESSES

Hence X_t can be expressed in terms of X_{t−2}, the innovations ε_{t−1} and ε_t, and an intercept. The intercept is 2β_1: whatever else is happening to the process, a fixed quantity β_1 is added in every time period.

$X_t = 2\beta_1 + X_{t-2} + \varepsilon_{t-1} + \varepsilon_t$
NONSTATIONARY PROCESSES

Thus, lagging and substituting t times, X_t is now equal to X_0 plus the sum of the innovations, as before, plus the constant β_1 multiplied by t.

$X_t = \beta_1 t + X_0 + \varepsilon_1 + \dots + \varepsilon_t$
NONSTATIONARY PROCESSES

As a consequence, the mean of the process becomes a function of time, violating the first condition for stationarity.

$E(X_t) = X_0 + \beta_1 t$
NONSTATIONARY PROCESSES

(The second condition for stationarity remains violated, since the variance of the distribution of X_t is proportional to t. It is unaffected by the inclusion of the constant β_1.)

$\sigma_{X_t}^2 = t\,\sigma_\varepsilon^2$
NONSTATIONARY PROCESSES

This process is known as a random walk with drift, the drift referring to the systematic change in the expectation from one time period to the next.
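A sketch of such a process (Python; β_1 = 0.5 matches the simulation described on the next slide):

```python
import numpy as np

rng = np.random.default_rng(0)
beta1, n_real, n_per = 0.5, 50, 50

t = np.arange(1, n_per + 1)
eps = rng.standard_normal((n_real, n_per))
X = beta1 * t + np.cumsum(eps, axis=1)   # X_t = beta1*t + X_0 + sum of innovations, X_0 = 0
# E(X_t) = beta1 * t is the drift line; var(X_t) = t still grows with t
```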
NONSTATIONARY PROCESSES

The figure shows 50 realizations of such a process. The underlying drift line is highlighted in yellow. It can be seen that the ensemble distribution changes in two ways with time.

$X_t = 0.5 + X_{t-1} + \varepsilon_t$,  ε_t ~ N(0, 1)

[Figure: 50 realizations of a random walk with drift, vertical scale −20 to 50]
NONSTATIONARY PROCESSES

The mean changes. In this case it is drifting upwards because β_1 has been taken to be positive. If β_1 were negative, it would be drifting downwards.
NONSTATIONARY PROCESSES

And, as in the case of the random walk with no drift, the distribution spreads out around its mean.
NONSTATIONARY PROCESSES

Random walks are not the only type of nonstationary process. Another common example of a nonstationary time series is one possessing a time trend.

Deterministic trend

$X_t = \beta_1 + \beta_2 t + \varepsilon_t$

NONSTATIONARY PROCESSES

This type of trend is described as a deterministic trend, to differentiate it from the trend found in a model of a random walk with drift.

NONSTATIONARY PROCESSES

It is nonstationary because the expected value of X_t is not independent of t. Its population variance is not even defined.

$E(X_t) = \beta_1 + \beta_2 t$
NONSTATIONARY PROCESSES

The figure shows 50 realizations of a variation where the disturbance term is the stationary process u_t = 0.8u_{t−1} + ε_t. The underlying trend line is shown in white.

$X_t = \beta_1 + \beta_2 t + u_t$,  $u_t = 0.8 u_{t-1} + \varepsilon_t$

[Figure: 50 realizations, vertical scale −10 to 20]
NONSTATIONARY PROCESSES

Superficially, this model looks similar to the random walk with drift, when the latter is written in terms of its components from time 0.

Deterministic trend:  $X_t = \beta_1 + \beta_2 t + \varepsilon_t$

Random walk with drift:  $X_t = \beta_1 t + X_0 + \varepsilon_1 + \dots + \varepsilon_t$
NONSTATIONARY PROCESSES

The key difference between a deterministic trend and a random walk with drift is that in the former, the series must keep coming back to a fixed trend line.

[Figures: realizations of a deterministic trend and of a random walk with drift]
NONSTATIONARY PROCESSES

In any given observation, X_t will be displaced from the trend line by an amount u_t, but, provided that this is stationary, it must otherwise adhere to the trend line.
NONSTATIONARY PROCESSES

By contrast, in a random walk with drift, the displacement from the underlying trend line at time t is the random walk ε_1 + ... + ε_t. Since the displacement is a random walk, there is no reason why X_t should ever return to its trend line.
NONSTATIONARY PROCESSES

Difference-stationarity and trend-stationarity

It is important to make a distinction between the concepts of difference-stationarity and trend-stationarity.
NONSTATIONARY PROCESSES

Difference-stationarity

If a nonstationary process can be transformed into a stationary process by differencing, it is said to be difference-stationary. A random walk, with or without drift, is an example.

$X_t = \beta_1 + X_{t-1} + \varepsilon_t$
NONSTATIONARY PROCESSES

The first difference, ΔX_t, is simply equal to the sum of β_1 and ε_t.

$\Delta X_t = X_t - X_{t-1} = \beta_1 + \varepsilon_t$
NONSTATIONARY PROCESSES

This is a stationary process with population mean β_1 and variance σ_ε², both independent of time. It is actually iid and the covariance between ΔX_t and ΔX_{t+s} is zero.

$E(\Delta X_t) = \beta_1$,  $\sigma_{\Delta X_t}^2 = \sigma_\varepsilon^2$,  $\mathrm{cov}(\Delta X_t, \Delta X_{t+s}) = 0$
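A sketch of differencing a simulated random walk with drift (Python; the parameter values are illustrative assumptions). The first difference recovers a series whose sample mean is close to β_1 and whose first-order autocorrelation is negligible:

```python
import numpy as np

rng = np.random.default_rng(0)
beta1, n = 0.5, 500

X = beta1 * np.arange(1, n + 1) + np.cumsum(rng.standard_normal(n))
dX = np.diff(X)                            # Delta X_t = beta1 + eps_t

print(dX.mean())                           # close to beta1 = 0.5
print(np.corrcoef(dX[:-1], dX[1:])[0, 1])  # close to 0: the differences are iid
```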
NONSTATIONARY PROCESSES

If a nonstationary time series can be transformed into a stationary process by differencing once, as in this case, it is described as integrated of order 1, or I(1).
NONSTATIONARY PROCESSES

The reason that the series is described as 'integrated' is that the shock in each time period is permanently incorporated in it. There is no tendency for the effects of the shocks to attenuate with time, as in a stationary process or in a model with a deterministic trend.
NONSTATIONARY PROCESSES

If a series can be made stationary by differencing twice, it is known as I(2), and so on. To complete the picture, a stationary process, which by definition needs no differencing, is described as I(0). In practice most series are I(0), I(1), or, occasionally, I(2).
NONSTATIONARY PROCESSES

The stochastic component ε_t is iid. More generally, the stationary process reached after differencing may be ARMA(p, q): autoregressive of order p and moving average of order q.
NONSTATIONARY PROCESSES

The original series is then characterized as an ARIMA(p, d, q) time series, where d is the number of times it has to be differenced to render it stationary.
NONSTATIONARY PROCESSES

Trend-stationarity

A nonstationary time series is described as being trend-stationary if it can be transformed into a stationary process by extracting a time trend.

$X_t = \beta_1 + \beta_2 t + \varepsilon_t$
NONSTATIONARY PROCESSES

For example, the very simple model given by the first equation can be detrended by fitting it (second equation) and defining a new variable with the third equation. The new, detrended, variable is of course just the residuals from the regression of X on t.

$X_t = \beta_1 + \beta_2 t + \varepsilon_t$

$\hat{X}_t = b_1 + b_2 t$

$\tilde{X}_t = X_t - \hat{X}_t = X_t - b_1 - b_2 t$
NONSTATIONARY PROCESSES

The distinction between difference-stationarity and trend-stationarity is important for the analysis of time series.
NONSTATIONARY PROCESSES

At one time it was conventional to assume that macroeconomic time series could be decomposed into trend and cyclical components, the former being determined by real factors, such as the growth of GDP, and the latter being determined by transitory factors, such as monetary policy.
NONSTATIONARY PROCESSES

Typically the cyclical component was analyzed using detrended versions of the variables in the model.
NONSTATIONARY PROCESSES

However, this approach is inappropriate if the process is difference-stationary. Although detrending may remove any drift, it does not affect the increasing variance of the series, and so the detrended component remains nonstationary.
NONSTATIONARY PROCESSES

As will be seen in the next slideshow, this gives rise to problems of estimation and inference.
NONSTATIONARY PROCESSES

Further, because the approach ignores the contribution of real shocks to economic fluctuations, it causes the role of transitory factors in the cycle to be overestimated.
