
First steps in time series

Time series look different


[Figure: two example time series with different shapes and scales; one is indexed by observation number, the other by year, 1940 to 2010]

But they are not

$X_t$ = systematic pattern + noise

Noise: obscures the pattern
Systematic pattern: often made of two basic components, trend + seasonality
  Trend: underlying linear/non-linear, time-changing, uncanny model
  Seasonality: periodic

[Figure: Monthly car production in Spain; monthly axis starting Oct 1995]

Trend analysis
Smooth data using an underlying model

Moving average
$\mathrm{ma}(t, m) = \frac{1}{m} \sum_{i=t-m+1}^{t} x_i$

Exponential moving average
$s_t = \alpha\,x_t + (1 - \alpha)\,s_{t-1}$
Good for one-period-ahead forecasting (weather)
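The two smoothers above can be sketched in a few lines of plain Python. This is a minimal illustration, not the slides' own implementation: the function names are hypothetical, the smoothing weight is called `alpha`, and seeding the exponential average with the first observation is an assumption (other seeds are common).

```python
def moving_average(x, m):
    """Trailing moving average: ma(t, m) = (1/m) * sum of x[t-m+1 .. t].

    Returns one value per t >= m-1, since earlier points lack a full window.
    """
    if m < 1 or m > len(x):
        raise ValueError("window m must satisfy 1 <= m <= len(x)")
    return [sum(x[t - m + 1 : t + 1]) / m for t in range(m - 1, len(x))]


def exponential_moving_average(x, alpha):
    """s_t = alpha * x_t + (1 - alpha) * s_{t-1}, seeded with s_0 = x_0 (assumption)."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    s = [x[0]]
    for value in x[1:]:
        s.append(alpha * value + (1.0 - alpha) * s[-1])
    return s


series = [2.0, 4.0, 6.0, 8.0, 10.0]  # toy data, not from the slides
ma3 = moving_average(series, 3)
ema = exponential_moving_average(series, 0.5)
```

The last element of `ema` is the natural one-period-ahead forecast; a larger `alpha` tracks recent data more closely, a smaller one smooths harder.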

Fit any function: linear, splines, log, exp, your guess

Detect and model trend
[Figure: series with linear and ma(12) trend curves]

Subtract trend
[Figure: detrended series, fluctuating around zero]
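As a rough illustration of detecting and subtracting a trend, the sketch below fits a least-squares straight line and removes it. The linear model is only one of the options listed on the slide; the function names and the toy series are assumptions made for the example.

```python
def fit_line(x):
    """Least-squares line x_t ~ intercept + slope * t for t = 0..n-1."""
    n = len(x)
    if n < 2:
        raise ValueError("need at least two points")
    t_mean = (n - 1) / 2.0
    x_mean = sum(x) / n
    s_tt = sum((t - t_mean) ** 2 for t in range(n))
    s_tx = sum((t - t_mean) * (v - x_mean) for t, v in enumerate(x))
    slope = s_tx / s_tt
    intercept = x_mean - slope * t_mean
    return intercept, slope


def subtract_linear_trend(x):
    """Return the residuals x_t - (intercept + slope * t)."""
    intercept, slope = fit_line(x)
    return [v - (intercept + slope * t) for t, v in enumerate(x)]


# Toy series: linear trend plus a zero-mean pattern of period 4 (made-up numbers).
seasonal = [3.0, -1.0, -1.0, -1.0]
series = [10.0 + 2.0 * t + seasonal[t % 4] for t in range(24)]
detrended = subtract_linear_trend(series)
```

The detrended values average to zero by construction, which is a first step toward the stationary mean that the following analysis relies on.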

Stationarity
(mean and variance)

Seasonality
Look for autocorrelations in:
  absolute values: $x_t,\ x_{t-1},\ x_{t-2},\ x_{t-3}, \dots$
  increments: $x_t - x_{t-1},\ x_{t-1} - x_{t-2}, \dots$
  other combinations of variables! (intuition + expertise)

Correlation of $x_t$ vs $x_{t-\mathrm{lag}}$

lag   npat   corr
  1     58   -0.0397427758
  2     57   -0.274627552
  3     56   -0.265544135
  4     55    0.20316958
  5     54   -0.0689740424
  6     53   -0.0261346967
  7     52   -0.106164043
  8     51    0.242284075
  9     50   -0.245981648
 10     49   -0.315532046
 11     48   -0.0585207867
 12     47    0.94479132
 13     46   -0.0744984938
 14     45   -0.255095906
 15     44   -0.272366416
 16     43    0.197473796
 17     42    2.4789972E-05
 18     41   -0.0635903421
 19     40   -0.138785333
 20     39    0.311166354
 21     38   -0.224766545
 22     37   -0.337246239
 23     36   -0.056239266
 24     35    0.916841357

Structure: seasonality (lags 12 and 24), anticorrelations
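A table like the one above can be produced by pairing each value with the value `lag` steps earlier and computing a Pearson correlation. The sketch below assumes that `npat` counts the available pairs, which matches a series of 59 points giving 58 pairs at lag 1; the function names and toy data are illustrative.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def lag_correlations(x, max_lag):
    """Rows of (lag, npat, corr) comparing x_t with x_{t-lag}."""
    rows = []
    for lag in range(1, max_lag + 1):
        current = x[lag:]    # x_t
        lagged = x[:-lag]    # x_{t-lag}
        rows.append((lag, len(current), pearson(current, lagged)))
    return rows


# Toy series with an exact period of 4 (made-up numbers).
series = [1.0, 3.0, 2.0, 5.0] * 6
rows = lag_correlations(series, 8)
for lag, npat, corr in rows:
    print(f"lag {lag:2d}  npat {npat:2d}  corr {corr: .4f}")
```

In this toy case the correlation reaches 1 at lags 4 and 8, the analogue of the peaks at lags 12 and 24 in the monthly data.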

Relative increments (inc 1): $\frac{x_t - x_{t-1}}{x_{t-1}}$ vs $\frac{x_{t-\mathrm{lag}} - x_{t-\mathrm{lag}-1}}{x_{t-\mathrm{lag}-1}}$

lag   inc   npat   corr
  1     1     57   -0.390023157
  2     1     56   -0.101618385
  3     1     55   -0.184521702
  4     1     54    0.275796168
  5     1     53   -0.11691902
  6     1     52    0.0605651902
  7     1     51   -0.146978369
  8     1     50    0.309853501
  9     1     49   -0.18309118
 10     1     48   -0.127671412
 11     1     47   -0.35079422
 12     1     46    0.944415972
 13     1     45   -0.390599355

12-month correlation!

Relative increments (inc 2): $\frac{x_t - x_{t-2}}{x_{t-2}}$ vs $\frac{x_{t-\mathrm{lag}} - x_{t-\mathrm{lag}-2}}{x_{t-\mathrm{lag}-2}}$

lag   inc   npat   corr
  1     2     56    0.0635841158
  2     2     55   -0.661845284
  3     2     54   -0.14900299
  4     2     53    0.236847899
  5     2     52    0.0775265432
  6     2     51   -0.1446445
  7     2     50    0.0427922726
  8     2     49    0.268605346
  9     2     48   -0.13164323
 10     2     47   -0.660207622
 11     2     46    0.0676404933
 12     2     45    0.956921988
 13     2     44    0.0444132255

2-month anticorrelation!
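The increment tables can be sketched the same way: first build relative increments over `step` periods, then correlate them with their own lagged copy. Names and toy data are assumptions; the pair count follows the tables (59 points, step 1, lag 1 gives 57 pairs).

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def relative_increments(x, step):
    """(x_t - x_{t-step}) / x_{t-step} for t = step .. len(x)-1."""
    return [(x[t] - x[t - step]) / x[t - step] for t in range(step, len(x))]


def increment_lag_correlation(x, step, lag):
    """Return (npat, corr) for the increment at t vs the increment at t-lag."""
    inc = relative_increments(x, step)
    return len(inc) - lag, pearson(inc[lag:], inc[:-lag])


# Toy series: a period-4 pattern growing 5% per cycle (made-up numbers).
base = [100.0, 130.0, 90.0, 120.0]
series = [base[t % 4] * 1.05 ** (t // 4) for t in range(32)]
npat, corr = increment_lag_correlation(series, step=1, lag=4)
```

Working with relative increments removes the growth in level, so the period shows up as a correlation close to 1 even though the raw values drift upward.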

Danger: correlations vs error

const

Fourier analysis uncovers periodicity

$2\pi/48 = 0.131$
Single tick = 2

More sophisticated analysis is possible but brings little further information
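To illustrate how Fourier analysis exposes a period, the sketch below computes a plain discrete Fourier transform of a 48-point toy series and picks the strongest frequency. With 48 samples the angular frequency step is 2π/48 ≈ 0.131. The series, the function name and the choice to remove the mean first are assumptions for the example.

```python
import cmath
import math


def periodogram(x):
    """Power |X_k|^2 for k = 1 .. n//2, computed after removing the mean."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    power = {}
    for k in range(1, n // 2 + 1):
        coeff = sum(
            v * cmath.exp(-2j * math.pi * k * t / n) for t, v in enumerate(centered)
        )
        power[k] = abs(coeff) ** 2
    return power


n = 48
# Toy series with a 12-sample period (made-up numbers).
series = [100.0 + 20.0 * math.sin(2.0 * math.pi * t / 12.0) for t in range(n)]
power = periodogram(series)
peak_k = max(power, key=power.get)
frequency_step = 2.0 * math.pi / n   # angular frequency resolution
period = n / peak_k                  # samples per cycle at the peak
```

Here the peak falls at the fourth frequency bin, which corresponds to a period of 12 samples.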

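Reasonable bets:

$x_t = f(x_{t-12})$

$\frac{x_t - x_{t-1}}{x_{t-1}} = f\left(\frac{x_{t-1} - x_{t-2}}{x_{t-2}},\ \frac{x_{t-12} - x_{t-13}}{x_{t-13}}\right)$

$\frac{x_t - x_{t-2}}{x_{t-2}} = f\left(\frac{x_{t-2} - x_{t-4}}{x_{t-4}},\ \frac{x_{t-12} - x_{t-14}}{x_{t-14}}\right)$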

Best linear fit + neural fit to 12 based on the last 12 months

[Figure: Goal, Linear and Neural series over all patterns]

Magnification of the last 12 months (8 train patterns + 4 predictions)

[Figure: Goal, Linear and Neural series over months 1 to 12; the last 4 points are predictions]

Linear fit heavily depends on one variable
Neural net finds non-linear relations that enhance correlations

Simple is good
If in doubt, start with a simple dependency. Ex: $x_t = x_{t-\mathrm{lag}}$

lag = 1 day: weather forecast, donuts
lag = 7 days: donuts, electricity load curve
lag = 1 year: electricity load curve, sales
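The simple dependency above amounts to a naive forecast that copies the value observed `lag` periods earlier. A minimal sketch, with a hypothetical function name and made-up daily sales figures:

```python
def naive_forecast(history, lag, horizon):
    """Forecast x_t = x_{t-lag}; forecasts are reused once horizon exceeds lag."""
    if lag < 1 or lag > len(history):
        raise ValueError("lag must satisfy 1 <= lag <= len(history)")
    extended = list(history)
    for _ in range(horizon):
        extended.append(extended[-lag])
    return extended[len(history):]


# Two weeks of toy daily sales with a weekly pattern (made-up numbers).
daily_sales = [5, 6, 7, 8, 9, 3, 2, 5, 6, 7, 8, 9, 3, 2]
next_three_days = naive_forecast(daily_sales, lag=7, horizon=3)
persistence = naive_forecast(daily_sales, lag=1, horizon=2)
```

Such a forecast costs nothing to build and gives a baseline that any more elaborate model should beat.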

ARIMA
Define
Backward shift: $B x_t = x_{t-1}$
Backward difference: $\nabla x_t = x_t - x_{t-1} = (1 - B)\,x_t$
Polynomials of degree p and q: $\phi_p(B)$, $\theta_q(B)$

Auto-Regressive Integrated Moving Average time series models

$\phi_p(B)\,\nabla^d x_t = \theta_q(B)\,\epsilon_t$

AR(p): $\phi_p(B) = 1 - \phi_1 B - \phi_2 B^2 - \dots - \phi_p B^p$
I(d): $\nabla^d$
MA(q): $\theta_q(B) = 1 - \theta_1 B - \theta_2 B^2 - \dots - \theta_q B^q$

Zeros of the polynomials must lie outside the unit circle

Example: ARIMA(1,0,0)

$x_t = \phi_1 x_{t-1} + \epsilon_t$
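As a small illustration of the ARIMA(1,0,0) example, the sketch below simulates such a series and recovers its coefficient by least squares. The coefficient name `phi_1`, the unit-variance Gaussian noise, the starting value of zero and the estimator are assumptions for the example, not part of the slides.

```python
import random


def simulate_ar1(phi_1, n, seed=0):
    """Simulate x_t = phi_1 * x_{t-1} + e_t with e_t ~ N(0, 1), starting at 0."""
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(n - 1):
        x.append(phi_1 * x[-1] + rng.gauss(0.0, 1.0))
    return x


def estimate_phi_1(x):
    """Least-squares estimate: sum(x_t * x_{t-1}) / sum(x_{t-1}^2)."""
    num = sum(x[t] * x[t - 1] for t in range(1, len(x)))
    den = sum(x[t - 1] ** 2 for t in range(1, len(x)))
    return num / den


series = simulate_ar1(0.7, 5000)
phi_hat = estimate_phi_1(series)
```

For this model the autoregressive polynomial has its only zero at `1 / phi_1`, so the unit-circle condition reduces to `abs(phi_1) < 1`.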

NN enhancement
Rather than using a recursive NN, carry out a linear analysis:
  preprocess data to ARIMA-like form
  perform a linear forecast

Feed a NN with all the linear analysis:
  preprocessed data
  linear prediction

The NN will learn the underlying law (if any) controlling the departures of real data from the linear analysis
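One way to read this recipe is that each training pattern for the network holds the preprocessed inputs plus the linear prediction, with the departure from the linear prediction as the goal. The sketch below only assembles such patterns and does not train a network. The choice of inputs, the stand-in linear forecast `x[t - season]` and the function name are all assumptions made for illustration.

```python
def build_nn_patterns(x, season=12):
    """One (inputs, target) pattern per time step t.

    inputs = [latest relative increment,
              relative increment one season ago,
              linear prediction]
    target = x_t minus the linear prediction (the departure to be learned)
    """
    patterns = []
    for t in range(season + 1, len(x)):
        linear_pred = x[t - season]  # stand-in for a fitted linear forecast (assumption)
        inc_recent = (x[t - 1] - x[t - 2]) / x[t - 2]
        inc_season = (x[t - season] - x[t - season - 1]) / x[t - season - 1]
        patterns.append(([inc_recent, inc_season, linear_pred], x[t] - linear_pred))
    return patterns


# Toy series (made-up numbers): a steadily rising level.
series = [100.0 + t for t in range(20)]
patterns = build_nn_patterns(series, season=12)
first_inputs, first_target = patterns[0]
```

Any regressor can then be trained on these patterns; the final forecast is the linear prediction plus the predicted departure.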

Leave-one-out NN
When the number of data points is very small:
Each pattern = Vars + Goal (Pattern 1, Pattern 2, ..., Pattern 6)

Step 1: leave out one pattern, train on the rest, Predict 1
...
Step n: leave out another pattern, train on the rest, Predict n

Collect statistics
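The leave-one-out loop can be sketched independently of the model being trained. In the example below a least-squares line stands in for the neural network, purely to keep the code short; the function names and the six toy patterns are assumptions.

```python
def fit_line(train):
    """Least-squares line goal ~ intercept + slope * var on (var, goal) pairs."""
    n = len(train)
    mean_v = sum(v for v, _ in train) / n
    mean_g = sum(g for _, g in train) / n
    s_vv = sum((v - mean_v) ** 2 for v, _ in train)
    s_vg = sum((v - mean_v) * (g - mean_g) for v, g in train)
    slope = s_vg / s_vv
    return mean_g - slope * mean_v, slope


def predict_line(model, var):
    intercept, slope = model
    return intercept + slope * var


def leave_one_out(patterns, fit, predict):
    """For each pattern: train on all the others, predict it, record the error."""
    errors = []
    for i, (var, goal) in enumerate(patterns):
        train = patterns[:i] + patterns[i + 1:]
        model = fit(train)
        errors.append(predict(model, var) - goal)
    return errors


# Six toy patterns of (var, goal), made-up numbers.
patterns = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8), (5.0, 10.1), (6.0, 11.9)]
errors = leave_one_out(patterns, fit_line, predict_line)
mean_abs_error = sum(abs(e) for e in errors) / len(errors)
```

Every pattern is predicted by a model that never saw it, so the collected errors estimate out-of-sample performance even with very few data points.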

About noise
We have discarded noise. Be careful!

Finance (correlations):
asset1 = trend1 + noise1
asset2 = trend2 + noise2

$\frac{ds_t^i}{s_t^i} = \mu_i\,dt + dW_t^i$

Brownian motion: $N(0, t)$
Cov(i,j): 3000*3000 matrix, huge CPU time (VaR)
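To see why the noise terms matter, the sketch below simulates two price paths driven by correlated Brownian increments and counts the covariance entries needed for 3000 assets. It is a toy under stated assumptions: the drift is called `mu`, volatility is folded into the increments, the correlation `rho` and the Euler time step are arbitrary choices, and the function names are hypothetical.

```python
import math
import random


def correlated_increments(rho, dt, n_steps, seed=0):
    """Pairs (dW1, dW2), each N(0, dt), with correlation rho."""
    rng = random.Random(seed)
    scale = math.sqrt(dt)
    pairs = []
    for _ in range(n_steps):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        pairs.append((scale * z1, scale * z2))
    return pairs


def simulate_prices(s0, mu, dt, increments):
    """Euler steps of ds/s = mu*dt + dW."""
    path = [s0]
    for dw in increments:
        path.append(path[-1] * (1.0 + mu * dt + dw))
    return path


def sample_correlation(pairs):
    n = len(pairs)
    m1 = sum(a for a, _ in pairs) / n
    m2 = sum(b for _, b in pairs) / n
    cov = sum((a - m1) * (b - m2) for a, b in pairs)
    v1 = sum((a - m1) ** 2 for a, _ in pairs)
    v2 = sum((b - m2) ** 2 for _, b in pairs)
    return cov / math.sqrt(v1 * v2)


def covariance_entries(n_assets):
    """Distinct entries of a symmetric n x n covariance matrix."""
    return n_assets * (n_assets + 1) // 2


dt = 1.0 / 252.0
pairs = correlated_increments(rho=0.6, dt=dt, n_steps=20000)
asset1 = simulate_prices(100.0, 0.05, dt, [a for a, _ in pairs])
asset2 = simulate_prices(100.0, 0.03, dt, [b for _, b in pairs])
rho_hat = sample_correlation(pairs)
entries = covariance_entries(3000)
```

With 3000 assets there are about 4.5 million distinct covariances to estimate and use, which gives a sense of where the CPU time goes.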

Summary
Time series are often structured
Analyze trend + seasonality + noise
Build a linear model with preprocessed data
Build NN on top of previous analysis
