
AADECA 2012 Semana del Control Automático 23° Congreso Argentino de Control Automático

3 al 5 de Octubre de 2012, Buenos Aires, Argentina.









NEURAL NETWORKS BASED ALGORITHMS FOR FORECASTING SHORT TIME SERIES WITH
UNCERTAINTIES ON THEIR DATA

Cristian Rodríguez Rivero, Julián Pucheta, Josef Baumgartner, Martín Herrera, Victor Sauchelli, H. D.
Patiño and Sergio Laboret

Departments of Electrical and Electronic Engineering, Mathematics Research Laboratory Applied to Control
(LIMAC), Faculty of Exact, Physical and Natural Sciences, Universidad Nacional de Córdoba, Córdoba,
Argentina. cristian.rodriguezrivero@gmail.com, julian.pucheta@gmail.com,
josef.s.baumgartner@gmail.com, victorsauchelli@gmail.com, slaboret@yahoo.com.ar
Department of Electrical Engineering, Faculty of Sciences and Applied Technologies, National University
of Catamarca, Catamarca, Argentina.
martinictohache@gmail.com
Institute of Automatics (INAUT), Faculty of Engineering,
National University of San Juan, San Juan, Argentina.
dpatino@unsj.edu.ar



Abstract: This article presents a methodology to forecast time series that have uncertainties in their data.
The series represents the monthly cumulative rainfall, manually collected by the agricultural producer, and is
used to generate future scenarios for decision-making at the La Sevillana establishment, Balnearia, Córdoba,
Argentina. In this work, the next 15 values of the time series of interest are predicted from short cumulative
monthly historical rainfall records and from solutions of the Mackey-Glass differential equation, using kernel
and artificial feed-forward neural networks (ANNs). On the one hand, the proposed methodology predicts these
values one step ahead, taking into account that a non-parametric ANN must cope with missing data in the
original dataset, so a method is proposed for smoothing these empty values in order to complete the dataset.
On the other hand, the completion uses an averaging approach; the results employing kernel networks based on
the NSGA-II genetic algorithm and ANNs show that predicting with incomplete time series achieves adequate
performance in the forecasting problem for short time series. Computational simulations and numerical results
reveal that the proposed technique presents a satisfactory outcome for forecasting short time series for
decision-making when the observations are taken from a single geographical point, as proposed here. These
results were given to the producer in order to decide whether to plant one crop or another, according to the
desired profitability.



Keywords: Artificial neural networks, Gaussian process, rainfall time series, Mackey-Glass, Hurst's
parameter.


1. INTRODUCTION

Nowadays, prediction has become a crucial
problem in several branches of science, and
meteorological variables are among the most
challenging topics; this article aims to address
some issues useful for control problems in
agricultural activities. The availability of estimated
scenarios of water predictability helps the
producer decide whether to plant at a certain time
or another (Pucheta et al., 2011). Some approaches
based on artificial neural networks (ANN), including
machine learning, were introduced in related work
before. The rainfall forecast problem (Rodríguez
Rivero et al., 2011; Baumgartner et al., 2011) at
some geographical points of Córdoba, energy
demand forecasting (Chow et al., 1996) and the
guidance of seedling growth all require that the data
be mathematically modeled; in this paper the data are
taken from the Mackey-Glass benchmark equation
and from cumulative historical rainfall, whose forecast
is simulated by a Monte Carlo approach employing
ANNs and a genetic algorithm based on NSGA-II
(Baumgartner et al., 2010; Bishop, 2006). The main
contribution here is the design of a forecast system
that uses incomplete data sets for tuning its

parameters at the same time that the historical
recorded data is relatively short. The filter parameter
is set as a function of the roughness (or smoothness)
of the short time series. In addition, this
forecasting tool is intended to be used by agricultural
producers to maximize their profits, avoiding
losses caused by misjudging future
movements. A one-layered feed-forward
neural network, trained by the Levenberg-Marquardt
algorithm, and a GP algorithm tuned by
NSGA-II, are run for prediction. We evaluate the
accuracy of the forecasts attained with the
proposed method, reporting results for the La
Sevillana rainfall time series and solutions of the MG
equation.

2. CASE STUDIES

The proposed work is applied to forecast the next 15
values of meteorological variables such as
cumulative rainfall time series (Pucheta et al., 2009),
in this case from La Sevillana, Córdoba, Argentina,
and series from solutions of the MG equation. The
forecasting technique is computed using historical
data from 2004 to 2011 of the La Sevillana
establishment.

2.1 La Sevillana rainfall time series

The first series used is incomplete, containing 79
cumulative monthly rainfall values, with 5 months
lacking information; this results in a
non-deterministic series whose behavior is hard to
predict because the seasonality is not well determined
due to the few data. For the sake of making a fair
prediction, empty values were replaced by using a
method consisting of averaging the prior and
posterior values in order to complete the missing ones.
To build the forecasting model for each of the
considered values, the available information includes
the monthly historical rainfall.
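The gap-filling step described above can be sketched as follows; `fill_gaps` is a hypothetical helper name (the paper does not name its routine), and missing months are represented as `None`:

```python
def fill_gaps(series):
    """Fill missing values (None) by averaging the nearest known
    prior and posterior values, as described in the text."""
    filled = list(series)
    for i, v in enumerate(filled):
        if v is None:
            # nearest known value before the gap
            prev = next((filled[j] for j in range(i - 1, -1, -1)
                         if filled[j] is not None), None)
            # nearest known value after the gap
            post = next((filled[j] for j in range(i + 1, len(filled))
                         if filled[j] is not None), None)
            if prev is not None and post is not None:
                filled[i] = (prev + post) / 2
            else:
                # gap at a boundary: fall back to the single neighbor
                filled[i] = prev if prev is not None else post
    return filled
```

For example, `fill_gaps([10, None, 20])` replaces the missing month with 15.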

2.2 Mackey-Glass time series

The second series is obtained from the solution of
the MG equation. This equation serves to model
natural phenomena and has been used in earlier work
to compare different forecasting methods
(Espinoza Contreras, 2004). Here, one of the proposed
algorithms predicts values of a time series taken from
the solution of the MG equation (Glass and Mackey,
1988), which is described by the time-delay
differential equation defined as

dy(t)/dt = α y(t-τ) / (1 + y(t-τ)^c) - β y(t)        (1)


where α = 0.2, β = 0.85 and c = 10 are parameters and
τ = 100 is the delay time. As τ increases,
the solution turns from periodic to chaotic. Thereby,
a time series with random-like behavior is obtained,
and its long-term behavior changes thoroughly when
the initial conditions are changed, which is used to
obtain the stochastic dependence of the deterministic
time series according to its roughness.
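As an illustration, a solution of equation (1) can be approximated with a simple Euler scheme over a delay buffer. This is only a sketch; the parameter values follow the text as given (note that the classic chaotic setting of the equation uses a decay coefficient of 0.1), and the step size and initial history are assumptions:

```python
def mackey_glass(n, alpha=0.2, beta=0.85, c=10, tau=100, dt=1.0, y0=1.2):
    """Integrate dy/dt = alpha*y(t-tau)/(1 + y(t-tau)**c) - beta*y(t)
    with a simple Euler scheme (a sketch only)."""
    lag = int(tau / dt)          # delay expressed in integration steps
    y = [y0] * (lag + 1)         # constant history over [-tau, 0]
    for _ in range(n):
        y_tau = y[-lag - 1]      # delayed state y(t - tau)
        dy = alpha * y_tau / (1.0 + y_tau ** c) - beta * y[-1]
        y.append(y[-1] + dt * dy)
    return y[lag:]               # drop the artificial history
```

Changing `y0` changes the whole trajectory, which is how different realizations of the series can be generated from the same deterministic equation.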
In this work the Hurst parameter is used in the
learning process to modify on-line the number of
patterns, the number of iterations, and the number of
filter inputs of the ANN. H gives an
idea of the roughness of a signal (Dieker, 2004), and the
time series are considered as traces of an fBm
depending on the so-called Hurst parameter 0 < H < 1.
The benchmark chosen for one solution is called
MG05 in the forecasting. By contrast, the GP filter
does not rely on this hypothesis; instead, it
modifies the covariance functions in order to achieve
the best optimization of the population results.
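A rough estimate of H can be obtained, for example, from the scaling of the variance of the k-lagged increments of the series (for an fBm trace, Var[x(t+k) - x(t)] ~ k^(2H)). The paper does not specify which estimator it uses, so the following is only one possible sketch:

```python
import math

def hurst_estimate(x, max_lag=20):
    """Estimate the Hurst exponent from the scaling of the standard
    deviation of k-lagged increments: sd(k) ~ k**H for an fBm trace."""
    lags = range(2, max_lag)
    # standard deviation of the k-lagged increments for each lag k
    tau = [math.sqrt(sum((x[i + k] - x[i]) ** 2 for i in range(len(x) - k))
                     / (len(x) - k)) for k in lags]
    # slope of log(sd) versus log(lag) estimates H
    lx = [math.log(k) for k in lags]
    ly = [math.log(t) for t in tau]
    mx, my = sum(lx) / len(lx), sum(ly) / len(ly)
    return (sum((a - mx) * (b - my) for a, b in zip(lx, ly))
            / sum((a - mx) ** 2 for a in lx))
```

On a plain random walk this returns a value near 0.5; smaller values indicate a rougher, anti-persistent series.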

3. PROBLEM FORMULATION

The main issue when forecasting a time series is how
to retrieve the maximum amount of information from the
available data. In this case, the lack of data in the
dataset is taken into account in order to predict one
step ahead with both the ANN-based and GP-based
filters. It is proposed to fill these empty values by
using prior and posterior data.

Fig. 1. Cumulative monthly rainfall of La Sevillana,
measured in mm of H2O.

Four datasets are built following Fig. 1. In the first
one, the missing data are completed by taking the same
ensemble of data from the past year; the second one
uses the same ensemble from the next year; the third
one is completed with zeros; and the last one is filled
in by averaging the prior and posterior years. The same
analogy is used to construct the MG05 dataset from
solutions of equation (1). The prediction system is
implemented using either an ANN model or a GP adaptive
filter. For the GP filter, depending on its covariance
function, a certain number of hyperparameters are
available for tuning. Moreover, the number of inputs of
the GP filter needs to be determined, keeping in mind
that the given time series might have long- or short-
term dependences. In contrast, the coefficients of the
ANN filter are adjusted on-line in the learning
process, by considering a criterion that modifies, at
each pass over the time series, the number of patterns,
Year    Jan  Feb  Mar  Apr  May  Jun  Jul  Aug  Sep  Oct  Nov  Dec  Annual
2004                                                67  163  119
2005    119  142  242   47    0    0   15   52    2   57   80   77     833
2006    101  139  110   72    0    0    0    3    0   58  140  150     773
2007     45   52  348   30   10    0    0    0   24    6   50   32     597
2008    163  247   74    x    x    x    x    x   20   52   70   40     666
2009     42   75  259    0    0   22   10   21   25   37  267  202     960
2010    137  128  177   20   61    0    0    8   40   64   35  103     773
2011    243   59   63   65                                             430
Avg.    121  120  182   39   14    4    5   17   19   49  115  103     788
(x: missing value)

the number of iterations and the length of the tapped-
delay line, as a function of the Hurst value (H)
calculated from the time series, according to the
stochastic behavior of the series.
In this work, kernels and ANN are used. The present
value of the time series is used as the desired
response for the adaptive filter, and the past values of
the signal serve as its input. The adaptive filter
output is then the one-step prediction signal. The
block diagram of the nonlinear prediction scheme based
on an NN filter is shown. Here, a prediction device is
designed such that, starting from a given sequence
{x_n} at time n corresponding to a time series, the
best prediction {x_e} for the following sequence of 18
values can be obtained. Hence, a predictor filter is
proposed with an input vector l_x, which is obtained by
applying the delay operator Z^(-1) to the sequence
{x_n}. The filter output then generates x_e as the next
value, which should be equal to the present value
x_n. So, the
prediction error at time k can be evaluated as
e(k) = x_n(k) - x_e(k)        (2)

which is used for the learning rule to adjust the NN
weights.
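The prediction scheme and the error signal of equation (2) can be illustrated with a tapped-delay-line adaptive filter. Here a linear filter trained by LMS stands in for the NN filter of the text (the actual system uses Levenberg-Marquardt training of an ANN), so this example only shows the structure of the one-step predictor:

```python
def one_step_lms(x, taps=4, mu=0.05, epochs=50):
    """One-step-ahead prediction with a tapped-delay-line adaptive
    filter trained by LMS -- a linear stand-in for the NN filter,
    illustrating the error signal e(k) = x_n(k) - x_e(k)."""
    w = [0.0] * taps
    for _ in range(epochs):
        for k in range(taps, len(x)):
            lx = x[k - taps:k]                          # delayed input vector
            xe = sum(wi * xi for wi, xi in zip(w, lx))  # prediction x_e(k)
            e = x[k] - xe                               # prediction error e(k)
            w = [wi + mu * e * xi for wi, xi in zip(w, lx)]
    return w
```

On a perfectly predictable signal such as a sinusoid, the trained weights reproduce the next sample almost exactly; on rough series the residual error remains and drives the weight updates.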

3. PROPOSED APPROACH

3.1 Implementation of ANN

Some results were previously obtained with the ANN
approach, detailed in (Pucheta et al., 2009), and with
kernels (Baumgartner et al., 2010). These results were
promising and deserve to be improved with more
sophisticated filters (Haykin, 1999; Zhang et al.,
1998). The NN used is of the time-lagged feed-forward
network type. The NN topology consists of l_x inputs,
one hidden layer of H_o neurons, and one output
neuron. The learning rule used in the learning process
is based on the Levenberg-Marquardt method
(Bishop, 1995). However, depending on whether the time
series is smooth or rough, the tuning algorithm may
change in order to fit the time series. So, the
learning rule modifies the number of patterns and the
number of iterations at each time stage according to
the Hurst parameter H, which gives the short- or
long-term dependence of the sequence {x_n}. From a
practical standpoint, it gives the roughness of the
time series.
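The heuristic law itself is not spelled out in the text, so the following is a purely hypothetical sketch of how H could drive the number of training patterns and iterations (here, rough anti-persistent series with H < 0.5 get more of both; the base values are also assumptions):

```python
def tuning_from_hurst(h, base_patterns=20, base_iters=100):
    """Hypothetical heuristic: scale the training effort by the
    roughness of the series.  The exact law used by the authors
    is not given in the text."""
    scale = 1.0 + (0.5 - h)      # > 1 for rough series (H < 0.5)
    patterns = max(4, int(round(base_patterns * scale)))
    iters = max(10, int(round(base_iters * scale)))
    return patterns, iters
```

A series with H = 0.2 would then train on more patterns and iterations than a smooth one with H = 0.8.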

3.2 Implementation of kernel Filters

Kernel filters are evaluated by NSGA-II and
consist of a certain set of time lags and a covariance
function. Thus, the training inputs of the GP have to
be constructed from the given data according to the
time lags of the individual before tuning the filter
model.
In the case of a GP, the model is tuned by varying
the hyperparameters (Rasmussen and Williams, 2006) of
the covariance function. Depending on the covariance
function, there are several hyperparameters that need
to be adjusted to fit the training data; in other words,
one is interested in finding a maximum of the log
marginal likelihood. Without going into detail, this
framework is used to optimize the
hyperparameters. Once they are found, the training
process is finished.
To evaluate a GP model one has to calculate the
covariance matrix K and its inverse K^(-1). For n given
training points, K has size (n x n). Its entries are the
pairwise covariances of the training inputs, which
makes K a symmetric matrix. Supposing that the
variables have a joint Gaussian distribution with zero
mean, the mean prediction f* for an unknown input is
given by

f* = K(X*, X) K(X, X)^(-1) f        (3)

where X* is the unknown input, X are the training
inputs and f are the training outputs. If the mean of
the data is not zero, the data can be transformed
straightforwardly to fit these conditions.
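Equation (3) can be sketched directly in code. The squared-exponential covariance and the jitter term are assumptions (the text does not fix a covariance function here), and a production implementation would use a Cholesky factorisation with tuned hyperparameters rather than plain Gaussian elimination:

```python
import math

def rbf(a, b, length=1.0):
    """Squared-exponential covariance between two scalar inputs."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def gp_mean(x_train, f_train, x_star, length=1.0, noise=1e-6):
    """Zero-mean GP prediction f* = K(X*,X) K(X,X)^-1 f (a sketch)."""
    n = len(x_train)
    # K(X,X) with a small jitter on the diagonal for numerical stability
    K = [[rbf(x_train[i], x_train[j], length) + (noise if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    # solve K alpha = f by Gaussian elimination with partial pivoting
    A = [K[i][:] + [f_train[i]] for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        for r in range(col + 1, n):
            m = A[r][col] / A[col][col]
            for c in range(col, n + 1):
                A[r][c] -= m * A[col][c]
    alpha = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(A[r][c] * alpha[c] for c in range(r + 1, n))
        alpha[r] = (A[r][n] - s) / A[r][r]
    # mean prediction: K(X*, X) dot alpha
    return sum(rbf(x_star, x_train[j], length) * alpha[j] for j in range(n))
```

With a near-zero noise term the GP interpolates the training points, which matches the exact-interpolation property mentioned in the conclusions.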

3.3 Performance measure for forecasting

In order to test the proposed design for forecasting
short time series with uncertainties in their data, an
experiment with time series obtained from the MG
solution and from monthly cumulative rainfall was
performed. The performance of both filters, the ANN
and the kernel, is evaluated using the Symmetric Mean
Absolute Percent Error (SMAPE), widely adopted in
forecast evaluation, defined by

SMAPE_S = (1/n) * sum_{t=1..n} [ |X_t - F_t| / ((X_t + F_t)/2) ] * 100        (4)

where t is the observation time, n is the size of the
test set, s is each time series, and X_t and F_t are the
actual and forecasted time series values at time t,
respectively. The SMAPE of each series s calculates
the symmetric absolute error in percent between the
actual X_t and its corresponding forecast value F_t,
across all observations t of the test set of size n for
each time series s.
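Equation (4) translates directly into code; this sketch computes the SMAPE of a single series s:

```python
def smape(actual, forecast):
    """Symmetric Mean Absolute Percent Error, eq. (4):
    (100/n) * sum |X_t - F_t| / ((X_t + F_t)/2)."""
    n = len(actual)
    return (100.0 / n) * sum(abs(x - f) / ((x + f) / 2.0)
                             for x, f in zip(actual, forecast))
```

A perfect forecast gives 0; forecasting 50 against an actual value of 100 gives about 66.7, since the error is measured against the average of the two.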

4. MAIN RESULTS

4.1 Prediction Results of the methodology proposed

Each experiment is composed using the La Sevillana
rainfall series and MG05. There are three classes of
data sets. The first is the original time series,
comprising 79 values, used by both algorithms to
produce the forecast. The others are obtained by
filling the missing values of the original time series
by averaging, and are used to judge whether the
forecast is acceptable: the last 15 values can be used
to validate the performance of the prediction system,
so that 64 values form the training data set and 79
values constitute the Forecasted and the Real series.
A comparison is made between both filters.




Fig. 2. Forecast of the La Sevillana rainfall series using the kernel filter.

Fig. 3. Forecast of MG05 using the kernel filter.

Fig. 4. ANN H-independent algorithm for MG05.

Fig. 5. ANN H-independent algorithm for the forecasted portion of MG05.

Fig. 6. ANN H-independent algorithm for the La Sevillana rainfall series.

Fig. 7. The forecasted portion of the La Sevillana rainfall series using the ANN.

4.2 Comparative Results

The performance of the ANN and kernel algorithms
for forecasting short time series with uncertainties in
their data is evaluated through the SMAPE index of
equation (4), shown in Table 1 and Table 2 for MG05
and the La Sevillana rainfall series.




Table 1. Figures obtained by the kernel algorithm

Series    Real mean    Mean forecasted    SMAPE
Fig. 2       2.77            2.85         96.43
Fig. 3      71.03          101.68        110.25

Table 2. Figures obtained by the ANN algorithm

Series      H        He      Real mean    Mean forecasted    SMAPE
Fig. 4    0.2126    0.118       2.77            2.76         81.26
Fig. 6    0.2856    0.277      71.03           93.68         98.25

The comparison between both approaches indicates
good performance of the ANN against the kernel filter
when forecasting short time series with uncertainty in
the dataset. In addition, kernel filters prove more
accurate than the ANN when predicting time series such
as MG05. The results of the SMAPE obtained by the
proposed filters are shown in Fig. 2 to Fig. 7. Only
the real and forecasted means are considered here,
using the dataset in which the average of the
interpolations of the prior and posterior data was
used. It can be noted that both filters show similar
performance for predicting short time series, but the
ANN is closer to the real mean than the kernel when
the rainfall series is used.

5. DISCUSSION

The assessment of the obtained results resides in the
lack of data for making the prediction. Once the
dataset is linearly interpolated, the ANN and kernel
filters exhibit different behaviors on MG05 and the
rainfall series. In the two analyzed cases, 15 future
values were generated from 79 present values by each
filter. The same initial parameters were used for each
algorithm; the coefficients and the structure of the
filter are tuned by considering the stochastic
dependency of the short time series. Note that the
forecast improvement does not hold over any given time
series, since it results from using a short-term
stochastic characteristic to generate a deterministic
result, such as a prediction. So, other techniques
could be applied to interpolate when there is a lack
of data in short time series.

6. CONCLUSIONS

In this work, neural-network-based algorithms for
forecasting short time series with uncertainties in
their data have been presented. One of them uses an
ANN algorithm based on a heuristic law to tune its
parameters. The other gives exact interpolation
due to the nature of the kernel NN and uses a genetic
algorithm to determine the filter parameters. The
learning rule proposed to adjust the NN weights is
based on the Levenberg-Marquardt method.
Furthermore, as a function of the short-term stochastic
dependence of the time series, an on-line heuristic
adaptive law was set to update the ANN topology in
order to forecast the next 15 values, taking into
account that the forecasted series was originally
interpolated by completing and averaging with prior
and posterior data, following the proposed linear
method to fill the missing data in the real series.
The main result shows good performance of the
ANN predictor system against GPs applied to short
time series forecasting when the observations are
taken from a single point, due to the similar roughness
of the original and forecasted time series, evaluated
by H and He respectively. These results encourage us
to continue working with this new learning algorithm,
applying it to other NN models, to generate future
scenarios for decision-making for agriculture
producers.

6.1 Acknowledgments

This work was supported by Universidad Nacional de
Córdoba (UNC), FONCYT-PDFT PRH N°3 (UNC
Program RRHH03), SECYT-UNC, National
University of Catamarca, the Institute of Automatics
(INAUT) of the National University of San Juan, and
the National Agency for Scientific and Technological
Promotion (ANPCyT).

REFERENCES

Pucheta, J., C. Rodríguez Rivero, M. Herrera, C.
    Salas, D. Patiño and B. Kuchen. A Feed-forward
    Neural Networks-Based Nonlinear Autoregressive
    Model for Forecasting Time Series. Computación y
    Sistemas, Centro de Investigación en
    Computación-IPN, México D.F., México, Vol. 14
    No. 4, pp. 423-435, ISSN 1405-5546, 2011.
    http://www.cic.ipn.mx/sitioCIC/images/revista/vol14-4/art07.pdf
Rodríguez Rivero, C., J. Pucheta, J. Baumgartner, M.
    Herrera, C. Salas and V. Sauchelli. Modelado
    bayesiano de un filtro autorregresivo no lineal
    basado en redes neuronales para el pronóstico de
    series temporales de lluvia acumulada mensual.
    XXIII Congreso Nacional del Agua, Conagua 2011,
    22 al 25 de Junio de 2011, Resistencia, Chaco,
    Argentina, ISSN 1853-7685, pp. 149-163, 2011.
    http://www.conagua2011.com.ar/dsite/actas/Hidrologia/Hidrologia2.pdf
Baumgartner, J., C. Rodríguez Rivero and J. Pucheta.
    Pronóstico de lluvia en un punto desde diversos
    puntos geográficos de observación mediante
    Procesos Gaussianos. XXIII Congreso Nacional del
    Agua, Conagua 2011, 22 al 25 de Junio de 2011,
    Resistencia, Chaco, Argentina, ISSN 1853-7685,
    2011.
    http://www.conagua2011.com.ar/dsite/actas/Hidrometeorologia/176.pdf
Rodríguez Rivero, C., J. Pucheta, J. Baumgartner,
    H. D. Patiño and B. Kuchen. An Approach for Time
    Series Forecasting by Simulating Stochastic
    Processes Through Time-Lagged Feed-forward Neural
    Networks. The 2010 World Congress in Computer
    Science, Computer Engineering, and Applied
    Computing, Las Vegas, Nevada, USA, July 12-15,
    2010. DMIN'10 Proceedings, ISBN 1-60132-138-4,
    CSREA Press, p. 278 (CD ISBN 1-60132-131-7),
    USA, 2010.
Chow, T. W. S. and C. T. Leung. Neural network based
    short-term load forecasting using weather
    compensation. IEEE Transactions on Power Systems,
    Vol. 11, No. 4, pp. 1736-1742, Nov 1996.
Baumgartner, J., C. Rodríguez Rivero and J. Pucheta.
    A Genetic Algorithm Based Design Approach for the
    Properties of a Gaussian Process for Time Series
    Forecasting. AADECA 2010, XXII Congreso Argentino
    de Control Automático, Buenos Aires, Argentina,
    2010.
Bishop, C. Pattern Recognition and Machine Learning.
    Springer, Boston, 2006.
Pucheta, J., D. Patiño and B. Kuchen. A Statistically
    Dependent Approach for the Monthly Rainfall
    Forecast from One Point Observations. In: Computer
    and Computing Technologies in Agriculture II,
    Volume 2, IFIP Advances in Information and
    Communication Technology, Vol. 294, eds. D. Li and
    Z. Chunjiang, Springer, Boston, ISBN
    978-1-4419-0210-8, pp. 787-798, 2009.
    Doi: 10.1007/978-1-4419-0211-5_1.
    Url: http://dx.doi.org/10.1007/978-1-4419-0211-5_1
Espinoza Contreras, A. E. El caos y la caracterización
    de series de tiempo a través de técnicas de la
    dinámica no lineal. Universidad Autónoma de
    México, Campus Aragón, 2004.
Glass, L. and M. C. Mackey. From Clocks to Chaos: The
    Rhythms of Life. Princeton University Press,
    Princeton, NJ, 1988.
Dieker, T. Simulation of fractional Brownian motion.
    MSc thesis, University of Twente, Amsterdam, The
    Netherlands, 2004.
Haykin, S. Neural Networks: A Comprehensive
    Foundation. 2nd Edition, Prentice Hall, 1999.
Zhang, G., B. E. Patuwo and M. Y. Hu. Forecasting
    with artificial neural networks: The state of the
    art. International Journal of Forecasting, Vol. 14,
    pp. 35-62, 1998.
Bishop, C. Neural Networks for Pattern Recognition.
    Oxford University Press, Oxford, 1995.
Rasmussen, C. E. and C. K. I. Williams. Gaussian
    Processes for Machine Learning. The MIT Press,
    2006.