Complete each section. When you are ready, save your file as a PDF document and submit it here:
https://classroom.udacity.com/nanodegrees/nd008/parts/edd0e8e8-158f-4044-9468-3e08fd08cbf8/project
Answer the following questions to help you plan out your analysis:
1. Does the dataset meet the criteria of a time series dataset? Make sure to explore all four
key characteristics of time series data.
The data is a time series because it has:
- a continuous time interval
- sequential measurements across that interval
- equal spacing between every two consecutive measurements
- at most one data point for each time unit within the interval
Reviewer note (Awesome): Excellent!
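The four characteristics listed above can also be checked programmatically. The following is a minimal sketch, assuming the observations are monthly and identified by month-start dates; the function names and the example dates are hypothetical and are not taken from the project data.

```python
from datetime import date

def month_index(d):
    """Map a date to a running month count so gaps are easy to detect."""
    return d.year * 12 + (d.month - 1)

def check_monthly_series(dates):
    """Evaluate the four time series characteristics for a list of monthly dates."""
    idx = [month_index(d) for d in dates]
    steps = [b - a for a, b in zip(idx, idx[1:])]
    return {
        # Sequential: every observation comes strictly after the previous one.
        "sequential": all(s > 0 for s in steps),
        # Equal spacing: consecutive observations are exactly one month apart.
        "equal_spacing": all(s == 1 for s in steps),
        # At most one data point per time unit: no month appears twice.
        "one_point_per_period": len(set(idx)) == len(idx),
        # Continuous interval: no month is missing between first and last.
        "continuous": len(set(idx)) == (max(idx) - min(idx) + 1),
    }

# Hypothetical example: 24 consecutive months starting in 2012-01.
dates = [date(2012 + m // 12, m % 12 + 1, 1) for m in range(24)]
report = check_monthly_series(dates)
print(report)
```

If any entry in the report is false, the series would need cleaning (for example, filling a missing month or aggregating duplicates) before modelling.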
2. Which records should be used as the holdout sample?
Because the manager wants a forecast of monthly sales data, it is better to keep the
most recent 12 months of data as the holdout sample.
Reviewer note (Required): Please note that we are forecasting for the next 4 periods; therefore,
we need 4 months as a holdout sample. Please make sure that these four months are the last four
months of the data set we have, 2013-06 to 2013-09.
Step 2: Determine Trend, Seasonal, and Error components
Graph the data set and decompose the time series into its three main components: trend,
seasonality, and error. (250 word limit)
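As a rough illustration of what such a decomposition computes, here is a minimal sketch of a classical decomposition with a centered moving average. It is not the procedure Alteryx uses internally; the function names and the synthetic series are assumptions made for this example only.

```python
import math

def centered_ma(y, period):
    """Centered moving average (2 x period for even periods); None at the edges."""
    n, half = len(y), period // 2
    trend = [None] * n
    for t in range(half, n - half):
        window = y[t - half:t + half + 1]
        if period % 2 == 0:
            # Half weight on both end points keeps the average centered.
            total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        else:
            total = sum(window)
        trend[t] = total / period
    return trend

def decompose(y, period=12, multiplicative=False):
    """Split y into trend, seasonal, and remainder components."""
    trend = centered_ma(y, period)
    # Remove the trend: divide for a multiplicative model, subtract otherwise.
    detrended = [None if tr is None else (v / tr if multiplicative else v - tr)
                 for v, tr in zip(y, trend)]
    # Average the detrended values for each position in the seasonal cycle.
    means = []
    for k in range(period):
        vals = [d for i, d in enumerate(detrended)
                if d is not None and i % period == k]
        means.append(sum(vals) / len(vals))
    # Normalise so the indices average to 1 (multiplicative) or 0 (additive).
    centre = sum(means) / period
    index = [m / centre if multiplicative else m - centre for m in means]
    seasonal = [index[i % period] for i in range(len(y))]
    remainder = [None if tr is None else
                 (v / (tr * s) if multiplicative else v - tr - s)
                 for v, tr, s in zip(y, trend, seasonal)]
    return trend, seasonal, remainder

# Synthetic monthly series (hypothetical): linear trend plus a fixed seasonal wave.
series = [100 + 2 * t + 10 * math.sin(2 * math.pi * t / 12) for t in range(48)]
trend, seasonal, remainder = decompose(series, period=12)
```

Running the same function with `multiplicative=True` and comparing the remainders is one way to judge whether the seasonal swings stay constant or grow with the level of the series.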
1. What are the trend, seasonality, and error of the time series? Show how you were able
to determine the components using time series plots. Include the graphs.
The trend of the bookings is linear. The seasonality increases year after year, but the change
in magnitude is not big; therefore, it is more additive than multiplicative. The error term
displays a multiplicative pattern, as the peaks increase quite fast.
Reviewer note (Required): Please note that the seasonal portion shows that the regularly
occurring spike in sales each year changes in magnitude, even if only slightly, rather than
being constant. In Alteryx, we will need to hover our mouse over the seasonal graph in Interface
mode to be able to see that the seasonal numbers are slightly increasing. This is important because:
- Having seasonality suggests that any ARIMA models used for analysis will need seasonal differencing.
- The change in magnitude suggests that any ETS models will use a multiplicative method in the
seasonal component.
Step 3: Build your Models
Analyze your graphs and determine the appropriate measurements to apply to your ARIMA and
ETS models and describe the errors for both models. (500 word limit)
The results below come from running the model. The data has a minimum value of 51,000 and a
mean of more than 276,000; compared with error values of about 30,000, the model is acceptable.
However, 30,000 is still a very large number, so the model needs to be rechecked by comparing it
with another model. The AIC and BIC values are also quite small, but we still need to compare
them with another model to get a more credible result. Besides, in the graph of the forecast for
the most recent 12 months of sales, the result follows the actual sales pattern quite well, but
the forecast amounts are all smaller than the actual values.
Reviewer note (Awesome): The error and trend terms are correct - well done!
Reviewer note (Required): Since the interpretation of the decomposition plot above is not
accurate, as I mentioned above, the terms for the ETS model are not accurately identified.
Specifically, the method for seasonality is not correct.
2. What are the model terms for ARIMA? Explain why you chose those terms. Graph the
Auto-Correlation Function (ACF) and Partial Autocorrelation Function Plots (PACF) for
the time series and seasonal component and use these graphs to justify choosing your
model terms.
a. Describe the in-sample errors. Use at least RMSE and MASE when examining results
b. Regraph ACF and PACF for both the Time Series and Seasonal Difference and
include these graphs in your answer
Reviewer note: In the order listed. These plots need to be used to support your choice of the
terms of the ARIMA model. Then, after establishing the correct ARIMA model, we should regraph
ACF and PACF for both the Time Series and Seasonal Difference and include these graphs in our
answer. The ACF and PACF results for the correct ARIMA model should show no significantly
correlated lags, suggesting no need for adding additional AR() or MA() terms.
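The idea of differencing a series and then checking for significantly correlated lags can be sketched as follows. This is only an illustration under stated assumptions: the series is synthetic, the helper names are hypothetical, the significance bound is the usual approximate 95% band of 1.96 / sqrt(n), and only the ACF is computed (the PACF is omitted to keep the example short).

```python
import math
import random

def difference(y, lag=1):
    """Return y[t] - y[t - lag]; lag=12 gives a seasonal difference for monthly data."""
    return [y[t] - y[t - lag] for t in range(lag, len(y))]

def acf(y, max_lag):
    """Sample autocorrelation for lags 0..max_lag."""
    n = len(y)
    mean = sum(y) / n
    c0 = sum((v - mean) ** 2 for v in y)
    return [sum((y[t] - mean) * (y[t - k] - mean) for t in range(k, n)) / c0
            for k in range(max_lag + 1)]

def significant_lags(y, max_lag):
    """Lags whose autocorrelation falls outside the approximate 95% band."""
    bound = 1.96 / math.sqrt(len(y))
    r = acf(y, max_lag)
    return [k for k in range(1, max_lag + 1) if abs(r[k]) > bound]

# Synthetic monthly series (hypothetical): trend + seasonality + noise.
rng = random.Random(0)
series = [100 + 5 * t + 40 * math.sin(2 * math.pi * t / 12) + rng.gauss(0, 5)
          for t in range(72)]

seasonal_diff = difference(series, lag=12)    # removes the yearly pattern
stationary = difference(seasonal_diff, lag=1) # removes the remaining trend

print("raw series:", significant_lags(series, 12))
print("after seasonal difference:", significant_lags(seasonal_diff, 12))
print("after seasonal + first difference:", significant_lags(stationary, 12))
```

Lags that remain outside the band after differencing are the ones that AR or MA terms would be chosen to absorb; the same check applied to the residuals of a fitted model should ideally return an empty list.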
The result from using ARIMA to forecast the holdout sample seems very promising, as
the predicted values closely follow the actual ones.
Step 4: Forecast
Compare the in-sample error measurements of both models and compare error measurements
for the holdout sample in your forecast. Choose the best fitting model and forecast the next four
periods. (250 word limit)
1. Which model did you choose? Justify your answer by showing: in-sample error
measurements and forecast error measurements against the holdout sample.
The two tables below show the forecast errors from the ETS model (the first table) and the
ARIMA model (the second one). The previous part showed that, on the training data, it is
difficult to determine which model performs better: ETS does well on RMSE, while ARIMA does
well on AIC and BIC. In the holdout sample, however, the situation is different: ARIMA
outperforms ETS on all measures. Therefore, the good performance of ETS might be caused by
overfitting, and we should choose the ARIMA model to forecast sales.
Reviewer note (Awesome): ARIMA is the best performing model - well done!
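A holdout comparison of this kind can be sketched as follows. The series and the two sets of forecasts are made-up numbers standing in for model output, not the project's ETS or ARIMA results, and the MASE shown here is scaled by the in-sample seasonal naive error, which is one common convention and an assumption of this example.

```python
import math

def rmse(actual, forecast):
    """Root mean squared error."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))

def mase(actual, forecast, train, period=12):
    """Mean absolute error scaled by the in-sample seasonal naive error."""
    mae = sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)
    naive_errors = [abs(train[t] - train[t - period]) for t in range(period, len(train))]
    return mae / (sum(naive_errors) / len(naive_errors))

# Hypothetical monthly series: linear growth with a spike in the last month of each year.
series = [100 + 2 * t + (10 if t % 12 == 11 else 0) for t in range(40)]

# Hold out the last four periods, matching the four-period forecast horizon.
train, holdout = series[:-4], series[-4:]

# Hypothetical forecasts from two candidate models for the holdout periods.
forecasts = {
    "model_a": [170, 175, 175, 180],
    "model_b": [160, 170, 180, 190],
}

scores = {name: {"RMSE": rmse(holdout, f), "MASE": mase(holdout, f, train)}
          for name, f in forecasts.items()}
best = min(scores, key=lambda name: scores[name]["RMSE"])

for name, s in scores.items():
    print(name, round(s["RMSE"], 3), round(s["MASE"], 3))
print("lowest holdout RMSE:", best)
```

A MASE below 1 means the forecast beats the seasonal naive benchmark on average; comparing both measures on the holdout rather than on the training data is what guards against picking an overfitted model.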
Please check your answers against the requirements of the project dictated by the rubric here.
Reviewers will use this rubric to grade your project.