Time Series Prediction

The first step usually taken in time series prediction is to filter the data. Most data is somewhat noisy, or it contains individual data items that are accurate but not useful for making predictions. For example, if the Amazon.com website is down for an entire day, or even an hour, then sales will be zero during that period. But how useful is that information in predicting sales once the website comes back up? Probably not very useful at all. Another example would be a stock whose closing prices over five consecutive days are 21.00, 22.00, 230.00, 24.00, and 25.00. The third day's number, 230.00, is clearly out of place in this time series and probably represents a data-entry error (someone typed an extra zero). Again, some filtering is needed here. So some filtering involves cleaning bad data, while other filtering involves removing unusual, though accurate, data.

The next step, which is especially useful with sales data, is to remove seasonal variations. Sales of retail products, for example, are usually higher in December than in November, even in a slow year. One way to take care of this is to average data over the past year: instead of looking at just the data for Nov 2000, for example, we would average the data from Dec 1999 to Nov 2000. We would do the same thing every month, so when we predicted performance for Dec 2000, we would actually be predicting the average monthly sales for the period from Jan 2000 to Dec 2000. Once we've forecast that average, and if we already know the sales through November, it is easy mathematically to come up with a forecast for December itself.

A more sophisticated approach involves noticing that many time series contain a number of cyclical patterns. There may be long-term cycles as well as short-term cycles, and if we can ferret out which cycles are in play, we can go a long way toward predicting future values. As you may recall, a cyclical variable follows a pattern called a sine wave, and the Fourier transform is the standard tool for decomposing a series into a set of such sine waves.
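The filtering and deseasonalizing steps above can be sketched in a few lines. This is a minimal illustration under my own assumptions, not a standard recipe: the function names and the 3x median-absolute-deviation outlier threshold are invented for the example, and real filtering would be tuned to the data at hand.

```python
from statistics import mean, median

def filter_outliers(values, threshold=3.0):
    """Replace points far from the series median with the mean of their
    neighbours. The 3x median-absolute-deviation cutoff is an assumption."""
    med = median(values)
    mad = median(abs(v - med) for v in values) or 1.0
    cleaned = list(values)
    for i, v in enumerate(values):
        if abs(v - med) > threshold * mad:
            neighbours = [values[j] for j in (i - 1, i + 1)
                          if 0 <= j < len(values)]
            cleaned[i] = mean(neighbours)
    return cleaned

def trailing_average(monthly, window=12):
    """Smooth out seasonality by averaging each month with the
    preceding window-1 months (a trailing twelve-month average)."""
    return [mean(monthly[i - window + 1:i + 1])
            for i in range(window - 1, len(monthly))]

def december_from_average(forecast_avg, jan_to_nov):
    """Back out December: twelve times the forecast monthly average,
    minus the eleven months of sales we already know."""
    return 12 * forecast_avg - sum(jan_to_nov)

# The stock example from the text: 230.00 is a likely data-entry error.
closes = [21.00, 22.00, 230.00, 24.00, 25.00]
print(filter_outliers(closes))  # [21.0, 22.0, 23.0, 24.0, 25.0]

# If every month Jan-Nov sold 100 units and we forecast a monthly
# average of 105 for the year, December must come in at 160.
print(december_from_average(105, [100] * 11))  # 160
```

The December back-out is just arithmetic, as the text says: the forecast annual total is twelve times the forecast monthly average, and subtracting the eleven known months leaves December.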
However, even after we've reduced the data to a set of sine waves, we may find that this is not sufficient to model the existing data, and hence is not likely to be effective for forecasting. Another approach, called ARMA, may be used in addition to, or instead of, the Fourier transform. ARMA stands for "autoregressive moving average". The basic idea behind ARMA is that the value of the variable we are trying to forecast is a weighted average of its values at a number of previous time points, plus a weighted average of the errors (or shocks) of the forecast at each of those previous time points. This average won't be perfectly accurate, of course, but each error, or shock, can then be fed into future weighted averages, which produces more accurate forecasts down the road.

How do we determine the precise weights to use in these averages? Standard statistical regression techniques can be used (that's why it's called "autoregressive"). Alternatively, if more sophistication is needed, a neural network approach is often used. Indeed, the same potential problem exists for time series prediction as for neural networks: overfitting the data, so that the "forecast" would have worked well at all time points in the past but fails miserably in the future. As with neural nets, a certain degree of quality control is needed. More information about time series forecasting can be found in standard references on the subject.
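The autoregressive half of ARMA can be sketched concretely. This is a deliberately minimal illustration, not a full ARMA fit: it estimates a single lag weight (an AR(1) model) by least squares, omits the moving-average terms on past shocks, and the function names and sample sales figures are invented for the example.

```python
from statistics import mean

def fit_ar1(series):
    """Least-squares estimate of phi in x_t - mu = phi*(x_{t-1} - mu) + e_t.
    This is the 'autoregressive' part of ARMA; a full ARMA model would
    also regress on the past errors e_t (the moving-average part)."""
    mu = mean(series)
    x = [v - mu for v in series]
    num = sum(x[t] * x[t - 1] for t in range(1, len(x)))
    den = sum(x[t - 1] ** 2 for t in range(1, len(x)))
    return mu, num / den

def forecast_ar1(series, mu, phi, steps=3):
    """Iterate the fitted recurrence forward from the last observation.
    With |phi| < 1 the forecasts decay back toward the series mean."""
    out, last = [], series[-1]
    for _ in range(steps):
        last = mu + phi * (last - mu)
        out.append(last)
    return out

# Hypothetical monthly sales with a mild upward drift.
sales = [100, 104, 101, 106, 103, 108, 105, 110, 107, 112]
mu, phi = fit_ar1(sales)
print(forecast_ar1(sales, mu, phi))
```

Note that fitting the weights by regression on the series' own past is exactly why the method is called "autoregressive"; adding the weighted past shocks would turn this into the full ARMA model the text describes.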
Next Edition: Personalization

