Time Series Prediction

Time series prediction is a specialized, but very useful, area of artificial intelligence. The idea behind time series prediction is that we are given a series of data points representing the sequential values of something we want to predict, and we need to predict the next value. For example, we might be given the closing prices for a given stock over the last month or year, and be asked to predict tomorrow's closing price. Or we might be given tick data (the last 100 intraday moves in the stock price) and be asked to predict when, and to what price, the stock will next move. Or we might be given sales data for a given product for each month over the last few years, and want to predict this month's sales.

The first step usually taken in time series prediction is to filter the data. Most data is somewhat noisy, or contains individual data points that are accurate but not useful for making predictions. For example, if the Amazon.com website is down for an entire day, or hour, then sales will be zero during that period. But how useful is that information in predicting sales once the website comes back up? Probably not very useful at all. Another example would be a stock whose closing prices over five consecutive days are 21.00, 22.00, 230.00, 24.00 and 25.00. The third day's number, 230.00, is clearly out of place in this time series, and probably represents a data entry error (someone typed in an extra zero). Again, some filtering is needed here. So some filtering involves cleaning bad data, and other filtering involves removing unusual, though accurate, data.
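One simple way to sketch this kind of filtering in code is to flag any point that sits far from the median of its neighbors, as the 230.00 entry does above. This is only one of many possible approaches; the window size and threshold below are illustrative choices, not anything specified in the text.

```python
from statistics import median

def filter_outliers(series, window=2, threshold=3.0):
    """Replace points that deviate wildly from their local median.

    A point is treated as bad data if it lies more than `threshold`
    times the local median away from that median.
    """
    cleaned = []
    for i, value in enumerate(series):
        lo = max(0, i - window)
        hi = min(len(series), i + window + 1)
        neighbors = [series[j] for j in range(lo, hi) if j != i]
        m = median(neighbors)
        if m != 0 and abs(value - m) > threshold * abs(m):
            cleaned.append(m)  # substitute the local median for the bad point
        else:
            cleaned.append(value)
    return cleaned

# The five closing prices from the example; the 230.00 typo gets smoothed out.
prices = [21.00, 22.00, 230.00, 24.00, 25.00]
print(filter_outliers(prices))  # → [21.0, 22.0, 23.0, 24.0, 25.0]
```

In practice one would usually investigate a flagged point before overwriting it, since (as the text notes) some unusual values are accurate, just unrepresentative.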

The next step, which is especially useful with sales data, is to remove seasonal variations. Sales for retail products, for example, are usually higher in December than in November, even in a slow year. One way to take care of this is to average data over the past year, so that instead of looking at just the data for Nov 2000, for example, we would average the data from Dec 1999 through Nov 2000. We would do the same thing every month, so when we predicted performance for Dec 2000, we would actually be predicting the average monthly sales for the period from Jan 2000 through Dec 2000. Once we've forecast that average, and since we already know the sales through November, it is mathematically straightforward to back out a forecast for December.
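The trailing-average idea above can be sketched briefly. The sales figures here are made-up illustrative numbers; the point is the arithmetic of averaging away seasonality and then backing out the December figure from a forecast of the twelve-month average.

```python
def trailing_average(series, window=12):
    """Trailing moving average: entry i averages the `window` values ending at i."""
    return [sum(series[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(series))]

def implied_december(forecast_avg, jan_to_nov):
    """Given a forecast Jan-Dec monthly average and the known Jan-Nov sales,
    back out the December sales figure that the forecast implies."""
    return forecast_avg * 12 - sum(jan_to_nov)

# Eleven known months of (hypothetical) sales, plus a forecast that the
# Jan-Dec average will be 100 units per month:
jan_to_nov = [90, 85, 95, 100, 98, 92, 88, 95, 102, 105, 110]
print(implied_december(100.0, jan_to_nov))  # → 140.0 units for December
```

Since the eleven known months sum to 1060 and a 100-per-month average over twelve months implies 1200 in total, the December forecast falls out as 1200 − 1060 = 140.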

A more sophisticated approach starts by noticing that many time series contain a number of cyclical patterns. There may be long-term cycles as well as short-term cycles, and if we can ferret out which different cycles are in play, we can go a long way toward predicting future values. As you may recall, a cyclical variable traces out a pattern called a sine wave. Different sine waves may have different frequencies: a higher-frequency wave completes more of its up-and-down cycles in the same span of time.

The insight here is that most data contains a variety of cyclical patterns of varying frequencies. So most data can be forecast, at least partially, by representing it as a combination of sine waves of differing frequencies. A mathematical technique called the Fourier transform is very effective at taking a series of data and computing the different sine waves that combine to make up that data.
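As a small demonstration, here is a direct (slow) implementation of the discrete Fourier transform applied to a signal built from two sine waves. The transform's magnitudes peak at exactly the two frequencies that went into the signal. (In practice one would use a fast Fourier transform library rather than this textbook formula.)

```python
import cmath
import math

def dft_magnitudes(x):
    """Magnitude of each frequency bin of the discrete Fourier transform."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)))
            for k in range(n)]

n = 64
# A signal made of a slow cycle (3 cycles per window) plus a faster,
# smaller one (10 cycles per window).
signal = [math.sin(2 * math.pi * 3 * t / n)
          + 0.5 * math.sin(2 * math.pi * 10 * t / n)
          for t in range(n)]

mags = dft_magnitudes(signal)
# The two strongest bins in the first half of the spectrum recover the
# two component frequencies.
top = sorted(sorted(range(n // 2), key=lambda k: mags[k], reverse=True)[:2])
print(top)  # → [3, 10]
```

This is the sense in which the Fourier transform "computes the different sine waves" hidden in the data: each large magnitude marks a cyclical component at that frequency.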

However, even after we've reduced the data to a set of sine waves, we may find that this is not sufficient to model the existing data, and hence is not likely to be effective for forecasting. Another approach, called ARMA, may be used in addition to or instead of the Fourier transform. ARMA stands for "autoregressive moving average". The basic idea behind ARMA is that the value of the variable we are trying to forecast is a weighted average of the values at a number of previous time points, plus a weighted average of the errors (or shocks) of the forecast at each of those previous time points. This average won't be perfectly accurate, of course, but the error or shock can then be used in future weighted averages, and this will produce more accurate forecasts down the road. How do we determine the precise weightings to use in the averages? There are standard statistical regression techniques (that's why it's called "autoregressive") that can be used. Alternatively, if more sophistication is needed, a neural network approach is often used.
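To make the "autoregressive" half of this concrete, here is a minimal sketch of the simplest possible case: an AR(1) model, where the next value is forecast as a single weight times the previous value, and that weight is fit by ordinary least squares. Full ARMA fitting (multiple lags plus the moving-average terms on the shocks) follows the same regression idea but is considerably more involved; the series below is made up for illustration.

```python
def fit_ar1(series):
    """Least-squares estimate of phi in the model x[t] ≈ phi * x[t-1].

    This is the regression step that gives "autoregressive" its name:
    the series is regressed on its own previous values.
    """
    num = sum(series[t - 1] * series[t] for t in range(1, len(series)))
    den = sum(series[t - 1] ** 2 for t in range(1, len(series)))
    return num / den

def forecast_next(series, phi):
    """One-step-ahead AR(1) forecast: a weighted copy of the last value."""
    return phi * series[-1]

# An illustrative series that roughly halves at each step; the fitted
# weight recovers that pattern and projects it forward.
series = [100.0, 50.0, 25.0, 12.5, 6.25]
phi = fit_ar1(series)
print(phi, forecast_next(series, phi))  # → 0.5 3.125
```

The forecast errors at each step are exactly the "shocks" the text mentions; a full ARMA model would also feed a weighted average of those past errors into each new forecast.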

Indeed, the same potential problem exists for time series prediction as with neural networks: overfitting the data so that the "forecast" would have worked well at all time points in the past but fails miserably in the future. As with neural nets, a certain degree of quality control is needed.

More information about time series forecasting can be found in either of the following two books: Time Series Prediction and Neural, Novel & Hybrid Algorithms for Time Series Prediction.



All copyrights are maintained by respective contributors and may not be reused without permission. Graphics and scripts may not be directly linked to. Site assets copyright © 2000 RamaLila.com and respective authors.