Focus on Forecasting: Forecasting Software Survey
So what can you predict for me? How to find
the product that’s right for you.
By Jack Yurkiewicz
We can forecast this: forecasting will continue to be in the news. The U.S. government gives the current unemployment figures, and then we hear pundits and news sources predict what those figures will be next quarter and next year. We read about the Congressional Budget Office’s projections for the country’s deficits (see Figures 1 and 2) through 2019 while speakers and articles quote and then extol or deride the projections. The CBO’s opening sentence, “Since the Congressional Budget Office last issued its baseline projections … the outlook for the budget deficit has deteriorated further,” gives one pause. The projections made just a few months earlier for what will happen in a decade were wrong and had to be updated. In other words, as more data comes in, the forecasts may change.
Forecasting directory and product listings
For a directory of the forecasting software products and their capabilities referenced in this article, go to: http://lionhrtpub.com/orms/surveys/fss/fss-fr.html
Economic forecasts can be dicey, especially when the forecasts may themselves affect the future. With the health care reform package now law, will there be enough doctors to serve the larger insured base? Of lesser importance, how many movies will be filmed in 3D this year and next, and how many screens will be equipped to show these films in 2011-12? Everyone can add his or her own examples to the ones mentioned. What is common to many of these examples is that they fall into the forecasting category frequently called trend analysis.
Most management science, data analysis and operations management texts include at least one forecasting chapter. Many business schools’ curricula indicate that they cover some forecasting in at least one mandatory graduate or undergraduate quantitative course. The level of coverage varies. Generally, if computer software is not used in a text or course, the forecasting techniques discussed are those in which the math is relatively easy to demonstrate (e.g., simple exponential smoothing, moving averages, etc.). Trend analysis, which frequently involves regression models, may be limited to linear trend, and if the trend appears to be logarithmic, exponential or a simple S-curve, transformations are made so that linear regression can still be used. Even if the course or text does use the computer, if the accompanying software does not do the more complex forecasting procedures (e.g., Box-Jenkins methods), then these techniques are not covered or are barely touched on. This is changing. As more commercial statistical vendors make student versions available at prices that fit student budgets, these texts and courses can and should cover more powerful forecasting procedures. Some examples of full-featured statistical analysis programs having Box-Jenkins and exponential smoothing forecasting capabilities that are available in academic versions are SPSS, SAS OnDemand, Minitab, Statgraphics, NCSS and Systat. Generally, dedicated forecasting products (software that only does forecasting), such as AutoBox and Forecast Pro, do not offer student versions.
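To make the transformation idea concrete, here is a minimal Python sketch of fitting an exponential trend by regressing the logarithm of the data on time. The function names and the numbers are made up for illustration; they do not come from any product or data set in the survey.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

def fit_exponential_trend(ts, ys):
    """Fit y = c * exp(r*t) by regressing ln(y) on t (requires every y > 0)."""
    a, r = fit_line(ts, [math.log(y) for y in ys])
    return math.exp(a), r

# Made-up illustrative series that grows 10 percent per period
ts = list(range(1, 9))
ys = [100 * 1.1 ** t for t in ts]

c, r = fit_exponential_trend(ts, ys)
forecast_t9 = c * math.exp(r * 9)  # extrapolate one period past the data
```

One caveat with this shortcut: least squares on the log scale minimizes relative rather than absolute errors, so the result can differ from a direct nonlinear fit of the same curve.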
What To Look For In The Software
Forecasting software can fall into one of three categories. Automatic forecasting software will quickly do an analysis of the data and then make the forecasts using the methodology that it deems most appropriate for that particular data. The chosen technique may come from the software minimizing some statistic (AIC, BIC, RMSE, etc.). The software will give the optimal parameters of the model, confidence intervals for the forecasts, plots and various statistical summary measures. The user always has the option of bypassing the recommended or chosen methodology and specifying some other technique. The software then gives similar output for the prescribed procedure.
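The selection step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the candidate list (a naive forecast, a three-period moving average and simple exponential smoothing on a small grid of smoothing constants) is hypothetical, and the criterion is one-step-ahead RMSE only. Commercial products consider far richer model families and may use AIC or BIC, which also penalize the number of parameters.

```python
import math

def one_step_rmse(series, forecaster, warmup=3):
    """RMSE of one-step-ahead forecasts; forecaster(history) returns the next value."""
    errors = [series[t] - forecaster(series[:t])
              for t in range(warmup, len(series))]
    return math.sqrt(sum(e * e for e in errors) / len(errors))

def naive(history):
    """Forecast the next value as the last observed value."""
    return history[-1]

def moving_average(history, window=3):
    """Forecast the next value as the mean of the last `window` values."""
    return sum(history[-window:]) / window

def make_ses(alpha):
    """Simple exponential smoothing with a fixed smoothing constant."""
    def ses(history):
        level = history[0]
        for y in history[1:]:
            level = alpha * y + (1 - alpha) * level
        return level
    return ses

def auto_select(series):
    """Score every candidate and return the name with the smallest RMSE."""
    candidates = {"naive": naive, "moving_average_3": moving_average}
    for alpha in (0.1, 0.3, 0.5, 0.7, 0.9):
        candidates[f"ses_alpha_{alpha}"] = make_ses(alpha)
    scores = {name: one_step_rmse(series, f) for name, f in candidates.items()}
    best = min(scores, key=scores.get)
    return best, scores

# Made-up series with a steady linear trend
best, scores = auto_select([float(i) for i in range(1, 11)])
```

On this trending series every candidate lags the data, which hints at why the list of candidate models matters as much as the selection statistic.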
Semiautomatic forecasting software asks the user to specify a methodology from a list of available techniques. The software then finds the optimal parameters for the chosen model and produces forecasts, confidence intervals, statistical measures and plots. Finally, manual software requires the user to specify both the technique and the parameters for the model. For example, if the user’s time plot of the data shows seasonality, the user may ask the software to use Winters’ method. The user must also supply the three smoothing constants, and the software then produces the forecasts, plots, statistical measures, etc. Few commercial programs fall into this group because finding the optimal parameters of the model by hand can be a tedious trial-and-error process.
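As a rough picture of what “manual” means, the following Python sketch implements the additive form of Winters’ method with all three smoothing constants supplied by the caller. The initial level, trend and seasonal values use one simple convention chosen for this illustration; it is an assumption, not the rule used by any product in the survey.

```python
def winters_additive(series, period, alpha, beta, gamma, horizon):
    """Additive Winters' method with user-supplied smoothing constants."""
    if len(series) < 2 * period:
        raise ValueError("need at least two full seasons of data")

    # Assumed initial conditions (one of many possible conventions):
    # level = mean of season 1, trend = average per-period change between
    # the first two seasons, seasonal = season-1 deviations from the level.
    first = series[:period]
    second = series[period:2 * period]
    level = sum(first) / period
    trend = (sum(second) - sum(first)) / period ** 2
    seasonal = [y - level for y in first]

    # Update level, trend and seasonal terms from the second season onward.
    for t in range(period, len(series)):
        y = series[t]
        s_old = seasonal[t % period]
        prev_level = level
        level = alpha * (y - s_old) + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        seasonal[t % period] = gamma * (y - level) + (1 - gamma) * s_old

    # Forecast h periods ahead, reusing the latest seasonal terms.
    n = len(series)
    return [level + h * trend + seasonal[(n + h - 1) % period]
            for h in range(1, horizon + 1)]

# Made-up quarterly series: a repeating seasonal pattern with no trend
series = [10.0, 20.0, 30.0, 40.0] * 3
forecasts = winters_additive(series, period=4, alpha=0.2, beta=0.1,
                             gamma=0.3, horizon=4)
```

With real data the three constants interact, so trying values by hand quickly becomes the tedious process described above.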
The user must decide whether to use a dedicated forecasting program or a general statistical product that has the desired forecasting capabilities. Dedicated products are more likely to be automatic programs. They may also offer more sophisticated forecasting techniques (e.g., ARIMA intervention, multivariate ARIMA transfer functions, etc.) that general statistical programs may not have. Most general statistical products do not have an automatic forecasting mode, but a few now do. SPSS has an “expert modeler” that will find the best ARIMA or exponential smoothing model for the data. Statgraphics has an “automatic model selection” option, and its StatAdvisor explains why the software chose that particular model.
Working With the Software
Users report that dedicated forecasting and general statistical products frequently do well at forecasting time series data that has seasonality, with or without trend. Because many recent news topics involved trend analysis, for this survey I tried something similar but less crucial to the national scene. Every Sunday evening we hear or read which movie made the most money over the weekend. From www.boxofficemojo.com, a Web site that has accurate box office information for more than 10,000 films, I took the daily box office returns of “Alice in Wonderland,” released on March 5 by Buena Vista (Disney). My question was simple: Given that I had the daily, and thus cumulative, domestic gross of the film for its first 23 days of release (to March 27), how much money will “Alice” eventually gross? Figure 3 shows the time plot of the data. As of this writing (May 20), nearing the end of its domestic run, the movie has made approximately $332 million and is the 19th highest grossing film (domestic figures, not adjusted for inflation) of all time.
I put the data in an Excel spreadsheet. The first column, Time, had integers from 1 to 23. The second column, Alice, had the cumulative gross, in millions of dollars. Most programs can read or import a variety of formats, and all can import Excel worksheets. However, sometimes getting the forecasting software to recognize the Excel spreadsheet can be counterintuitive or even cumbersome. For example, Forecast Pro requires that the Excel spreadsheet have six initial rows before the data starts in row seven. Figure 4 shows what the Excel layout should look like before you can import it into Forecast Pro. The first row must have the variable name, the second row has a description of what the variable is, the third row indicates the year the data starts, the fourth gives the starting period (e.g., for Alice, March or the third month), the fifth has the number of periods per year, and the sixth gives the number of periods per cycle. Thus, it may not be obvious to a new or casual user how to get the forecasting software to recognize the spreadsheet without resorting to the dreaded procedure of reading the user’s guide. Other products import the spreadsheet but ask the user to supply similar information in a separate dialog box.
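The six-rows-then-data layout can be mocked up in Python as a single column. This sketch follows only the row order described above; the function names, the variable name “Demand” and every value are hypothetical placeholders, and the vendor’s documentation remains the authority on the actual import format.

```python
import csv
import io

def build_column(name, description, start_year, start_period,
                 periods_per_year, periods_per_cycle, values):
    """Six header rows in the order the article lists, then the data."""
    header = [name, description, start_year, start_period,
              periods_per_year, periods_per_cycle]
    return header + list(values)

def to_csv(column):
    """Write the column as one cell per row, ready to paste into a sheet."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for cell in column:
        writer.writerow([cell])
    return buffer.getvalue()

# Hypothetical monthly series starting in March 2010 (placeholder numbers)
column = build_column("Demand", "Monthly demand in units",
                      2010, 3, 12, 12, [10.0, 20.0, 30.0])
text = to_csv(column)
```

Generating the header rows from named arguments makes it harder to put the starting period and the periods per year in the wrong row.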
All the products I tried have this in common: clicking on “Help” gave excellent tutorials or explanations to guide users. With these products – even if you have an aversion to reading PDF manuals – I urge you to at least glance at them.
I strongly recommend that you check if a trial version of the product is available from the vendor’s Web site. All of the general statistical software firms offer a trial program of the complete product that works from 10 to 30 days. Unfortunately, not all of the dedicated forecasting software vendors make a trial version available. If a trial version is available, verify that it allows you to use your own data, and not just the “trial” data that comes with the trial software. While evaluating its capabilities, judge how easy learning the software is, and once you have mastered its nuances, how easy using it is.
Analyzing the Output
Most products give the usual output – forecasts, statistical measures, graphs, etc. – without any prompting. Many will give more or less output if the user specifies the details via dialog boxes. What did the automatic programs say about “Alice in Wonderland”? All the products assumed linear growth. Statgraphics gave Figure 5, Forecast Pro gave Figure 6 and SPSS gave Figure 7. All predicted that the film, by mid-May, would be the highest grossing film of all time. SPSS’ “Expert Modeler” said that it was 95 percent confident that “Alice” could end with a gross as high as $1.4 billion or a loss of as much as $700 million. What the “Alice in Wonderland” example indicates is that users should know what the program is geared to do if it has an automatic mode. Using the software as a “black box” could lead to reasonable or outrageous forecasts.
In a previous survey [http://viewer.zmags.com/publication/a52b897c#/a52b897c/44], I found that different products gave different forecasts for the same data using the same model. For this survey and using the most current versions of the software, I got similar results. That is, I used another data set that exhibited linear growth and monthly seasonality, and I always told the software to use Winters’ method. The various programs I tried gave different smoothing parameters and thus different forecasts. Why? The software did not use the same initial conditions to find the “optimal” smoothing constants for Winters’ method. Worse, very few of the products tell the user how they determined those initial conditions, or what they were. Thus, the heads-up given in the previous survey applies again.
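The effect of initial conditions is easy to reproduce on a small scale. The Python sketch below uses simple exponential smoothing rather than Winters’ method to stay short, and searches a grid for the smoothing constant that minimizes the sum of squared one-step errors under two different starting levels. The series is made up and deliberately contrived, with an atypical first value to make the effect obvious; the two initialization rules are assumptions for illustration, not those of any surveyed product.

```python
def ses_sse(series, alpha, initial_level):
    """Sum of squared one-step-ahead errors for simple exponential smoothing."""
    level = initial_level
    sse = 0.0
    for y in series:
        error = y - level
        sse += error * error
        level = level + alpha * error
    return sse

def best_alpha(series, initial_level, grid_size=99):
    """Grid search over alpha = 0.01, 0.02, ..., 0.99."""
    grid = [(i + 1) / (grid_size + 1) for i in range(grid_size)]
    return min(grid, key=lambda a: ses_sse(series, a, initial_level))

# Made-up series whose first observation is unusually high
series = [80, 50, 51, 49, 50, 52, 48, 50, 51, 49]

# Convention 1: start the level at the first observation
alpha_first = best_alpha(series, initial_level=series[0])

# Convention 2: start the level at the mean of the whole series
alpha_mean = best_alpha(series, initial_level=sum(series) / len(series))
```

Both runs use the same data and the same model, yet the “optimal” constant lands at opposite ends of the grid, which is why it helps when a product documents how it initializes the model.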
The author, in conjunction with Analytics and OR/MS Today magazines, recently conducted a new survey of forecasting products. The survey asked the vendor to check off the capabilities and features of the software and allowed him or her to include additional details not addressed by the questions. We tried to identify as many products as possible, using reader and vendor feedback, advertising, displays at professional conferences, information from previous surveys, etc. We e-mailed the vendors and asked them to respond to our online questionnaire, followed by some gentle nudging with subsequent phone calls. The goal was to be as comprehensive as possible in identifying and polling the vendors. To those who say, “They left out (my) product X!” please accept our apology. Let us know of the company and product [the questionnaire is available at www.lionhrtpub.com/ancill/fssurvey.shtml], and we will add it to the online directory and listing of products.
The purpose of the survey [see http://lionhrtpub.com/orms/surveys/FSS/fss-fr.html] is to inform the reader of what is available. The information comes from the manufacturers, and no effort was made to verify the submissions. My remarks about specific software should not be construed as an overall review or evaluation of that program, but rather as a few musings from one user working with one data set.
If you are interested in buying a new forecasting program, or want to try a different product from the one you have, you should first examine what techniques the software offers and compare those with your needs. I recommend that the software be at least semiautomatic. Get a trial version if you can. Finally, contact the vendor with your specific questions, and if the Web site does not mention a trial version, bring up the issue with the vendor directly. Users tell me that they found the vendors to be extremely helpful.
Notes and References
- To my anxious family, forecasts for my evil LDL levels through 2019 show good levels hovering around 75 mg/dL, with a peak of 83 in the summer of 2016 and then a slight decay to 77 by the end of 2019.
- An excellent reference text for some of these techniques is “Forecasting Principles and Applications,” by Stephan A. DeLurgio, McGraw-Hill, 1998.
- Postscript: Figure 3’s time plot of Alice looked like an S-curve to me. I made an Excel template for the Weibull function. With four parameters, this model gives more flexibility than the simple S-curve. Finding the four parameters that minimize the root mean square error was a nonlinear program for Solver. The template predicted Alice would gross $335 million, approximately $3 million more than the film actually made.
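The postscript’s approach can be sketched outside Excel as well. The Python below is a hedged illustration, not the author’s template: the four-parameter Weibull growth form is an assumed parameterization (the article does not give the formula), a simple derivative-free compass search stands in for Solver, and the data are synthetic values generated from the curve itself rather than the real box office figures.

```python
import math

def weibull_curve(t, upper, lower, rate, shape):
    """Assumed four-parameter Weibull growth form (not necessarily the author's)."""
    return upper - (upper - lower) * math.exp(-((rate * t) ** shape))

def rmse(params, ts, ys):
    """Root mean square error of the curve against the data."""
    upper, lower, rate, shape = params
    if rate <= 0 or shape <= 0:
        return math.inf  # keep the search inside the valid region
    try:
        residuals = [y - weibull_curve(t, upper, lower, rate, shape)
                     for t, y in zip(ts, ys)]
    except OverflowError:
        return math.inf
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))

def compass_search(objective, start, steps, shrink=0.5, tol=1e-6, max_iter=5000):
    """Try +/- one step on each parameter; shrink the steps when nothing improves."""
    best = list(start)
    best_value = objective(best)
    steps = list(steps)
    for _ in range(max_iter):
        improved = False
        for i in range(len(best)):
            for direction in (1, -1):
                trial = list(best)
                trial[i] += direction * steps[i]
                value = objective(trial)
                if value < best_value:
                    best, best_value, improved = trial, value, True
        if not improved:
            steps = [s * shrink for s in steps]
            if max(steps) < tol:
                break
    return best, best_value

# Synthetic 23-day cumulative series generated from made-up parameters
ts = list(range(1, 24))
ys = [weibull_curve(t, 300.0, 0.0, 0.1, 1.5) for t in ts]

start = [250.0, 0.0, 0.05, 1.0]
start_rmse = rmse(start, ts, ys)
fitted, fitted_rmse = compass_search(lambda p: rmse(p, ts, ys),
                                     start, steps=[20.0, 5.0, 0.01, 0.1])
eventual_total = fitted[0]  # the upper asymptote is the long-run forecast
```

In this form the upper asymptote plays the role of the eventual gross, which is what lets an S-curve level off where a linear trend keeps climbing. A local search like this can stall at a poor fit, so the starting values deserve some care.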
Jack Yurkiewicz (email@example.com) is a professor of management science in the MBA program at the Lubin School of Business, Pace University, New York. In addition to management science, he teaches data analysis, operations management and simulating financial models. His current interests include developing distance-learning courses for these topics and assessing their effectiveness.