Analytics Magazine

Integrating data mining and forecasting

March/April 2013

The Dow Chemical approach to leveraging time-series data and demand sensing.

By Tim Rey and Chip Wells

Big data means different things to different people. In forecasting, the savvy decision-maker needs ways to derive value from it. Data mining for forecasting makes it possible to turn the many sources of time series data, internal and external, now readily available to the business decision-maker into actionable strategies that directly affect profitability. Deciding what to make, when to make it and for whom is a complex process. Understanding which factors drive demand, how those factors (e.g., raw materials, logistics and labor) interact with production processes or demand, and how they change over time is key to deriving value in this context.

The Dow Chemical Company was interested in developing an approach for demand sensing that would provide:

Cost reduction

  • reduction in resource expenses for data collection and presentation
  • consistent automated source of data for leading indicator trends

Agility in the market

  • shifting from internal history to external and forward-looking views
  • broader dissemination of key leading indicator data
  • better timing on market trends: faster price responses and better resource planning (reducing allocation, force majeure and share loss on the upside, and inventory carrying costs and asset costs on the downside)

Improved accuracy

  • accuracy of timing and estimates for forecast models
  • understanding leading indicator relationships

Figure 1: Levels of hierarchy at Dow Chemical.

Dow (and its Advanced Analytics team) was keenly interested in better forecasting models for volume (demand), net sales, standard margin, inventory costs, asset utilization and EBIT (earnings before interest and taxes). This was to be done for all businesses and all geographies. Similar to many large corporations, Dow has a complex business/product hierarchy. This hierarchy starts at the top, total Dow, then moves down through divisions, business groups, global business units, value centers, performance centers, etc. As is the case in most large corporations, this hierarchy is always changing and is overlaid with geography. Even lower levels of the hierarchy exist when specific products are considered.

Dow operates in the vast majority of the 16 global market segments defined in the ISIC (International Standard Industrial Classification) market segment structure, including: agriculture, hunting and forestry; mining and quarrying; manufacturing; electricity, gas and water supply; construction; wholesale and retail trade; hotels and restaurants; transport, storage and communications; and health and social work. Its portfolio spans commodities, differentiated commodities and specialty products, which makes the mix even more complex. The value chains Dow is involved in are very deep and complex, and often connect the earliest stages of hydrocarbon extraction and production all the way to the consumer on the street.

Figure 2: Dow’s value chains are deep and complex.

Before embarking on the project, the team weighed a few “industrial” and economic considerations. First, simply multiplying out the number of models showed that we would have around 7,000 exogenous variable models to build, so we focused on the top global business unit (GBU) by area combinations in each division, restricting our initial effort to those covering 80 percent of net sales. Next, we recognized that the target variables of interest (volume, asset utilization, net sales, standard margin, inventory costs and EBIT) are generally related to one another. Thus, volume is a function of volume “drivers” (Vx), represented by f(Vx); asset utilization (AU) is a function of volume and AU “drivers” f(AUx); inventory is a function of volume and inventory (INV) “drivers” f(INVx); net sales is driven by volume, various costs (xcosts) and net sales “drivers” f(NSx); standard margin is driven by net sales and standard margin “drivers” f(SMx); and finally EBIT is driven by standard margin and EBIT “drivers” f(EBITx).
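Written compactly, the chain of relationships described above looks roughly like the following. This is only a notational sketch: the time index t and the subscripted function symbols are assumptions added for readability, while the driver sets (Vx, AUx, INVx, xcosts, NSx, SMx, EBITx) follow the text.

```latex
\begin{aligned}
V_t               &= f_{V}\left(\mathit{Vx}_t\right) \\
\mathrm{AU}_t     &= f_{AU}\left(V_t,\ \mathit{AUx}_t\right) \\
\mathrm{INV}_t    &= f_{INV}\left(V_t,\ \mathit{INVx}_t\right) \\
\mathrm{NS}_t     &= f_{NS}\left(V_t,\ \mathit{xcosts}_t,\ \mathit{NSx}_t\right) \\
\mathrm{SM}_t     &= f_{SM}\left(\mathrm{NS}_t,\ \mathit{SMx}_t\right) \\
\mathrm{EBIT}_t   &= f_{EBIT}\left(\mathrm{SM}_t,\ \mathit{EBITx}_t\right)
\end{aligned}
```

Each target appears as an input to the next equation down, which is what lets the models be built one after another rather than as a single joint system.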

The problem, if tackled at only one level of the hierarchy, fits a multivariate-in-Y approach that could be solved with a VARMAX (vector autoregressive moving average with exogenous variables) system. The complication is that we needed to solve the problem across the hierarchy shown in Figure 1. We proposed mimicking the VARMAX structure by building the models in the “daisy chain” fashion shown in Figure 3. As a baseline, we compared a traditional VARMAX approach with the daisy chain approach at the total Dow level. We also built a traditional univariate model and a traditional ARIMAX model for each Y. The “Reconciled” column in Table 1 is the daisy chain approach applied across the hierarchy (implemented via SAS Forecast Studio) and then reconciled up. Given the results in Table 1, we were confident we could use the daisy chain approach across the hierarchy and get benefits similar to those of the VARMAX approach. All of the above was accomplished with various SAS forecasting platforms.
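To make the daisy chain idea concrete, here is a minimal Python sketch of the mechanics: each target is fitted on its upstream target plus its own drivers, and forecasts are passed down the chain in the same order. It is not the SAS implementation described above. Ordinary least squares stands in for the ARIMAX-style models (so there are no autoregressive or moving-average terms), and the function names, coefficients and eight-period history are all hypothetical.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# (target, upstream target it depends on, its own driver columns), in chain order.
# Driver names mirror the article's notation; the data behind them is synthetic.
CHAIN: List[Tuple[str, Optional[str], List[str]]] = [
    ("volume", None, ["Vx"]),
    ("asset_utilization", "volume", ["AUx"]),
    ("inventory", "volume", ["INVx"]),
    ("net_sales", "volume", ["xcosts", "NSx"]),
    ("standard_margin", "net_sales", ["SMx"]),
    ("ebit", "standard_margin", ["EBITx"]),
]


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_ols(columns: Sequence[Sequence[float]], y: Sequence[float]) -> List[float]:
    """Least-squares coefficients [intercept, b1, b2, ...] via the normal equations."""
    rows = [[1.0] + [col[t] for col in columns] for t in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yt for r, yt in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def predict(coefs: Sequence[float], inputs: Sequence[float]) -> float:
    return coefs[0] + sum(b * x for b, x in zip(coefs[1:], inputs))


def fit_chain(history: Dict[str, List[float]]) -> Dict[str, List[float]]:
    """Fit one regression per target, using its upstream target as an extra input."""
    models = {}
    for target, upstream, drivers in CHAIN:
        names = ([upstream] if upstream else []) + drivers
        models[target] = fit_ols([history[n] for n in names], history[target])
    return models


def forecast_chain(models: Dict[str, List[float]],
                   future_drivers: Dict[str, float]) -> Dict[str, float]:
    """Forecast one period ahead, feeding each upstream forecast into the next link."""
    forecast: Dict[str, float] = {}
    for target, upstream, drivers in CHAIN:
        inputs = ([forecast[upstream]] if upstream else [])
        inputs += [future_drivers[d] for d in drivers]
        forecast[target] = predict(models[target], inputs)
    return forecast


# Synthetic eight-period history built from made-up linear rules (illustration only).
history: Dict[str, List[float]] = {
    "Vx": [1, 2, 3, 4, 5, 6, 7, 8],
    "AUx": [1, 0, 2, 1, 3, 0, 2, 1],
    "INVx": [2, 1, 0, 3, 1, 2, 0, 1],
    "xcosts": [1, 3, 2, 5, 4, 6, 8, 7],
    "NSx": [0, 1, 0, 2, 1, 3, 1, 0],
    "SMx": [1, 2, 1, 0, 2, 1, 3, 0],
    "EBITx": [3, 1, 2, 0, 1, 2, 0, 1],
}
TRUE_COEFS = {  # hypothetical [intercept, upstream/driver coefficients...]
    "volume": [2.0, 3.0],
    "asset_utilization": [1.0, 0.5, 2.0],
    "inventory": [4.0, 0.25, 1.0],
    "net_sales": [10.0, 4.0, -2.0, 3.0],
    "standard_margin": [-1.0, 0.5, 2.0],
    "ebit": [2.0, 0.5, -1.0],
}
for target, upstream, drivers in CHAIN:
    names = ([upstream] if upstream else []) + drivers
    history[target] = [predict(TRUE_COEFS[target], [history[n][t] for n in names])
                       for t in range(8)]

models = fit_chain(history)
next_period = forecast_chain(
    models,
    {"Vx": 9, "AUx": 2, "INVx": 1, "xcosts": 9, "NSx": 2, "SMx": 1, "EBITx": 1},
)
print(next_period)
```

Because the synthetic history is exactly linear, the fitted chain should reproduce the made-up rules almost exactly (volume near 29 and EBIT near 30 for the next period). With real data, forecast error in an upstream target propagates to everything downstream, which is one reason a comparison against a joint VARMAX baseline, as described above, is worthwhile.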

Figure 3: Target variables of interest are generally related to one another.

Table 1: SAS Forecast Studio screen shot.


We followed the data mining for forecasting process described in “Applied Data Mining for Forecasting Using SAS” (Rey, Kordon and Wells, 2012), specifically Chapters 2 and 7, which cover exogenous variable identification and then variable reduction and selection for forecasting. This led to dozens of mind-mapping sessions in which the businesses proposed sets of “drivers” for the numerous GBU and value center (VC) by geographic area combinations, and in turn to thousands (more than 15,000 in this case!) of potential exogenous variables for the 7,000 models in the hierarchy. This is truly a big data, large-scale forecasting problem. Considerable automation was necessary, first to set up the initial research projects and then to build the initial univariate and daisy chain models automatically.
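With thousands of candidate drivers, some automated screening step is needed before any model is fitted. One simple idea is to keep only candidates whose lagged values correlate strongly with the target, and to record the lead at which they do so. The Python sketch below illustrates that idea only; it is an assumption-laden stand-in, not the reduction and selection procedure from the book or the SAS tooling. The series names, the six-period maximum lag and the 0.7 threshold are all hypothetical.

```python
from math import sqrt
from typing import Dict, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 when either series has no variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def best_lead(target: Sequence[float], candidate: Sequence[float],
              max_lag: int) -> Tuple[int, float]:
    """Find the lead (in periods) at which the candidate best tracks the target.

    For each lag k in 0..max_lag, target[t] is correlated with candidate[t - k].
    Both series are assumed to have the same length and time alignment.
    """
    best = (0, 0.0)
    for lag in range(max_lag + 1):
        r = pearson(candidate[: len(candidate) - lag], target[lag:])
        if abs(r) > abs(best[1]):
            best = (lag, r)
    return best


def screen(target: Sequence[float], candidates: Dict[str, Sequence[float]],
           max_lag: int = 6, min_abs_corr: float = 0.7) -> Dict[str, Tuple[int, float]]:
    """Keep candidates whose best lagged correlation clears the threshold."""
    kept = {}
    for name, series in candidates.items():
        lag, r = best_lead(target, series, max_lag)
        if abs(r) >= min_abs_corr:
            kept[name] = (lag, r)
    return kept


# Synthetic data: demand follows the hypothetical leading index three periods later.
leading_index = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8,
                 9, 7, 9, 3, 2, 3, 8, 4, 6, 2, 6, 4]
demand = [7, 3, 9] + [2 * v + 1 for v in leading_index[:-3]]
flat_index = [100] * 24  # carries no information, so it should be dropped

kept = screen(demand, {"leading_index": leading_index, "flat_index": flat_index})
for name, (lag, r) in kept.items():
    print(f"{name}: leads demand by {lag} periods (r = {r:.3f})")
```

A screen like this looks at each candidate in isolation, so it says nothing about redundancy among the survivors; in practice it would be one step among several before model selection.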

Lastly, for visualization, the businesses can access these forecasts through a corporate-wide business intelligence delivery system, where they can see the history, model, forecast, confidence limits and drivers.

Big data mandates big judgment. Big judgment has to have short “ask-to-answer” cycles. These opportunities call for data mining for forecasting approaches that apply specialized techniques for variable reduction and selection to time series data.

Tim Rey is director of Advanced Analytics at The Dow Chemical Company. Fenton (Chip) Wells is a statistical services specialist in SAS Education at SAS. They are co-authors of the book “Applied Data Mining for Forecasting Using SAS.”
