Analytics Magazine

Integrating data mining and forecasting

March/April 2013

The Dow Chemical approach to leveraging time-series data and demand sensing.

By Tim Rey and Chip Wells

Big data means different things to different people. In the context of forecasting, the savvy decision-maker needs to find ways to derive value from big data. Data mining for forecasting offers the opportunity to turn the numerous sources of time series data, both internal and external, now readily available to the business decision-maker into actionable strategies that can directly impact profitability. Deciding what to make, when to make it and for whom is a complex process. Understanding what factors drive demand, and how these factors (e.g., raw materials, logistics and labor) interact with production processes or demand and change over time, is key to deriving value in this context.

The Dow Chemical Company was interested in developing an approach for demand sensing that would provide:

Cost reduction

  • reduction in resource expenses for data collection and presentation
  • consistent automated source of data for leading indicator trends

Agility in the Market

  • shifting to external and future looks from internal history
  • broader dissemination of key leading indicator data
  • better timing on market trends … faster price responses, better resource planning (by reducing allocation/force majeure/share loss on the up side and reducing inventory carrying costs and asset costs on the down side)

Improved accuracy

  • accuracy of timing and estimates for forecast models
  • understanding leading indicator relationships

Figure 1: Levels of hierarchy at Dow Chemical.

Dow (and its Advanced Analytics team) was keenly interested in better forecasting models for volume (demand), net sales, standard margin, inventory costs, asset utilization and EBIT (earnings before interest and taxes). This was to be done for all businesses and all geographies. Similar to many large corporations, Dow has a complex business/product hierarchy. This hierarchy starts at the top, total Dow, then moves down through divisions, business groups, global business units, value centers, performance centers, etc. As is the case in most large corporations, this hierarchy is always changing and is overlaid with geography. Even lower levels of the hierarchy exist when specific products are considered.

Dow operates in the vast majority of the 16 global market segments as defined in the ISIC (International Standard Industrial Classification) market segment structure, including: agriculture, hunting and forestry; mining and quarrying; manufacturing; electricity, gas and water supply; construction; wholesale and retail trade; hotels and restaurants; transport, storage and communications; and health and social work. Dow's portfolio includes commodities, differentiated commodities and specialty products, which makes the mix even more complex. The value chains Dow is involved in are very deep and complex, and often connect the earliest stages of hydrocarbons extraction and production all the way to the consumer on the street.

Figure 2: Dow’s value chains are deep and complex.

Before embarking on the project, the team weighed a few “industrial” and economic considerations. First, simply multiplying out the number of models, we saw that we would have around 7,000 exogenous variable models to build, so we focused on the top global business unit by area combinations in each division, restricting our initial effort to covering 80 percent of net sales. Next, we realized that the target variables of interest (volume, asset utilization, net sales, standard margin, inventory costs and EBIT) are generally related to one another. Thus, volume is a function of volume “drivers” (Vx), represented by f(Vx); asset utilization (AU) is a function of volume and AU “drivers” f(AUx); inventory is a function of volume and inventory (INV) “drivers” f(INVx); net sales is driven by volume, various costs (xcosts) and net sales “drivers” f(NSx); standard margin is driven by net sales and standard margin “drivers” f(SMx); and finally EBIT is driven by standard margin and EBIT “drivers” f(EBITx).
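
The dependencies among the target variables form a directed graph, so one way to decide the order in which the models must be built is a topological sort. The following is a minimal sketch, not the team's implementation: the dictionary keys are illustrative names for the six targets, and only target-to-target links are encoded (the exogenous drivers and costs are left out).

```python
from graphlib import TopologicalSorter

# Each target maps to the upstream targets whose values feed its model,
# as described in the text. The names are illustrative labels, and the
# exogenous "drivers" and costs are intentionally omitted.
UPSTREAM = {
    "volume": [],
    "asset_utilization": ["volume"],
    "inventory": ["volume"],
    "net_sales": ["volume"],
    "standard_margin": ["net_sales"],
    "ebit": ["standard_margin"],
}


def build_order(upstream):
    """Return the targets in an order where every input is modeled first."""
    return list(TopologicalSorter(upstream).static_order())


if __name__ == "__main__":
    print(build_order(UPSTREAM))
```

Any order returned this way starts with volume and places EBIT after standard margin, which in turn follows net sales.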

If tackled at only one level of the hierarchy, the problem fits a multivariate-in-Y approach that could be solved using a VARMAX (vector autoregressive moving average with exogenous variables) system. The complexity here is that we needed to solve the problem across the hierarchy shown in Figure 1. We proposed that we could mimic the VARMAX structure by building the models in the “daisy chain” fashion shown in Figure 3. As a baseline, we compared a traditional VARMAX approach to the daisy chain approach at the total Dow level. We also built a traditional univariate model, as well as a traditional ARIMAX (autoregressive integrated moving average with exogenous variables) model, for each Y. The “Reconciled” column in Table 1 shows the daisy chain approach used in the hierarchy (implemented via SAS Forecast Studio) and then reconciled up. Given the results in Table 1, we were confident we could use the daisy chain approach across the hierarchy and get benefits similar to those of the VARMAX approach. All of the above was accomplished with various SAS forecasting platforms.
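
To make the daisy chain idea concrete, here is a heavily simplified sketch in which each link is a one-regressor least-squares line and the forecast of one target becomes the input to the next. This is an assumption-laden illustration only: the models in the project were time series models built in SAS, and the function names and the small data set below are made up.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x (one regressor)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def daisy_chain_forecast(history, chain, driver_future):
    """Forecast each target from the forecast of the series it depends on.

    history: dict of series name -> list of past values (equal lengths)
    chain: ordered (target, source) pairs; the first source is an exogenous
           driver whose future values are supplied by the caller
    driver_future: assumed future values of that first driver
    """
    forecasts = {chain[0][1]: list(driver_future)}
    for target, source in chain:
        intercept, slope = fit_line(history[source], history[target])
        # The upstream forecast, not the upstream history, feeds this link.
        forecasts[target] = [intercept + slope * v for v in forecasts[source]]
    return forecasts


# Made-up data: volume = 10 + 2 * driver, net_sales = 3 * volume.
history = {
    "driver": [1.0, 2.0, 3.0, 4.0],
    "volume": [12.0, 14.0, 16.0, 18.0],
    "net_sales": [36.0, 42.0, 48.0, 54.0],
}
chain = [("volume", "driver"), ("net_sales", "volume")]
result = daisy_chain_forecast(history, chain, driver_future=[5.0, 6.0])
```

The point of the sketch is the data flow: a separate model is fit for each Y, and errors or signals in the volume forecast propagate down the chain, which is how the approach approximates the joint structure of a multivariate system.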

Figure 3: Target variables of interest are generally related to one another.

Table 1: SAS Forecast Studio screen shot.

We followed the data mining for forecasting process described in “Applied Data Mining for Forecasting Using SAS” (Rey, Kordon and Wells, 2012), specifically Chapters 2 and 7, which cover exogenous variable identification and then variable reduction and selection for forecasting. This process led us to conduct dozens of mind mapping sessions in which the businesses proposed various sets of “drivers” for the numerous global business unit (GBU) and value center (VC) by geographic area combinations. The result was thousands (more than 15,000 in this case!) of potential exogenous variables of interest for the 7,000 models in the hierarchy. This is truly a big data, large-scale forecasting problem. Considerable automation was necessary, first to set up the initial research projects and then to build the initial univariate and daisy chain models automatically.
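
With thousands of candidate drivers per model, some automated screening step is needed before model building. The sketch below shows one common screening idea, lagged cross-correlation, and is not necessarily the method used in the project or the book: each candidate is shifted by 0 to `max_lag` periods, and it is kept only if its best absolute correlation with the target clears a threshold. The function names, the threshold and the data are all illustrative assumptions.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def best_lead(target, candidate, max_lag):
    """Return (lag, r) for the lag at which candidate[t - lag] best tracks target[t]."""
    best = (0, 0.0)
    for lag in range(max_lag + 1):
        x = candidate[: len(candidate) - lag]  # candidate, led by `lag` periods
        y = target[lag:]
        r = pearson(x, y)
        if abs(r) > abs(best[1]):
            best = (lag, r)
    return best


def screen(target, candidates, max_lag=3, min_abs_corr=0.8):
    """Keep candidates whose best lagged correlation clears the threshold."""
    kept = {}
    for name, series in candidates.items():
        lag, r = best_lead(target, series, max_lag)
        if abs(r) >= min_abs_corr:
            kept[name] = (lag, r)
    return kept


# Made-up data: the target follows "leader" with a two-period delay.
leader = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0, 8.0, 7.0, 9.0, 10.0]
noise = [5.0, 1.0, 4.0, 2.0, 5.0, 1.0, 4.0, 2.0, 5.0, 1.0]
target = [0.0, 1.0] + [2.0 * v for v in leader[:-2]]
kept = screen(target, {"leader": leader, "noise": noise})
```

In this toy case only `leader` survives, with a lead of two periods. A real screening pass would also need to handle trends and seasonality (for example by differencing first), since two trending series can correlate strongly without one driving the other.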

Lastly, concerning visualization, the business can gain access to these forecasts in a corporate-wide business intelligence delivery system where they can see the history, model, forecast, confidence limits and drivers.

Big data mandates big judgment. Big judgment has to have short “ask-to-answer” cycles. These opportunities call for the use of data mining for forecasting approaches that lead to using special techniques for variable reduction and selection on time series data.

Tim Rey is director of Advanced Analytics at The Dow Chemical Company. Fenton (Chip) Wells is a statistical services specialist in SAS Education at SAS. They are co-authors of the book “Applied Data Mining for Forecasting Using SAS.”
