Analytics Magazine

Integrating data mining and forecasting

March/April 2013

The Dow Chemical approach to leveraging time-series data and demand sensing.

By Tim Rey and Chip Wells

Big data means different things to different people. In the context of forecasting, the savvy decision-maker needs to find ways to derive value from big data. Data mining for forecasting offers the opportunity to turn the numerous sources of time series data, both internal and external, now readily available to the business decision-maker into actionable strategies that can directly impact profitability. Deciding what to make, when to make it and for whom is a complex process. Understanding which factors drive demand, and how these factors (e.g., raw materials, logistics and labor) interact with production processes or demand and change over time, is key to deriving value in this context.

The Dow Chemical Company was interested in developing an approach for demand sensing that would provide:

Cost reduction

  • reduction in resource expenses for data collection and presentation
  • consistent automated source of data for leading indicator trends

Agility in the Market

  • shifting to external and future looks from internal history
  • broader dissemination of key leading indicator data
  • better timing on market trends: faster price responses and better resource planning (by reducing allocation/force majeure/share loss on the upside and reducing inventory carrying costs and asset costs on the downside)

Improved accuracy

  • accuracy of timing and estimates for forecast models
  • understanding leading indicator relationships

Figure 1: Levels of hierarchy at Dow Chemical.

Dow (and its Advanced Analytics team) was keenly interested in better forecasting models for volume (demand), net sales, standard margin, inventory costs, asset utilization and EBIT (earnings before interest and taxes). This was to be done for all businesses and all geographies. Similar to many large corporations, Dow has a complex business/product hierarchy. This hierarchy starts at the top, total Dow, then moves down through divisions, business groups, global business units, value centers, performance centers, etc. As is the case in most large corporations, this hierarchy is always changing and is overlaid with geography. Even lower levels of the hierarchy exist when specific products are considered.

Dow operates in the vast majority of the 16 global market segments defined in the ISIC (International Standard Industrial Classification) market segment structure, including: agriculture, hunting and forestry; mining and quarrying; manufacturing; electricity, gas and water supply; construction; wholesale and retail trade; hotels and restaurants; transport, storage and communications; and health and social work. This includes commodities, differentiated commodities and specialty products, which makes the mix even more complex. The value chains Dow is involved in are very deep and complex, and often connect the earliest stages of hydrocarbon extraction and production all the way to the consumer on the street.

Figure 2: Dow’s value chains are deep and complex.

Before embarking on the project, the team considered a few “industrial” and economic issues. First, simply multiplying out the number of models, we saw that we would have around 7,000 exogenous variable models to build, so we focused on the top global business unit by area combinations in each division, restricting our initial effort to covering 80 percent of net sales. Next, we realized that the target variables of interest (volume, asset utilization, net sales, standard margin, inventory costs and EBIT) are generally related to one another. Thus, volume is a function of volume “drivers” (Vx), represented by f(Vx); asset utilization (AU) is a function of volume and AU “drivers” f(AUx); inventory (INV) is a function of volume and inventory “drivers” f(INVx); net sales is driven by volume, various costs (xcosts) and net sales “drivers” f(NSx); standard margin is driven by net sales and standard margin “drivers” f(SMx); and finally EBIT is driven by standard margin and EBIT “drivers” f(EBITx).
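The chain of relationships described above can be written compactly. This is a notational sketch only: the time index t and the separate function symbols for each target are assumptions added for readability, since the text writes each relationship simply as f(·).

```latex
\begin{align*}
V_t &= f_{V}\left(Vx_t\right) \\
AU_t &= f_{AU}\left(V_t,\ AUx_t\right) \\
INV_t &= f_{INV}\left(V_t,\ INVx_t\right) \\
NS_t &= f_{NS}\left(V_t,\ xcosts_t,\ NSx_t\right) \\
SM_t &= f_{SM}\left(NS_t,\ SMx_t\right) \\
EBIT_t &= f_{EBIT}\left(SM_t,\ EBITx_t\right)
\end{align*}
```

Volume feeds asset utilization, inventory and net sales directly, so a volume forecast has to exist before any of the downstream targets can be forecast.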

The problem, if done at only one level of the hierarchy, fits a multivariate-in-Y approach that could be solved using a VARMAX (vector autoregressive moving average with exogenous variables) system. The complexity here is that we needed to solve the problem across the hierarchy shown above. We proposed that we could mimic the VARMAX structure by building the models in a “daisy chain” fashion, as shown in Figure 3. As a baseline, we compared a traditional VARMAX approach to the daisy chain approach at the total Dow level. We also built a traditional univariate model, as well as a traditional ARIMAX model, for each Y. The “Reconciled” column in Table 1 is the daisy chain approach used in the hierarchy (implemented via SAS Forecast Studio) and then reconciled up. Given the results in Table 1, we were confident we could use the daisy chain approach across the hierarchy and get a benefit similar to that of the VARMAX approach. All of the above was accomplished with various SAS forecasting platforms.
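The daisy chain idea can be illustrated with a toy example. This is a minimal sketch and not the SAS Forecast Studio implementation: a one-predictor least-squares line stands in for each ARIMAX model, only three of the six targets are chained, and every series and name below (`driver`, `fit_line`, `predict` and the numbers themselves) is hypothetical.

```python
from statistics import mean


def fit_line(x, y):
    """Least-squares fit of y = a + b*x; returns the pair (a, b)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def predict(coef, x):
    """Apply a fitted line to a list of input values."""
    a, b = coef
    return [a + b * xi for xi in x]


# Made-up history: one volume driver and three related targets.
driver = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
volume = [2.0 * d + 10.0 for d in driver]
net_sales = [3.0 * v + 5.0 for v in volume]
std_margin = [0.4 * ns - 2.0 for ns in net_sales]

# Fit each link of the chain on history.
vol_model = fit_line(driver, volume)        # volume from its driver
ns_model = fit_line(volume, net_sales)      # net sales from volume
sm_model = fit_line(net_sales, std_margin)  # standard margin from net sales

# Forecast: each stage consumes the forecast of the stage before it.
future_driver = [7.0, 8.0]
vol_fc = predict(vol_model, future_driver)
ns_fc = predict(ns_model, vol_fc)
sm_fc = predict(sm_model, ns_fc)
```

The point of the sketch is the ordering: the volume forecast becomes an input to the net sales model, whose forecast in turn feeds the standard margin model, which approximates the cross-variable links that a VARMAX system would estimate jointly.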

Figure 3: Target variables of interest are generally related to one another.

Table 1: SAS Forecast Studio screen shot.


We followed the data mining for forecasting process described in “Applied Data Mining for Forecasting Using SAS” (Rey, Kordon and Wells, 2012), specifically Chapter 2 and then Chapter 7, which cover exogenous variable identification and then variable reduction and selection for forecasting. This led to dozens of mind-mapping sessions in which the businesses proposed sets of “drivers” for the numerous global business unit (GBU) and value center (VC) by geographic area combinations. The result was thousands (more than 15,000 in this case!) of potential exogenous variables of interest for the 7,000 models in the hierarchy. This is truly a big data, large-scale forecasting problem. A lot of automation was necessary, both for setting up the initial research projects and for automatically building the initial univariate and daisy chain models.
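With thousands of candidate drivers per model, some automated screening step is needed before any model is fit. The sketch below shows one simple possibility, assuming a lagged-correlation screen: keep a candidate only if some lead of it correlates strongly with the target. It is not the procedure from the book or from SAS, and the function names, threshold and toy series are all hypothetical.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def best_lead(target, candidate, max_lag):
    """Return (lag, corr) for the lag at which candidate[t - lag] best tracks target[t]."""
    best = (0, 0.0)
    for lag in range(max_lag + 1):
        x = candidate[: len(candidate) - lag]  # candidate, shifted back by `lag`
        y = target[lag:]                       # target aligned to the shifted candidate
        r = pearson(x, y)
        if abs(r) > abs(best[1]):
            best = (lag, r)
    return best


def screen(target, candidates, max_lag=3, min_abs_corr=0.7):
    """Keep candidates whose best lagged correlation reaches the threshold."""
    kept = {}
    for name, series in candidates.items():
        lag, r = best_lead(target, series, max_lag)
        if abs(r) >= min_abs_corr:
            kept[name] = (lag, r)
    return kept


# Toy data: the target follows `leading_indicator` with a two-period delay.
leading_indicator = [1, 3, 2, 5, 4, 6, 8, 7, 9, 10]
unrelated = [5, 1, 4, 2, 5, 1, 4, 2, 5, 1]
target = [0, 1] + [2 * v for v in leading_indicator[:-2]]

kept = screen(target, {"leading_indicator": leading_indicator, "unrelated": unrelated})
```

Here `kept` retains only the leading indicator, together with the lag at which it leads the target. A production screen would also need to handle trend, seasonality and spurious correlation, which this sketch ignores.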

Lastly, concerning visualization, the business can gain access to these forecasts in a corporate-wide business intelligence delivery system where they can see the history, model, forecast, confidence limits and drivers.

Big data mandates big judgment. Big judgment has to have short “ask-to-answer” cycles. These opportunities call for data mining for forecasting approaches, which rely on special techniques for variable reduction and selection on time series data.

Tim Rey is director of Advanced Analytics at The Dow Chemical Company. Fenton (Chip) Wells is a statistical services specialist in SAS Education at SAS. They are co-authors of the book “Applied Data Mining for Forecasting Using SAS.”
