Analytics Magazine

Analyze This! Many moving parts in analytics parade

March/April 2015

By Vijay Mehrotra

Sometime back in the last century, when I was a disgruntled graduate student, I managed to wangle a part-time job at a semiconductor fabrication facility. My job was to gather data, build simulation models and conduct analyses to help management understand capacity, bottlenecks and cycle times. Along the way I managed to write a few conference papers [1] with some of my colleagues at HP before returning to school full time to pursue a dissertation on queueing networks that was inspired by my work in the fab. I will be forever grateful to Dr. Barclay Tullis for making that opportunity possible for me.

And that’s pretty much the last time I thought very hard about computer chips. Like most of us, I basically assumed that they would keep getting faster (and cheaper) at a faster and faster rate, just as Gordon Moore [2] had long ago predicted.

The other day, a recent paper [3] by my colleague Matthew Dixon caught my eye. The paper considers parallel computing resources that might be located in any number of different places (public clouds, private clouds, remote desktops, etc.). Its thesis is that by using design patterns to structure high-level code in an analytics-friendly software language, data scientists can more effectively identify the computationally intensive steps during the design process. This understanding can in turn be used to organize one’s code to leverage the parallel computing resources without having to re-implement the code in a more efficient lower-level language.

The paper illustrates this general process through the use of a case study that involves estimation of option prices, showing how minor modifications to high-level code (in this case, Python) can enable the data scientist to utilize parallelization to radically speed up performance. The crux, as Dixon astutely points out, is the detailed knowledge that the data scientist has about his/her application, information that enables him or her to make smart modularized design choices. In this context, he refers to the data scientist as the Domain Expert for the application (more on this below).
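To make the idea concrete, here is a minimal sketch of my own, not code from Dixon's paper. It assumes a European call option priced by Monte Carlo simulation under Black-Scholes dynamics, which may or may not match the paper's case study. The function names (simulate_chunk, price_call, parallel_price) and all parameter values are hypothetical. The expensive step, simulating paths, is isolated behind a "map" over independent chunks, so moving from serial to parallel execution means swapping one argument rather than rewriting the model.

```python
import math
import random
from concurrent.futures import ProcessPoolExecutor


def simulate_chunk(task):
    """Sum the discounted call payoffs for one independent chunk of paths."""
    seed, n_paths, spot, strike, rate, vol, maturity = task
    rng = random.Random(seed)  # one generator per chunk, so chunks share no state
    drift = (rate - 0.5 * vol * vol) * maturity
    diffusion = vol * math.sqrt(maturity)
    discount = math.exp(-rate * maturity)
    total = 0.0
    for _ in range(n_paths):
        terminal = spot * math.exp(drift + diffusion * rng.gauss(0.0, 1.0))
        total += discount * max(terminal - strike, 0.0)
    return total


def price_call(spot, strike, rate, vol, maturity,
               n_chunks=8, paths_per_chunk=20_000, map_fn=map):
    """Monte Carlo estimate; map_fn is the only part that knows about hardware."""
    tasks = [(seed, paths_per_chunk, spot, strike, rate, vol, maturity)
             for seed in range(n_chunks)]
    totals = map_fn(simulate_chunk, tasks)  # the computationally intensive step
    return sum(totals) / (n_chunks * paths_per_chunk)


def parallel_price(**params):
    """Same model, with the chunks farmed out to worker processes."""
    with ProcessPoolExecutor() as pool:
        return price_call(map_fn=pool.map, **params)
```

Calling price_call(spot=100.0, strike=100.0, rate=0.05, vol=0.2, maturity=1.0) runs serially and should land near the closed-form Black-Scholes value of about 10.45. Calling parallel_price with the same arguments, from under an `if __name__ == "__main__":` guard, spreads the chunks across processes. The design choice worth noticing is that the domain knowledge (paths are independent, so chunks can be mapped) lives in the structure of the code, while the choice of computing resource stays a one-line decision.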

While thinking about these ideas, I had a chat with Ed Rothberg, co-founder and chief operating officer of Gurobi Optimization, a leading provider of software that quickly and efficiently solves linear, quadratic and mixed integer optimization problems. As Ed is an old friend from graduate school days, he patiently answered a long series of naïve questions from me, and in the process provided me with an even richer and more nuanced perspective.

In some sense, Rothberg suggested, I would be well served to think about an optimization solver such as Gurobi’s as a platform that is optimized to efficiently solve a highly structured class of problems while also striving to intelligently utilize detailed knowledge of the available computing resources. This platform, in turn, is the product of a group of developers with extensive knowledge and endless ideas about both the detailed structure of the abstract problem and the architecture and associated logic of the microprocessors that are being utilized to do these calculations. The power is in applying this knowledge and these ideas to abstract representations of ever larger “real-world” problems, because Moore’s Law tells us that the computing power will keep on growing at an exponential rate.

Except that, as both Rothberg and Dixon pointed out to me, Moore’s Law is headed for a cliff, a viewpoint that now seems to be relatively mainstream. Indeed, in a 2013 keynote speech [4] at the Hot Chips conference at Stanford University, former Intel Chief Architect Robert Colwell bluntly predicted that, after a remarkably long period of exponential growth in the number of transistors per chip and in the CPU speed produced by those chips, the end of this phenomenon was just a few years down the road. To this relatively non-technical hardware user, this seems to be because of increasingly expensive power and cooling costs associated with so many transistors crammed into such small spaces.

The implications are clear: The concept of smart parallel coding for business applications will be a huge factor in the increasingly data and computationally intensive world of analytics. The work being done by Rothberg and his colleagues at Gurobi (and their competitors) will continue to deliver faster solutions to a large and important class of structured optimization problems. Moreover, Dixon’s broader point, that design patterns help us understand how to exploit the various computing resources available, is something that analytics professionals will need to become increasingly aware of, as the problems we tackle keep getting bigger at a rate faster than individual chips can be sped up.

I came away from my conversation with Rothberg with a newfound respect for the knowledge, experience, ideas and hard work that have been put into the creation of these smart optimization solvers, so I do not have to think about any of the back-end processing when formulating my own representation of the optimization problems that I might encounter. I am as always most grateful for the permission to be ignorant here.

However, when looking at data-intensive problems more generally, I’m going to need to get a bit more intimate with the computational load that I’m generating. And while Dixon’s reference to data scientists as domain experts struck me as a funny one at first (I’ve typically thought of domain experts as people deeply knowledgeable about the business context rather than the analytic representation of it), I now have a much better understanding of what he meant.

Given the exponential growth in the amount of data being analyzed and in the size and significance of the problems being solved, the inevitable end of Moore’s Law, and the fact that the holy grail of automated parallelization has yet to be successfully realized, the data scientists of today and tomorrow will have no choice but to be more aware of how their code is executed in heterogeneous parallel environments.

One unexpected takeaway from reading Dixon’s paper was a heightened appreciation for the sheer variety of component disciplines that are being harnessed together in this analytics revolution. The journey to a world of more and smarter data-driven decision-making includes advances in hardware design and manufacturing, software platform components, application development, human-computer interaction design, management policies and controls, and surely many others that I’m blissfully unaware of.

There are a lot of moving parts in this increasingly large, unwieldy parade – and there’s no sign of it slowing down.

Vijay Mehrotra is a professor in the Department of Business Analytics and Information Systems at the University of San Francisco’s School of Management. He is also a longtime member of INFORMS.


  1. See for a list of conference papers about the use of simulation to analyze semiconductor fab performance.
  2. See for more on this.
  3. Dixon, M., 2015, “A pattern oriented approach for designing scalable analytics applications,” invited paper, PPAS 2015, Proceedings of the 2nd Annual Conference on Parallel Programming for Analytic Applications, Association for Computing Machinery, p. 4-8 (available online at
