Analytics Magazine

ANALYZE THIS! Problem-solving: Keep it real with gemba

By Vijay Mehrotra

Some years ago, I got a call from “Frank,” a finance director at a start-up company with a cloud-based software solution. Its platform was hosted by one of the large public cloud providers, and that was why he was calling. “The bills for hosting have been outrageous,” said Frank, who had called me based on a strong referral from a former consulting colleague. “Charges are rising at a ferocious rate, much faster than revenue. As far as data, we do get a bill from our provider with a ton of detail in it, but we are having a hard time getting our arms around what’s driving the peaks in the traffic loads. We are really worried about managing these costs as we grow.”

As it happens, I have a background in queueing and experience with load forecasting in many different industries, so this project seemed to be a good fit. Wrapping up this initial conversation, Frank suggested that the right next step was a follow-up conversation with “Oscar,” the director of operations from the company’s engineering organization. Later that day, Frank sent an introductory email to the two of us, strongly suggesting that Oscar make some time to bring me up to speed on the issues they were facing.

On Friday afternoon of that week, I spoke with Oscar for the first time. For much of our conversation he seemed to be speaking a foreign language. He talked about customer pods, network latency, CPU densities, API calls and user counts. He quickly explained the structure of their data warehouse, with obscure references to fields with cryptic names and unclear definitions. Overall, his tone suggested that he doubted that an accurate cost prediction model could be developed by anyone, much less an outsider with no experience with cloud computing architecture.

Over the next couple of weeks, there were several follow-up conversations with Frank’s finance folks, Oscar’s ops team and people from other parts of the company. Sadly, within a month or so, the project was abandoned. My sense was that it was basically my fault for not knowing enough about the domain.

As data sets continue to grow larger and larger, there are more and more specialized individuals and teams dedicated to searching for patterns that can be exploited for one business purpose or another. If this is what your current job or project is like, perhaps you think that understanding the business context is not all that important.

My perspective is different. After losing out on the project opportunity with Frank and Oscar, I am now an even bigger advocate of going to “gemba,” a term that is very familiar to practitioners of various lean and quality management methodologies. In Japanese, the literal meaning of gemba is “the real place,” in contrast to the simplified models of other people’s reality with which most of us are more comfortable. Gemba is usually messy.

The late Gene Woolsey, professor of operations research at the Colorado School of Mines, believed that this trip to gemba should be more than a stop at the drive-thru window [1]. Each of his graduate students was required to learn to do the job(s) associated with a particular process or system before they would be allowed to develop a model that purported to improve it. Woolsey saw this as essential to any successful project, both for the domain knowledge that would be acquired in the process of learning these roles and for the credibility and trust that need to be established for most solutions to be accepted. When I first heard about Woolsey, the requirements he placed on his students seemed extreme and unnecessary to me. It took me years to understand that this type of firsthand engagement was usually incredibly valuable.

Gemba, Part 1: The (real) problem. Most of the time, the description of the business problem that you first receive at the start of the project is either incorrect, incomplete or both. This is not because the people who provided the original problem statement to you are fools and/or scoundrels, but rather because any snapshot that is taken is influenced by the position, time and scope at which the picture is taken. While the available data may offer some insights, the investigation of the business context, particularly through structured face-to-face conversations with key participants early in the process, can provide vital clues.

A friend of mine once received a call from a manufacturer asking him to look at its historical data to develop an improved model for forecasting market prices for key inputs. Through initial discussions, however, he discovered that the company’s real need was for a model to help its sourcing group negotiate contracts with suppliers. This discovery led to a successful project, albeit one that was quite different from what the project sponsor had first described.

Gemba, Part 2: The (real) context for the data. With most analytics projects, there is usually no way to get to know the data without taking the time to really understand the underlying business context. Without taking the time to understand the processes that drive the data capture, you can easily be confused or, worse, misled by unclear field definitions, underlying capture logic, time and/or space dimensionality, or many other foundational concepts. Without understanding the broader context, you can often incorrectly identify data points as outliers. And without some understanding of the context, there may be an infinite number of initial visualizations that you can create, but a much smaller number that can help you to develop good hypotheses to investigate and/or to make good decisions about what to include or exclude from your model.

Gemba, Part 3: Credibility and trust (the real “real place”). Whether you are a consultant, a new hire or a project team from another part of the organization, it is possible that those whose “problem” you have been asked to investigate will view you with some suspicion. Most likely the decision to hire you was made by someone else and thrust upon them, an odd kind of arranged marriage.

From the perspective of the people you have been sent to “help,” you do not know them nor do you know their business. Your project is sure to disrupt their lives – and you surely will not be around to clean up the mess you have left them.

How to deal with this particularly vexing challenge? We will dig into that next time.

Vijay Mehrotra is a professor in the Department of Business Analytics and Information Systems at the University of San Francisco’s School of Management and a longtime member of INFORMS.


  1. For much more of the wisdom of Woolsey, see





