Analytics Magazine

Analyze This! Big Data: Generation Next

May/June 2012

By Vijay Mehrotra

We have all been hearing about both “the analytics revolution” and “the rise of Big Data” forever, or so it seems. I credit the book “Competing on Analytics” by Thomas H. Davenport and Jeanne G. Harris [Harvard Business School Press, 2007] with making “analytics” part of the mainstream business lexicon. Similarly, the McKinsey Global Institute (MGI) report entitled “Big Data: The next frontier for innovation, competition and productivity,” released in May 2011, has had the same effect for the term “Big Data.”

This MGI report formally defined Big Data as “datasets whose size is beyond the ability of typical database software tools to capture, manage and analyze,” while also identifying several vertical industries and classes of applications that can be improved by intelligent use of data for better decision-making, innovation and competitive advantage. In fact, many of the broad themes presented in this report echo the ideas presented by Davenport and Harris in “Competing on Analytics.” As such, over the past year, it has become natural to think of “analytics” and “Big Data” as being virtually synonymous with one another.

I caught up with Davenport by phone a couple of weeks ago. He was in the midst of a study on the human side of Big Data sponsored by SAS Institute and EMC/Greenplum, and he was kind enough to share some of his findings with me. Over the past few months, he had interviewed a large number of data scientists who were working in Big Data roles in an effort to understand who they are, where they are working and what they are working on. I found some of his observations insightful and others more surprising.

The data scientists with whom Davenport had spoken had academic backgrounds in many different disciplines including physics, mathematics, computer science, statistics and operations research, as well as less obvious ones such as meteorology, ecology and several social science fields. Almost all had Ph.D.s, and in many cases their research had been a catalyst for the development of their deep data skills (Davenport cited one recent Ph.D. cohort of seven applied ecology students, of whom six had launched careers in Big Data, rather than academia, after finishing graduate school).

More surprising, however, was Davenport’s observation that “very few large companies are going to bother with ‘first generation’ data scientists.” While pointing to General Electric as a notable exception, he noted that the vast majority of the data scientists he had found worked at platform companies such as Facebook, Twitter, Google, Yahoo and LinkedIn and at startup companies such as Splunk [1] that see exciting entrepreneurial opportunities [2] in creating tools to enable more efficient access, visualization and mining of large streams of data from multiple sources.

“Data management seems to dominate the world of Big Data right now,” Davenport explained. “There’s a huge focus on visualization and reporting among the data scientists I talked to. The statisticians are a little bit frustrated … One of the quips I heard was, ‘Big Data = Little Math.’ ”

His conclusion: today, data-driven managerial decision-making still relies almost exclusively on small-to-medium sized datasets stored in traditional data structures.

I heard some of these same themes at the recent INFORMS Analytics Conference, most notably in a panel discussion on “Innovation and Big Data.” The panelists included Diego Klabjan (Northwestern University), Thomas Olavson (Google), Blake Johnson (Stanford University), Daniel Graham (Teradata) and Michael Zeller (Zementis, Inc.).

Very early in the discussion, the panelists all agreed that there’s a huge amount of confusion about what is actually happening in this space today, and that this confusion is being amped up by the massive amount of hype about Big Data (a recent Google search on “Big Data” returns a cool 1,350,000,000 entries, and a quick query on Google Insights for Search reveals that the number of people searching on this term has grown exponentially in the past year [3]). However, as Northwestern’s Klabjan bluntly stated, “OK, with Hadoop we know how to store Big Data. But doing analytics on top of Big Data? We have a long way to go.”

The discussion often touched on the “volume, velocity and variety” [4] of today’s data and the accompanying high level of complexity that leads to a variety of challenges in extracting value from it. Teradata’s Graham acknowledged these risks explicitly when he encouraged executives in the audience to (in the words of Tom Peters) “fail forward fast,” while Google’s Olavson urged the audience not to get so caught up in the complexity of the data challenges and the power of the data management solutions that the key business problems slip out of sight.

The panelists often came back around to the human side of Big Data. Zementis’ Zeller envisioned a future in which the work done by the data scientist of today is broken up into a variety of emerging roles such as data technician and data analyst, while Stanford’s Johnson suggested that the democratization of data would create a need for a quality assurance function for not only the expanding mounds of data but also for the analytic models built on top of it. And Olavson’s final comment was that with or without Big Data, analytics is ultimately about enabling smart people to use data and tools to create business value.

Which brings me back to my earlier conversation with Davenport. At several points in our discussion, he drew a clear distinction between the data scientists of today and the “second generation” of tomorrow. Based on his research, Davenport anticipates that “as more and better data management tools come to market, less software development will be needed to work with Big Data.” In this world, skills in managing large, unstructured data, paired with analytic modeling capabilities, will be a powerful combination.
It will, I suspect, be here before we know it.

Vijay Mehrotra (vmehrotra@usfca.edu), senior INFORMS member and chair of the ORMS Today and Analytics Committee for INFORMS, is an associate professor, Department of Finance and Quantitative Analytics, School of Business and Professional Studies, University of San Francisco. He is also an experienced analytics consultant and entrepreneur and an angel investor in several successful analytics companies.

REFERENCES, NOTES & FURTHER READING

  1. To read about Splunk’s recent successful IPO, see http://dealbook.nytimes.com/2012/04/19/splunk-soars-in-debut/.
  2. See for example http://www.gsb.stanford.edu/news/headlines/entrepreneur-conference-2012.html.
  3. See http://www.google.com/insights/search/#q=%22Big%20Data%22&cmpt=q.
  4. The three Vs are a popular foundation for Big Data – for more background on this, see http://radar.oreilly.com/2012/01/what-is-big-data.html.
