Big Data Buzzkill: Goal-driven analytics
Big data needs advanced analytics, but analytics does not need big data.
By Eric A. King
Thanks, big data! Now we’re even more data-rich … yet remain information-poor. After staggering investments motivated by an overabundance of buzz and hype, big data has yet to produce cases that reveal substantial verified return. Organizations are increasingly hard-pressed to show value, but they’re not sure where or how to find it.
Professor Dan Ariely of Duke University relates big data to teenage sex:
“Everyone talks about it; no one really knows how to do it,” he says. “Everyone thinks everyone else is doing it; so everyone claims they’re doing it.”
In an article from the December 2013 Harvard Business Review by Jeanne W. Ross, Cynthia M. Beath and Anne Quaadgras, the very title suggests that “You May Not Need Big Data After All.” The authors rightly argue that even before big data, most companies did not make productive use of valuable information already at hand. So, jumping ahead to big data is like attempting to operate a jet fighter before gaining proficiency in a sedan. It’s no wonder that big data is proceeding rapidly into the third stage of Gartner’s Technology Hype Cycle: (dark scary voice) the “trough of disillusionment.”
The practice of big data overall has its merit and will not go away. It requires that organizations actively think about how to accommodate rapidly increasing volumes and varieties of data. Yet, only the companies that successfully implement predictive analytics and effectively act upon the value-laden information mined from their large data stores will enjoy early and sizable returns. In fact, advanced analytics is arguably the only way to extract accountable, bottom-line and residual payback from big data.
The standard big data practice of collecting, storing, transporting, connecting, organizing, extracting and even visualizing rapid streams of data is essentially a cost center activity. Only when content of value is operationalized into active decisioning and measured for impact will big data’s liability be converted into an intelligence asset. Big data’s recovery up the Hype Cycle’s “Slope of Enlightenment” will come in the form of actionable analytics for automated decision-making at the operational level and proactive recommendations at the strategic level.
Size and Success Don’t Correlate
Big data enthusiasts are finding that the more data they collect, the harder it becomes to understand just what the data is telling them. And most practitioners are surprised to learn how little data is required to build a highly effective goal-driven model. It’s not a matter of having a lot of data, but a valid sampling of data to support the target objective.
For advanced analytics, it is far more important for a database to be wide with attributes or variables than long in transactions. Thanks to big data innovations, more variables are being collected than ever before. In fact, data dictionaries are starting to be turned on their side to allow vertical scrolling through a growing number of attributes.
Only variables that have no relationship to the target objective should be excluded. A development model will automatically rank the limited set of variables that have predictive value toward the objective. The remainder can be eliminated from the final model and potentially from the analytic sandbox.
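The ranking step described above can be illustrated with a deliberately simple stand-in. The article does not say which technique a development model uses to rank variables, so the sketch below swaps in absolute Pearson correlation against the target, which only captures linear, one-variable-at-a-time relationships. The function names, the `min_strength` cutoff of 0.1 and the toy data are all assumptions made for illustration.

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)

def rank_variables(columns, target, min_strength=0.1):
    """Return (name, strength) pairs, strongest first, dropping weak variables.

    min_strength is an assumed cutoff, not a value taken from the article.
    """
    scored = [(name, abs(pearson(values, target))) for name, values in columns.items()]
    kept = [(name, strength) for name, strength in scored if strength >= min_strength]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)

# Hypothetical toy data: three candidate variables and a binary target.
columns = {
    "tenure_months": [1, 3, 5, 7, 9, 11],
    "support_calls": [5, 4, 4, 2, 1, 0],
    "constant_flag": [1, 1, 1, 1, 1, 1],
}
target = [0, 0, 0, 1, 1, 1]
ranking = rank_variables(columns, target)
```

On this toy data the constant column has no relationship to the target and is dropped, while the two remaining variables are kept in ranked order. A real modeling tool would typically use a richer importance measure that also accounts for interactions between variables.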
Only enough transactional data to adequately represent the solution space for the application at hand is required to develop the model. There are standard rules of thumb based on the final number of attributes or dimensionality of the final model that suggest the number of records or transactions needed to derive the train, test and validation data sets for model development. Most times, this range is from 5,000 to 250,000 records – a mere quark in the vast universe of big data.
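As a rough illustration of how small the working set can be, the sketch below draws a random sample from a much larger store and divides it into train, test and validation sets. The article gives the typical range of 5,000 to 250,000 records but not the rules of thumb themselves, so the sample size of 50,000, the 60/20/20 split and the function name are assumptions chosen only for the example.

```python
import random

def sample_and_split(records, sample_size, fractions=(0.6, 0.2, 0.2), seed=42):
    """Draw a random sample and split it into train, test and validation sets.

    The default fractions are an assumed convention, not a rule from the article.
    """
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    sample = rng.sample(records, min(sample_size, len(records)))
    n_train = round(len(sample) * fractions[0])
    n_test = round(len(sample) * fractions[1])
    train = sample[:n_train]
    test = sample[n_train:n_train + n_test]
    validation = sample[n_train + n_test:]  # remainder, so no sampled record is lost
    return train, test, validation

# Hypothetical store of one million record IDs standing in for a big data source.
all_record_ids = range(1_000_000)
train, test, validation = sample_and_split(all_record_ids, sample_size=50_000)
```

Only 5 percent of the hypothetical store is touched here. Whether a purely random sample adequately represents the solution space depends on the application; rare outcomes may call for stratified sampling instead.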
But without a use plan for data, companies feel at risk if they do not harvest all possible data. This digital hoarding overwhelms analysis and motivates strategies for deriving streamlined analytic sandboxes. The sandboxes draw targeted data for goal-driven model development from the vast stores of useless “dark data.”
One other consideration toward limiting data for more streamlined analytics is to start with available structured data. In most organizations, structured data holds far more predictive value and requires far less preparation labor than open text. Why jump straight into drilling sideways for limited resources if there are pressure wells to tap at the surface?
Big Data 2.0 Must Progress to Analytics 3.0
The International Institute for Analytics and Thomas Davenport rightly relegate big data to a 2.0 version on their analytic maturity model, behind the purpose-driven Analytics 3.0. The timeline denotes traditional analytics as the first stage, big data as the second and actionable analytics as the third. Organizations that progress quickly to Analytics 3.0 to combine traditional analytics, machine learning, big data, goal-driven strategy and embedded predictive decisioning at the operational level will become leaders that achieve measurable returns.
Yet analytically, most organizations are working on the wrong end of the problem. Instead of taking a strategic, goal-driven approach, they are proceeding with a technology focus. They are hiring “data scientists” and extending their technical capability to perform more sophisticated analyses. This approach will fail or fall short at the business level for a host of strategic reasons. They may build technically superior models that conform well to artificial metrics. But their optimized models won’t align with business objectives, achieve overall performance metrics, integrate effectively into operations, gain adoption by users or be translated into terms that leadership can apply.
The analytic industry is attempting to define “data scientist” as a dynamic analytic practitioner who holds advanced analytic skills, vast IT experience and managerial soft skills to oversee analytic processes at the project level. Not only is this superstar mix of technical skill and leadership personality extremely rare, but the term “data scientist” itself faces multiple challenges.
On one hand, many amateur business practitioners are loosely donning the label, diluting its reputation. On the other, “scientist” suggests a formal discipline and deep vertical experience along with a research component. This is probably the most fitting definition for “data scientist.” Yet those technical skills alone won’t achieve Analytics 3.0 objectives.
These technology-driven formal data scientists typically view strategic assessment and project planning as theoretical fluff. They jump directly into the trees with little regard to the forest – delighting in writing increasingly complex code, creating ever more sophisticated algorithms, and then wondering why business leaders don’t implement their findings more readily. If these trends continue, then the title “data scientist” will only live up to its label of a theoretical quantitative specialist and fail to have strategic or even operational impact.
The majority of companies don’t realize that common business practitioners can leverage modern predictive modeling software that encapsulates the complexity of machine learning to quickly build “more than adequate models.” This can be done in conjunction with an analytic support team. Beyond IT and the business owner, the team should include strategic oversight from a seasoned senior consultant who can collaboratively develop an overarching optimized modeling process.
The resulting process will follow the blueprints developed from the information amassed in the assessment. The process will not only ensure optimal deployment of the model, but roadmap and tailor all actions from operating within the sandbox, to data preparation, model development, deployment, validation, reporting and model lifecycle management. In the end, Analytics 3.0 will lead organizations to shift their thinking from tactics and technology to strategy and measured impact.
Strategic Implementation is Imperative
Most leadership today does not realize how expensive it is in the long run to insist upon immediate results and instant payback. Instead of investing time to design and build a modeling factory, they choose to manufacture each new analytic product as custom and demand delivery within nearly impossible timeframes.
With big data and analytics, industry typically devalues comprehensive assessment and tailored project design, opting instead for immediate summaries or projections. Organizations continue to draw little value from disparate, ad hoc analyses that produce some nice-to-know insights, but fall short of producing goal-driven decisions that translate impact back to leadership.
It is a common practice to request case studies to evaluate vendors and technology. But it’s misguided, as each implementation is highly situational and based on a multitude of contributors. Just because 10 similar organizations realized substantial gains does not mean that your team is even at the starting line. Case summaries convey very little about process and project design issues that are critical to achieving overall project success. They are simply indirect justifications that the technology can actually generate substantial returns in the right situations.
For analytics to arrive at tangible and residual value, many questions need to be addressed in advance of implementation. A number of public domain industry-standard processes outline specific strategic phases, tasks and issues to be examined before even exploring any data. Yet the vast majority of practitioners fail to reference them and instead jump headlong into the data. They are unaware that the most critical pitfalls relate to a lack of soft skills, not analytic limitations. Following are just a few strategic considerations that sound obvious, yet are rarely addressed prior to model development or process implementation:
- Buy-in: Is leadership onboard? Are they motivated or ambivalent? Do they view analytics as an esoteric and theoretical function? Or have they heard enough industry buzz to be seriously concerned about their analytic maturity and vulnerability?
- Team capability: Do team members appreciate the importance of strategic implementation and project design effort? Will they understand why project definition is imperative to project success – and why building a structure without sound design and blueprints is likely to fail?
- Politics: What is the make-up of the overall team that will contribute to or be impacted by analytics? Has each member been interviewed for their role, experience, objectives, motivations and concerns? Are there competing interests? Threats? For example, are traditional statisticians firmly resistant to shifting their mindset to a more strategic and agile model-building focus? Likewise, without the oversight of experienced modelers, will egregious errors be made in data preparation and results interpretation?
- Alignment of objectives: Has each team member impacted by the analytic process been asked about their individual goals? Have you drilled at least two or three levels deep? Often the surface issue expressed is not the true underlying concern.
- Performance projections: Have baselines for current performance been established along with target performance and its impact on operations? Without the baseline and target, how will success of the analytic initiative be defined, measured and interpreted?
- Ability to affect: Does the organization have the willingness and wherewithal to carry out potential model recommendations? If not, we haven’t passed the “so what” test. It is far less expensive to determine in advance of a modeling objective that the organization “can’t handle the truth.”
- Decision culture: Is your company driven more by general leadership experience and feel, or by evidence-based decisioning? If the former, is leadership open to letting go of one handlebar and allowing a pilot to compete in a series of A/B tests?
- Cost of status quo: Referencing back to “Buy-in” at the top of the list, considering the ultimate cost of doing nothing is often what gets leadership off the fence. Leadership need not be analytically literate to appreciate that supporting costly big data initiatives does not make sense unless a more capable and purposeful analytic practice is prepared to leverage it.
The information amassed from these and many other strategic and tactical considerations is used to prepare a highly tailored analytic project design. The resulting process supports agile model development by functional managers and business practitioners. This is the engine required to generate measurable benefit from big data.
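The “Performance projections” consideration above lends itself to a small worked example. The sketch below records a baseline and a target for one metric before any modeling begins, then expresses an observed result both as lift over the baseline and as the share of the planned gap that was closed. The class name, the metric and every number are made up for illustration; the article does not prescribe these formulas.

```python
from dataclasses import dataclass

@dataclass
class PerformanceProjection:
    """Baseline and target for one business metric (hypothetical structure)."""
    metric: str
    baseline: float  # current performance before the analytic initiative
    target: float    # performance the initiative is expected to reach

    def lift(self, observed: float) -> float:
        """Relative improvement of the observed value over the baseline."""
        return (observed - self.baseline) / self.baseline

    def share_of_target(self, observed: float) -> float:
        """Fraction of the planned baseline-to-target gap that was closed."""
        return (observed - self.baseline) / (self.target - self.baseline)

# Made-up numbers: a campaign response rate with a 2% baseline and a 3% target.
projection = PerformanceProjection("response_rate", baseline=0.02, target=0.03)
observed = 0.026
lift = projection.lift(observed)                  # improvement relative to baseline
progress = projection.share_of_target(observed)   # progress toward the target
```

With both figures agreed in advance, a pilot result can be reported to leadership in the terms they set rather than in model-accuracy terms they did not ask for.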
Goal-Driven Analytics Will Justify Big Data
Until leadership grants analytic teams the six to eight weeks to assess and design tailored analytic processes that will rapidly produce analytic models to support specific business targets, data analysis will continue to be a theoretical practice that produces little more than interesting insights and isolated low-value remedies. The vast majority of companies will remain analytically immature and dysfunctional. This creates a significant competitive opportunity for those who invest in formal strategic assessment and design.
Here are the primary takeaways:
1. Don’t wait for big data to stand up. It’s a journey and not a destination. Analytics can start bringing value at any stage of a big data implementation and even help justify further big data investment.
2. Get trained. Seek a vendor-neutral trainer that not only provides methods and tactics, but has a specific focus on project-level strategic implementation.
3. Comprehensive assessment. It is infinitely more effective to select the most viable and valuable modeling project after having surveyed leadership, team members, resources and the environment than to perform great work on a doomed initiative or start sifting for insights without a performance target.
4. Conduct an underground pilot. If the initial results fall short, that’s part of the overall discovery process. Shift and cycle again. If they exceed, then market to leadership and expand.
5. Ongoing strategic oversight. Seek the guidance of a seasoned mentor. This consultant will have the experience to anticipate hurdles, overcome elusive pitfalls and provide a low-risk/high-reward roadmap to greater returns in a shorter time frame.
Without a formal and comprehensive assessment performed by a senior strategic analytic consultant, organizations will continue to perform analysis for the sake of analysis. The results of this practice will uncover some discoveries of interest that rarely align with business objectives or translate to impact.
Instead, goal-directed analysis driven by a methodical assessment and tailored project design lifts a specific business objective by a measurable margin. Of course, this is what translates well for leadership and puts data productively to work, whether big or small.
Eric A. King is the president and founder of The Modeling Agency, LLC, an advanced analytics training and consulting company providing strategic guidance and impactful results for the data-rich yet information-poor. King is a co-presenter in a monthly live, interactive analytics webinar entitled “Data Mining: Failure to Launch.” He may be reached at (281) 667-4200 x210 or firstname.lastname@example.org.
The author thanks Carla Gentry of Analytical Solution for granting permission to use a slight variation of a fantastic blog phrase as the title of this article. Also, cheers to Professor Dan Ariely of Duke University for permission to quote his hilariously accurate teenage sex analogy for big data. Gratitude is extended to the International Institute for Analytics (IIA) and Tom Davenport for permission to reference Analytics 3.0 and related IIA material in this article and in TMA courseware with attribution. And finally, a gracious nod to Sandra Hendren, senior consultant at The Modeling Agency, for her review and brilliant edits.
- Gartner, Inc., “Hype Cycles 2013 Research Report,” Gartner Technology Research, 2013 (https://www.gartner.com/doc/2571624).
- Thomas H. Davenport, “Analytics 3.0,” Harvard Business Review, December 2013 (http://hbr.org/2013/12/analytics-30).