Turn big data into information with high-performance analytics
By Paul Kent, Radhika Kulkarni and Udo Sglavo
We’re in the era of big data, but what do we mean by that? In our view, big data is a relative, not absolute, term. It means that the organization’s need to handle, store and analyze data (its volume, variety, velocity, variability and complexity) exceeds its current capacity and has moved beyond the IT comfort zone. Big data is the classic double-edged sword – both potential asset and possible curse. Most agree that there is significant, meaningful, proprietary value in that data. But few organizations relish the costs and challenges of simply collecting, storing and transferring that massive amount of data. And even fewer know how to tap into that value, to turn the data into information.
Is the enterprise IT department merely an episode of TV’s “Hoarders” waiting to happen – or will we actually find ways to locate the information of strategic value that is getting buried deeper and deeper in our mountains of data? Quite simply: What are we going to do with all of this data?
At its essence, high-performance analytics (HPA) offers a simple, but powerful, promise: Regardless of how you store data or how much of it there is, complex analytical procedures can still access that data, build powerful analytical models using that data, and provide answers quickly and accurately by using the full potential of the resources in your computing environment.
With high-performance analytics, we are no longer primarily concerned with where the data resides. Today, our ability to compute has far outstripped our ability to move massive amounts of data from disk to disk. Instead, we use a divide-and-conquer approach to cleverly send the processing out to where the data lives.
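To make the divide-and-conquer idea concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not how any particular HPA product works: the three lists stand in for data held on three separate nodes, and the names `PartialStats`, `summarize_locally` and `combine` are hypothetical. Each partition is summarized where it lives, and only three numbers per partition travel to the coordinator.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class PartialStats:
    """Small summary computed next to the data (hypothetical structure)."""
    n: int
    total: float
    total_sq: float


def summarize_locally(values):
    # Runs where the partition lives; returns three numbers, not the rows.
    return PartialStats(len(values), sum(values), sum(v * v for v in values))


def combine(partials):
    # Runs on the coordinator; merges the small summaries into a global answer.
    n = sum(p.n for p in partials)
    total = sum(p.total for p in partials)
    total_sq = sum(p.total_sq for p in partials)
    mean = total / n
    variance = total_sq / n - mean * mean  # population variance
    return mean, sqrt(max(variance, 0.0))


# Assumed toy data standing in for partitions stored on three nodes.
partitions = [[2.0, 4.0], [4.0, 4.0, 5.0], [5.0, 7.0, 9.0]]
mean, std = combine([summarize_locally(p) for p in partitions])
print(mean, std)
```

The sum-of-squares formula keeps the sketch short; it can lose precision on very large values, so a production implementation would likely merge running means and variances instead.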
Ultimately, HPA is about the value of speed and its effect on business behavior. If the analytic infrastructure requires a day to deliver a single computational result, you’re likely to simply accept the answer it provides. But if you can use HPA to get an answer in one minute, your behavior changes. You ask more questions. You explore more alternatives. You run more scenarios. And you pursue better outcomes.
But how do we bring the power of high-performance analytics to data volumes of this scale? We believe there are three basic pillars – three innovative approaches – to bring HPA to big data:
- Grid computing: distribute the workload among several computing engines. Grid computing enables analysts to automatically use a centrally managed grid infrastructure that provides workload balancing, high availability and parallel processing for business analytics jobs and processes. With grid computing, it is easier and more cost-effective to accommodate compute-intensive applications and growing numbers of users appropriately across available hardware resources and ensure continuous high availability for business analytics applications. You can create a managed, shared environment to process large volumes of programs in an efficient manner.
- In-database analytics: move the analytics process closer to the data. With in-database processing, analytic functions are executed within database engines using native database code. Traditional approaches often copy the data to a secondary location and process it there, in a programming language running outside the database. Benefits of in-database processing include reduced data movement, faster run times and the ability to leverage existing data warehousing investments.
- In-memory analytics: distribute the workload and data alongside the database. In this approach, big data and intricate analytical computations are processed in-memory and distributed across a dedicated set of nodes to produce highly accurate insights to solve complex problems in near-real time. This is about applying high-end analytical techniques to solve these problems within the in-memory environment. For optimal performance, data is pulled and placed within the memory of a dedicated database appliance for analytic processing.
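The contrast between traditional and in-database processing can be sketched with Python's built-in `sqlite3` module. SQLite is only a convenient stand-in here, and the `sales` table and its rows are invented for illustration. The first function copies every row out of the database before aggregating; the second lets the database engine do the work and returns only the result.

```python
import sqlite3

# Assumed toy table; in practice this would be a data warehouse.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 100.0), ("east", 300.0),
     ("west", 50.0), ("west", 150.0), ("west", 100.0)],
)


def average_outside_db(conn):
    # Traditional approach: move every row to the application, then aggregate.
    totals = {}
    for region, amount in conn.execute("SELECT region, amount FROM sales"):
        running_sum, count = totals.get(region, (0.0, 0))
        totals[region] = (running_sum + amount, count + 1)
    return {region: s / c for region, (s, c) in totals.items()}


def average_in_db(conn):
    # In-database approach: the engine aggregates; only one row per region
    # crosses the boundary between database and application.
    rows = conn.execute("SELECT region, AVG(amount) FROM sales GROUP BY region")
    return dict(rows)


print(average_outside_db(conn))
print(average_in_db(conn))
```

Both functions give the same answer; the difference is how much data has to move to produce it, which is what matters as tables grow.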
Keys to HPA Success
What does it take to succeed with high-performance analytics? HPA isn’t simply an incremental discipline. It involves innovative shifts in how we approach analytic problems. We view them differently and continue to find new ways to solve them. It’s more than simply taking a serial algorithm and breaking it into chunks. Success requires deeper, broader algorithms in multiple disciplines and the ability to rethink our business processes.
In our experience, HPA solutions to complex business problems require innovation along two different dimensions. First, algorithms and modeling techniques must be invented and built to exploit the power of massively parallel computational environments in three major areas:
- Descriptive analytics. You can report and generate descriptive statistics of historical performance that help you see what has transpired far more clearly than ever before.
- Predictive analytics. You can use relationships in the data to model and forecast business results and to predict future events and outcomes.
- Prescriptive analytics. You can identify the relationships among variables to develop optimized recommendations that take advantage of your predictions and forecasts and foresee the likely implications of each decision option.
Second, HPA tools and products must be built, incorporating these high-performance analytics techniques, to enable the following:
- visualization and exploration of massive volumes of data;
- creation of analytical models that use multi-disciplinary approaches such as statistics, data mining, forecasting, text analytics and optimization; and
- application of domain-specific solutions to complex problems that incorporate both specific analytical techniques and the business processes to support decision-making.
What makes HPA so compelling to businesses across the spectrum – and makes them willing to undertake this fundamental rethinking of analytics – is the ability to address and resolve transformational business problems that have the potential to fundamentally change the nature of the business itself. By processing billions of observations and thousands of variables in near-real time, HPA is unleashing power and capabilities that are without precedent. Your business could witness the same results, for example, by taking the following steps:
- implementing a data mining tool that creates predictive and descriptive models on enormous data volumes;
- using those variables to predict which customers might abandon an online application and offer them incentives to continue their session; and
- comparing these incentives against one another and the budget, in real time, to identify the best offer for each customer.
That’s the kind of emphatic value that HPA can provide and why it’s continuing to garner the attention of many enterprises today.
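The last of those steps, comparing incentives against one another and the budget, can be sketched as follows. This is a deliberately simple illustration, not a description of any real product: the `Offer` fields, the abandonment probabilities, the customer values and the greedy one-pass rule are all assumptions, and a real system would take the probabilities from a predictive model and would likely use a proper optimization routine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    name: str
    cost: float            # assumed cost of making the offer
    retention_lift: float  # assumed drop in abandonment probability


def best_offer(p_abandon, customer_value, offers, budget_left):
    """Return the affordable offer with the highest positive expected gain."""
    best, best_gain = None, 0.0
    for offer in offers:
        if offer.cost > budget_left:
            continue  # cannot afford this incentive
        # An offer cannot save more abandonment risk than actually exists.
        saved = min(offer.retention_lift, p_abandon)
        gain = saved * customer_value - offer.cost
        if gain > best_gain:
            best, best_gain = offer, gain
    return best, best_gain


def assign_offers(customers, offers, budget):
    # Greedy pass in arrival order, mimicking decisions made as sessions occur.
    assignments = {}
    for customer_id, p_abandon, value in customers:
        offer, _ = best_offer(p_abandon, value, offers, budget)
        if offer is not None:
            budget -= offer.cost
        assignments[customer_id] = offer.name if offer else None
    return assignments, budget


# Hypothetical offers and (customer id, abandonment probability, value) rows.
offers = [Offer("free_shipping", 5.0, 0.10), Offer("discount_10", 12.0, 0.30)]
customers = [("c1", 0.60, 100.0), ("c2", 0.02, 100.0), ("c3", 0.50, 80.0)]
assignments, remaining = assign_offers(customers, offers, budget=20.0)
print(assignments, remaining)
```

With these made-up numbers, the first customer receives the larger discount, the second receives nothing because no offer pays for itself, and the third receives the cheaper incentive because the budget no longer covers the discount.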
Amazingly, the discipline of high-performance analytics continues to move forward at a rapid pace. As storage gets even more affordable and greater amounts of processing power become ever-cheaper, it’s easy for us to envision “analytical streaming” in real time where insights are not discrete events but are part of the minute-by-minute operation of the enterprise, woven into the fabric of every meaningful business process. Moving further down the cost curve will enable us to further democratize analytics and move it beyond the specialized analyst and into the hands of virtually every employee, increasing the breadth and depth of the value. By pushing out the power of this style of HPA, we have the opportunity to achieve exponentially outsized gains driven by new levels of rapid analysis.
Paul Kent is the vice president of Big Data at SAS. Radhika Kulkarni is vice president of Advanced Analytics R&D at SAS and a senior member of INFORMS. Udo Sglavo is principal analytical consultant at SAS. This article was excerpted and adapted from the chapter “Finding Big Value in Big Data: Unlocking the Power of High-Performance Analytics” by the authors in “Big Data and Business Analytics,” edited by Jay Liebowitz, ©2013 Taylor & Francis Group LLC. Reprinted with permission.
- For more information, visit: http://www.sas.com/big-data/index.html
- For more information, visit: http://www.informs.org/Community/Analytics