Executive Edge: Overcoming big data challenges for analytics
By Kerem Tomak
It’s been more than a decade since the Internet became a household shopping front. We shop without leaving the sofa during a commercial break due to the ease of a tablet device. Our smartphone tells us how much an item is on a competitive ecommerce site while we are shopping in a retail store. If we like a product we buy it instantly without waiting in a checkout line.
One common theme behind all these activities: we implicitly or explicitly create data as we interact with these devices. We transmit data to the “cloud” where it is stored. This data (with our permission) then becomes part of an analytic workflow somewhere and comes back to us with recommendations and/or offers on what we should buy next, and the circle of commerce continues.
Twenty years ago, 30MB of hard disk was so immense that one didn’t know what to do with so much storage space. A gigabyte was “big data” for an 8086 processor and DOS-based Lotus 123 worksheets that were used. The Internet did not exist, so the speed at which data increased was contingent upon the speed at which one could receive floppy disks in the mail, 360KB at a time.
However, we still had the same workflow that we have today in relation to analytic exercise. We sampled, ran descriptive statistics and visualized the data. Based on our findings, we came up with a model or series of models that best fit the data, calibrated the model parameters based on simulations and completed the “version 0” of the analytics deliverable. As we collected new data, we would revisit the process and assess whether we needed a new model or keep the existing one, making a few parametric changes here and there. All the data we had filled a spreadsheet back then. We could eyeball the data and see patterns easily.
Similarly, when we sample data today, we need efficient and fast visualization tools that allow us to get to the “nuggets” quickly. Not only is the data much larger, but the dimensions over which the data is collected are numerous. The belief that since we have more data we do not need to sample is a flawed one. A critical assumption behind that thought is that big data is accurately and comprehensively capturing every known piece of information there is to know about everything. Within the modeling realm there is also the concept of over-fitting, data quality, etc., which still implies sampling as a step in the analytic process. However, a 1 percent sample of a 100TB data is still large data.
Rising Customer Expectations
As the time spans in which data is created are compressed, customer expectations of companies to provide information about products and services such as availability, delivery, discounts in near real time, if not real time, increase dramatically. To complicate things even further, there is a new addition to the data types that has added a twist to the story: social media feeds. Semi- or un-structured data makes parsing, analyzing and interpreting the data even more challenging, as the data does not come in traditional columnar setup. What is the value of a fan’s comment on a business’s Facebook page? Who are the social influencers in a company’s network of fans and how can we use this information to reach to the right audience? How can a company understand which products are trendy or what brands are in high demand from tweets? After pre-processing and massaging the social data, these and similar questions can be answered by using statistical tools and experimenting with findings to see if any of those are actionable.
Thanks to the cloud, we do not need to invest a lot of money in hardware and software to process all this data. Our ability of disseminating information quickly across different units is constrained by the slowest link we maintain in our network. If we are not comfortable with moving and/or sharing a lot of data, we can build our own cloud behind firewalls. Sophisticated statistical and visualization software are affordable as well. It can be only a matter of days before a company obtains more than simple analytical capabilities. Enterprise class operations still require significant investment, but even these are relatively cheap.
These affordable technological capabilities enable the possibility of building a successful analytics function as if the unit is a startup company within a larger organization. This is one of the many scenarios in which an analytics team can be established. With buy-in from senior management already achieved and seed funding ready, the main starting point is to hire an experienced analytics leader and empower him or her to build the roadmap to establish a proactive team.
Analytics leaders need to speak the language of at least one quantitative field such as mathematics, statistics, operations research or economics. This is necessary to build a credible leadership vertically and across the organization. Think of them as interpreters between the quantitative types and execution teams. An efficient analytics leader needs to understand the business and trends, anticipate the changes in requests for information and plan ahead to build required capacity to respond to the changes. Many analytics projects fail as either the information is too overwhelming or the model is too complex for a non-quantitative end-user to comprehend and take an action.
One of the key early steps is to have a dedicated systems team that is given the right funding and flexibility to build the analytics systems and support. Without a clear roadmap toward scalable and robust systems and processes, an analytics team is limited in capabilities. Analytics leadership needs to pass requirements to the systems team or teams in order to put the building blocks in place. This requires a comprehensive understanding, exposure and hands-on experience with data and analytics systems and tools.
What does this flexibility enable an analytics team to accomplish? They can rapidly prototype automated, data-driven solutions in reporting, product recommendations, personalized offers and more. Being on the cutting edge of tools and techniques enables the right data scientist to have the freedom to invent. Business units benefit from not only improved internal processes to acquire the information they need much faster, but they also start to find novel ways to serve their customers, to improve their product offerings, and to understand where the bottlenecks are within the organization, and the list grows.
Testing and Production of Prototypes
Finally, the path to testing and production of working prototypes needs to be smooth and supported by technology teams across different business units. An analytics team needs to be able to build dashboards and disseminate the information through centralized systems for everyone who needs that information to use. They need to be able to test new algorithms live or by using simulations to see what needs to be tweaked and/or improved. But most importantly they need to work hand in hand with agile technology teams to turn prototypes into products that pass strict SLAs and requirements to meet the performance criteria of the production systems.
The road to taming big data passes through people who are trained to handle the intricacies of data, understand their business, articulate what they see and, most importantly, are enabled to feed their intellectual curiosity by learning new tools and thinking outside the box. Aligned with testing and delivery teams, an analytics team with a keen focus on the end-goal can be a major driver of a successful business.
Kerem Tomak (email@example.com) is vice president of Marketing Analytics at Macys.com. He is a member of INFORMS.
- 38Use of the term “business analytics” is being used within the information technology industry to refer to the use of computing to gain insight from data. The data may be obtained from a company’s internal sources, such as its enterprise resource planning application, data warehouses/marts, from a third party data…
- 38FEATURES Fulfilling the promise of analytics By Chris Mazzei Strategy, leadership and consumption: The keys to getting the most from big data and analytics focus on the human element. How to get the most out of data lakes By Sean Martin A handful of requisite business skills that facilitate self-service…
- 37Frontline Systems, developer of the Solver in desktop Microsoft Excel 26 years ago, announced that it has surpassed 200,000 users of its cloud-based advanced analytics tools for optimization, simulation/risk analysis, forecasting, data mining and text mining – based on usage data from Microsoft, Google and its own SaaS platforms.
- 35November/December 2014 Big data needs advanced analytics, but analytics does not need big data. By Eric A. King Thanks big data! Now we’re even more data-rich … yet remain information-poor. After staggering investments motivated by an overabundance of buzz and hype, big data has yet to produce cases that reveal…
- 34Benjamin Franklin offered this sage advice in the 18th century, but he left one key question unanswered: How? How do you successfully drive a business? More specifically, how do you develop the business strategy drivers that incite a business to grow and thrive? The 21st-century solution has proven to be…