Analytics Magazine

Executive Edge: Overcoming big data challenges for analytics

November/December 2012

By Kerem Tomak

It’s been more than a decade since the Internet became a household storefront. We shop from the sofa during a commercial break, thanks to the convenience of a tablet. Our smartphone tells us how much an item costs on a competing e-commerce site while we are shopping in a retail store. If we like a product, we buy it instantly without waiting in a checkout line.

One common theme behind all these activities: we implicitly or explicitly create data as we interact with these devices. We transmit data to the “cloud” where it is stored. This data (with our permission) then becomes part of an analytic workflow somewhere and comes back to us with recommendations and/or offers on what we should buy next, and the circle of commerce continues.

Twenty years ago, a 30MB hard disk seemed so immense that one didn’t know what to do with so much storage space. A gigabyte was “big data” for the 8086 processors and DOS-based Lotus 1-2-3 worksheets in use at the time. The Internet as we know it did not exist, so the speed at which data increased was contingent upon the speed at which one could receive floppy disks in the mail, 360KB at a time.

However, the workflow of an analytic exercise was the same then as it is today. We sampled, ran descriptive statistics and visualized the data. Based on our findings, we came up with a model or series of models that best fit the data, calibrated the model parameters based on simulations and completed “version 0” of the analytics deliverable. As we collected new data, we would revisit the process and assess whether we needed a new model or could keep the existing one, making a few parametric changes here and there. All the data we had back then fit in a spreadsheet. We could eyeball the data and see patterns easily.

Similarly, when we sample data today, we need efficient and fast visualization tools that allow us to get to the “nuggets” quickly. Not only is the data much larger, but the dimensions over which it is collected are also far more numerous. The belief that more data removes the need to sample is flawed. It rests on the critical assumption that big data accurately and comprehensively captures every piece of information there is to know about everything. Within the modeling realm, concerns such as over-fitting and data quality still call for sampling as a step in the analytic process. Even so, a 1 percent sample of a 100TB data set is 1TB, which is still large data.
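
One way to draw a manageable sample from data too large to hold in memory is reservoir sampling. The sketch below is a minimal illustration in Python, not a method described in this article: the function names, the sample size and the synthetic order values are all assumptions chosen for the example.

```python
import random
import statistics
from typing import Dict, Iterable, List, Optional


def reservoir_sample(stream: Iterable[float], k: int,
                     seed: Optional[int] = None) -> List[float]:
    """Pick k items uniformly at random from a stream of unknown length.

    Only k items are held in memory at any time, so the stream itself
    can be far larger than available RAM.
    """
    rng = random.Random(seed)
    reservoir: List[float] = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Keep the new item with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir


def describe(sample: List[float]) -> Dict[str, float]:
    """Basic descriptive statistics for a sample."""
    return {
        "n": len(sample),
        "mean": statistics.mean(sample),
        "median": statistics.median(sample),
        "stdev": statistics.stdev(sample) if len(sample) > 1 else 0.0,
    }


# Hypothetical data: synthetic order values generated lazily,
# standing in for a log too large to load at once.
order_values = (float(i % 500) for i in range(200_000))

sample = reservoir_sample(order_values, k=10_000, seed=42)
summary = describe(sample)
print(summary)
```

Because the reservoir never holds more than `k` items, the same approach works whether the stream has a thousand records or billions; only the single pass over the data takes longer.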

Rising Customer Expectations

As the time spans in which data is created are compressed, customers increasingly expect companies to provide information about products and services, such as availability, delivery and discounts, in near real time if not real time. To complicate things further, a new data type has added a twist to the story: social media feeds. Semi-structured or unstructured data makes parsing, analyzing and interpreting even more challenging, because the data does not arrive in a traditional columnar layout. What is the value of a fan’s comment on a business’s Facebook page? Who are the social influencers in a company’s network of fans, and how can we use this information to reach the right audience? How can a company tell from tweets which products are trendy or which brands are in high demand? After pre-processing and massaging the social data, we can answer these and similar questions by applying statistical tools and experimenting with the findings to see whether any of them are actionable.

Thanks to the cloud, we do not need to invest a lot of money in hardware and software to process all this data. Our ability to disseminate information quickly across different units is constrained by the slowest link in our network. If we are not comfortable with moving and/or sharing a lot of data, we can build our own cloud behind firewalls. Sophisticated statistical and visualization software is affordable as well. It can be only a matter of days before a company obtains more than simple analytical capabilities. Enterprise-class operations still require significant investment, but even these are relatively cheap.

These affordable technological capabilities make it possible to build a successful analytics function as if the unit were a startup company within a larger organization. This is one of the many scenarios in which an analytics team can be established. With buy-in from senior management already achieved and seed funding ready, the starting point is to hire an experienced analytics leader and empower him or her to draw up the roadmap for a proactive team.

Analytics Leadership

Analytics leaders need to speak the language of at least one quantitative field such as mathematics, statistics, operations research or economics. This is necessary to build credible leadership both vertically and across the organization. Think of them as interpreters between the quantitative types and the execution teams. An effective analytics leader needs to understand the business and its trends, anticipate changes in requests for information and plan ahead to build the capacity required to respond to those changes. Many analytics projects fail because either the information is too overwhelming or the model is too complex for a non-quantitative end user to comprehend and act on.

One of the key early steps is to have a dedicated systems team that is given the right funding and flexibility to build and support the analytics systems. Without a clear roadmap toward scalable and robust systems and processes, an analytics team is limited in its capabilities. Analytics leadership needs to pass requirements to the systems team or teams in order to put the building blocks in place. This requires a comprehensive understanding of, exposure to and hands-on experience with data and analytics systems and tools.

What does this flexibility enable an analytics team to accomplish? The team can rapidly prototype automated, data-driven solutions in reporting, product recommendations, personalized offers and more. Being on the cutting edge of tools and techniques gives the right data scientist the freedom to invent. Business units not only benefit from improved internal processes that deliver the information they need much faster, but also start to find novel ways to serve their customers, improve their product offerings and understand where the bottlenecks are within the organization. The list keeps growing.

Testing and Production of Prototypes

Finally, the path to testing and production of working prototypes needs to be smooth and supported by technology teams across different business units. An analytics team needs to be able to build dashboards and disseminate the information through centralized systems so that everyone who needs that information can use it. The team needs to be able to test new algorithms live or through simulations to see what needs to be tweaked or improved. Most importantly, it needs to work hand in hand with agile technology teams to turn prototypes into products that pass strict SLAs and meet the performance criteria of the production systems.

The road to taming big data passes through people who are trained to handle the intricacies of data, understand their business, articulate what they see and, most importantly, are enabled to feed their intellectual curiosity by learning new tools and thinking outside the box. Aligned with testing and delivery teams, an analytics team with a keen focus on the end-goal can be a major driver of a successful business.

Kerem Tomak is vice president of Marketing Analytics. He is a member of INFORMS.
