

Analytics Magazine

Analytical Journey: Navigating the big data analytics SaaS terrain

July/August 2016

business analytics news and articles

Focus on data, not infrastructure: Three things to look for, three things to avoid.

By Brad Kolarov

With the continued hype from so many big data companies, it is hard to understand the best way to start down the big data analytics path. After all, the reason we use big data software tools is to improve our data analytics, not to see if we can get the latest and greatest big data tool to work. We have seen too many “square-peg” solutions pounded into “round-hole” problems.

This article should educate those data consumers focused on solving analytic challenges, those who have yet to start down the path of big data analytics, those who are stuck in the middle of that journey – or the ones who have made it through, but are ready to move to more advanced analytic frameworks.

Navigating the Traditional Terrain

Traditional big data service offerings typically cover a single capability in a range of business needs. Some companies simply make it easier to spin up a Hadoop cluster. Others offer proprietary algorithms to track or uncover patterns in data. Still others provide an aggregation platform for these services and more, all under one roof.

Whatever your need, one of these Software as a Service (SaaS) offerings can help your business start standing up an enterprise-level, cloud-based big data analytics capability, but they consistently fall short of solving your analytic needs.

The path toward big data analytics can take many twists and turns. Photo courtesy of Dirk Ercken

Distributed processing systems of this kind are notoriously hard to system-engineer. They require continual interaction between the IT department, software developers and internal end-user data analysts. That coordination can easily add weeks or months to the time it takes for developers to gain access to a Hadoop cluster. (And the larger the cluster, the longer it may take to get from IT to the developers.)

The next generation of big data analytics tools automates these hard-to-system-engineer steps. Through automation, developers can gain access to a Hadoop cluster almost immediately, as opposed to the unwieldy lengths of time it might take through conventional channels.

Next Generation of Big Data Analytics

These new SaaS offerings have made automating processes almost push-button easy, allowing quintessential infrastructure or analytics models to be built in a self-service environment.

Developers can now go to a website that provides a click-through portal for access to resources they need, based on customized patterns they define. With a few clicks of a mouse, they can have a dedicated space in their cloud, and one or many big data stacks provisioned for them. A few clicks more and they can automatically ingest data and information into the data stacks they’ve created, all with the confidence of cloud-based security to protect sensitive enterprise data.
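To make the idea of a "customized pattern" concrete, here is a minimal sketch of how such a pattern might be described and turned into the steps a self-service portal would automate. Everything in it is an assumption for illustration: the `StackPattern` class, the `provision_plan` function, the tool catalog and the account name are hypothetical and do not correspond to any particular provider's interface.

```python
from dataclasses import dataclass

# Hypothetical catalog of tools a provider might offer (illustrative only).
AVAILABLE_TOOLS = {"hadoop", "spark", "kafka", "elasticsearch", "zeppelin"}


@dataclass
class StackPattern:
    """A reusable description of one big data stack (hypothetical)."""
    name: str
    tools: set
    node_count: int = 3
    # Assumed: the stack is created in the customer's own cloud account.
    cloud_account: str = "my-company-account"

    def validate(self):
        """Reject tools outside the catalog and empty clusters."""
        unknown = self.tools - AVAILABLE_TOOLS
        if unknown:
            raise ValueError(f"unsupported tools: {sorted(unknown)}")
        if self.node_count < 1:
            raise ValueError("node_count must be at least 1")
        return True


def provision_plan(pattern):
    """Return the ordered steps a portal would automate for this pattern."""
    pattern.validate()
    steps = [f"create dedicated space in {pattern.cloud_account}"]
    steps += [
        f"provision {tool} on {pattern.node_count} nodes"
        for tool in sorted(pattern.tools)
    ]
    steps.append("ingest enterprise data")
    return steps


if __name__ == "__main__":
    pattern = StackPattern(name="analytics", tools={"spark", "kafka"}, node_count=2)
    for step in provision_plan(pattern):
        print(step)
```

The point of the sketch is that the pattern is defined once and reused, so each new request becomes a validated, repeatable plan rather than a fresh negotiation with IT.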

This way of working clearly facilitates better, faster interaction between developers and IT. That improved interaction in turn makes it easier and faster for data analysts to ingest data and begin gaining critical business insights.

Better still, automation offers a level of resilience and creates more robust big data systems, which are traditionally the most fragile part of an IT environment.

Of course, not all of these new SaaS products are created equal, and it is very difficult to cut through the marketing façade. Users need to be sure that they’ve chosen the right one for their purposes. Below are a few things to keep in mind and a few to avoid when deciding on which platform is right for you.

Three things to look for in big data SaaS:

1. A platform with comprehensive offerings. Most companies need more than just Hadoop and Spark, even if the system can spin up these services in minutes. You should find a SaaS provider that gives you a choice of a broad range of tools with different functions (Kafka, Elasticsearch, Zeppelin, etc.) – but doesn’t make you use them all. This will allow you to fully customize the way your company interacts with data, without having to take on a full load of unnecessary tools. The more options, the more you can do with your data, which is, after all, the point.

2. Systems that automatically ingest data. Provisioning clusters is a relatively easy process and not particularly new to the industry – especially for applications like Hadoop. Once you’ve spun up that cluster, though, you need an equally effective system to bring in your enterprise data and start doing the real analytics work.

3. Transparency in security. The data clusters you build should be in your own environment. That means your infrastructure should be securely hosted in your cloud accounts, where you have full access and full accountability for your infrastructure and data.

Three things to avoid:

1. Proprietary, “black box” software. One key advantage of open source is the vast choice and transparency it offers in analyzing your data. Some companies may require you to download a full suite of open source or proprietary software to work with your enterprise data. If that works for your enterprise, great. Most companies, however, find that this approach undermines the entire reason for working with open source software in the first place. In general, it’s better to find a provider that allows you to sidestep proprietary software or distributions and launch right away in your cloud.

2. Big data solutions that consume your data. Data is the most sensitive part of the equation when it comes to using a SaaS system with confidence. Make sure that your provider does not consume or escrow your data. And avoid any solutions that may host your data in their own cloud.

3. Bleeding edge. Know the difference between cutting edge and bleeding edge. Some online applications are simply not ready yet for the enterprise, so building a Hadoop cluster on one of these systems may cause more problems than it solves, despite the cool factor of saying you use the technology. Make sure the provider you choose gives you access to open source tools that are widely adopted and well understood by enterprise customers from a security, performance and cost perspective.
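The six criteria above can be turned into a simple checklist when comparing providers. The following is a minimal sketch, not a scoring method from this article: the `Provider` fields and the `evaluate` function are hypothetical names chosen to mirror the three things to look for and the three things to avoid.

```python
from dataclasses import dataclass


@dataclass
class Provider:
    """Answers gathered about one SaaS provider (hypothetical structure)."""
    name: str
    # Things to look for
    optional_tool_choice: bool        # broad catalog, none of it mandatory
    automated_ingest: bool            # brings in enterprise data automatically
    runs_in_customer_cloud: bool      # clusters live in your own cloud accounts
    # Things to avoid
    requires_proprietary_distribution: bool
    hosts_or_escrows_customer_data: bool
    relies_on_unproven_tools: bool    # "bleeding edge" rather than widely adopted


def evaluate(provider):
    """Return (missing strengths, red flags) for a provider."""
    look_for = {
        "comprehensive, optional tool choice": provider.optional_tool_choice,
        "automatic data ingest": provider.automated_ingest,
        "transparent security in your own cloud": provider.runs_in_customer_cloud,
    }
    avoid = {
        "proprietary black-box software": provider.requires_proprietary_distribution,
        "provider consumes or hosts your data": provider.hosts_or_escrows_customer_data,
        "bleeding-edge tooling": provider.relies_on_unproven_tools,
    }
    missing = [label for label, present in look_for.items() if not present]
    red_flags = [label for label, present in avoid.items() if present]
    return missing, red_flags


if __name__ == "__main__":
    candidate = Provider(
        name="ExampleCo",  # made-up provider for illustration
        optional_tool_choice=True,
        automated_ingest=False,
        runs_in_customer_cloud=True,
        requires_proprietary_distribution=False,
        hosts_or_escrows_customer_data=True,
        relies_on_unproven_tools=False,
    )
    missing, red_flags = evaluate(candidate)
    print("Missing:", missing)
    print("Red flags:", red_flags)
```

A provider with an empty list on both sides meets all six criteria; anything else shows exactly which questions to raise before committing.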

Automation holds the key to fast development of enterprise big data capability, and today’s SaaS offerings have many levels of automation. Make sure you pick the system that’s right for your current needs – and can grow with your enterprise as those needs change.

Brad Kolarov is managing partner of Stackspace, a big data technology company that simplifies data analysis for faster business decisions. He can be reached at
