
Analytics Magazine

The rise of self-service analytics

Effectively integrating disparate data from different systems and devices requires a complete understanding of the organization’s “data map.” Photo Courtesy of | mindscanner

As self-service analytics (SSA) gains momentum, so does the need for data governance to drive true business value.

By Paul Brunet

With the rise of big data – and the processes and tools related to utilizing and managing large data sets – organizations are recognizing the value of data as a critical business asset to identify trends, patterns and preferences to drive improved customer experiences and competitive advantage. The problem is, users too often can’t find the data they need to perform desired analytics. Data tends to be buried in different systems or siloed in departments across the organization.

This data chaos and uncertainty is costing businesses big money – as much as $3.1 trillion, according to a recent Harvard Business Review study. In addition to the time wasted searching for data, individual interpretation of the data through a subjective lens can result in inconsistencies that adversely affect a company’s business.

Making Trust a Priority for Reliable Analytics

The industry has seen a surge of self-service analytics (SSA) tools such as Tableau and Qlik that enable analysts and non-technical business users to gain insights and drive data-focused initiatives. SSA and business intelligence (BI) tools empower knowledge workers and business users to gather desired insights without relying on IT to run reports.

However, investing in analytics tools alone can’t deliver business value. For an SSA tool to do its job, companies need to ensure that the people using the tool can easily access the data they need across the organization – including siloed data living in various systems – and have full confidence in this data to apply it for greater business insight and results. Effectively integrating disparate data from different systems and devices requires a complete understanding of the organization’s “data map” and the data’s journey and relationship to other similar – or sometimes contradictory – data throughout the organization. This is best achieved through data governance.

Data governance offers a collaborative framework for managing and defining enterprise-wide policies, business rules and assets to provide all business users with high-quality data that is easily accessible in a consistent manner. By adopting an overall policy through governance, users can determine data inventory, data ownership, critical data elements (CDE), data quality, information security, data lineage and data retention so they have a good understanding of the data across the organization and its meaning. True data governance breaks down data silos so users can find the trusted data they need, collaborate on it, and easily understand it so it’s consumable to drive competitive advantage. This is the new order of data governance today.
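The governance attributes listed above can be pictured as metadata attached to each data asset. Here is a minimal sketch in Python; every name and value is purely illustrative, not any vendor’s actual schema:

```python
from dataclasses import dataclass

# Hypothetical sketch of the governance attributes named above,
# attached to a single data asset. All fields are illustrative.
@dataclass
class GovernedAsset:
    name: str                     # entry in the data inventory
    owner: str                    # assigned data owner
    critical_elements: list[str]  # critical data elements (CDEs)
    quality_score: float          # data quality measure (0.0-1.0)
    classification: str           # information security label
    lineage: list[str]            # upstream sources this asset derives from
    retention_days: int           # data retention policy

orders = GovernedAsset(
    name="sales.orders",
    owner="finance-team",
    critical_elements=["order_id", "customer_id", "amount"],
    quality_score=0.97,
    classification="confidential",
    lineage=["crm.customers", "web.checkout_events"],
    retention_days=365 * 7,
)
```

In practice a governance platform stores this metadata centrally, so every SSA user sees the same owner, lineage and retention policy for a given asset.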

Ensuring data trust has become one of the most critical factors to driving successful BI initiatives. When users know they can trust the data, they are more likely to use it for business insight. And this element of trust becomes even more critical when we look at automated analysis through machine learning, a growing trend that offers great business value. The ability to sift through large volumes of data and draw conclusions can move a business forward. But in a business world where the volume of data has become increasingly massive, it’s impossible to manage this without automation.

Empowering Data Discovery for Greater Insight

Data governance is critical to the success of self-service BI models because it provides consistent and reliable data across the organization in a unified form that algorithms can understand. Business users also need to know how to best explore this data on their own, without relying on IT’s hand-holding, to achieve optimal discovery and insight.

Initial user training on analytics tools is essential before getting started on any project. But even for those who possess a good understanding of how to use these tools, agreeing upfront on definitions and KPIs of BI models is essential, as is knowing whether certain reports already exist before duplicating efforts. This is another common bottleneck to leveraging SSA tools’ full potential – research shows 70 percent of a data analyst’s time is spent preparing and analyzing data for questions already answered. The more visibility and understanding users have of the data, the more informed decisions they can make regarding which models to explore.

Consumerizing Data Discovery and Analytics via a Data Catalog

Deploying a data catalog as part of the data governance solution provides business users with a more strategic and simplified data overview. A data catalog organizes useful collections of data across existing boundaries. Whether those boundaries are systems, organizations or geographies, this cross-boundary visibility drives many of data’s more significant insights. This broader understanding of data through governance and a catalog empowers the user with a consumable data experience for capitalizing on analytics. Users can harness the expertise of data citizens around the organization, too, and use the catalog infrastructure to enable them to share their work. This gives the user a clear idea of which reports already exist and results in more effective BI analytics and reporting.

Figure 1: Self-service analytics

Through a catalog, business users can easily find (or shop) for the trusted data they need from one central location, just as they do on consumer sites such as Amazon. A catalog automatically links business terms from the business glossary to registered tables and columns, and leverages the organization’s agreed-upon vocabulary, providing users with a better understanding of the data’s context. This helps users determine whether that data is a good fit for the analysis in question. And different views of the data provide different aspects, which feeds into the analysis.
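As a rough illustration of that term-to-column linking, the sketch below matches catalog column names against a business glossary. The names and the simple normalization rule are hypothetical, not any catalog product’s actual mechanism:

```python
# Illustrative sketch (not a vendor API): auto-link business-glossary
# terms to registered catalog columns using a shared vocabulary.
glossary = {
    "customer id": "Unique identifier assigned to a customer at signup.",
    "order amount": "Total value of an order in USD, including tax.",
}

catalog_columns = ["customer_id", "order_amount", "ship_date"]

def normalize(name: str) -> str:
    """Map a column name onto glossary vocabulary (snake_case -> words)."""
    return name.replace("_", " ").lower()

# Link each column to its glossary definition where one exists, giving
# users the data's business context alongside its physical location.
links = {
    col: glossary[normalize(col)]
    for col in catalog_columns
    if normalize(col) in glossary
}
```

Columns with no glossary match (here, `ship_date`) stay unlinked, which itself is a useful governance signal: the vocabulary has a gap to fill.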

If the data users are searching for is not properly cataloged, the self-service tools will not yield valuable results. A data catalog that incorporates machine learning delivers even greater insight, making recommendations based on past user behaviors and “data purchases,” much as Amazon does for frequent shoppers. A catalog not only makes it easier and quicker for users to find the data for decision-making, but also enables users to define models earlier in the process. This is particularly helpful for making changes on the fly, as with last-minute changes to definitions or KPIs.

To realize the full value of SSA tools, the data catalog should support five capabilities:

1. Consumability. Choosing a data catalog and self-service analytics tools with a user-friendly interface is important to business users who may not be tech savvy. Simple drag-and-drop functionality, intuitive mapping and navigation, and easy-to-read help sections are imperative. The data catalog should go beyond structural and usage metadata and provide an easy “data shopping” experience with the complete meaning, lineage and relationships of consistent and trusted governed data for the business user to capitalize on analytics for business advantage.

2. Business modeling. A catalog with out-of-the-box operating models complements self-service analytics tools by providing a flexible structure for consumable information about any type of data. This functionality links to the data sources, business applications, data lakes, data quality systems and all other sources of metadata to create a responsive system – essentially aligning the data to the business. These connections enable changes to be detected and policies applied immediately, without manual steps, driving active data governance. The operating model enables the business user to create analytics models with specific definitions and KPIs, search easily for data with a full view and understanding of where it comes from and how it differs, and trust that information because the data is linked to the data governance platform.

For example, when running analytics reports on customer loyalty, the data collected from website interaction and that of financial transactions on the backend can offer vital clues to customer buying behaviors, even when they have different meanings. This knowledge and understanding of the different but trusted data and meanings promotes data consumability and drives accurate insights and predictions.

3. Collaboration. After users have created their analytics models and prepared their data, they need the ability to explore the data in a way that suits their needs and objectives. First, they need to ensure algorithms are applied correctly and business rules are added; collaboration is also critical to how these tools are used. A data catalog integrates the work of colleagues who may be looking for similar analyses, and can point the user to data sets they may have already created. This simplifies the process of finding relevant and comparable data and saves time if similar reports have already been developed. Because users can tag, document and annotate data sets in the data catalog, the data is continuously enriched, increasing its value, eliminating data silos and encouraging collaboration and crowdsourcing.

4. Trust. Because policies and rules are applied to the data through governance, data owners have been assigned, and changes and updates are made consistently, users can feel confident that the data they’re using for analytics (including shared data) is data they are allowed to use, has been approved, and complies with data privacy regulations. This is particularly essential given today’s increased regulatory environment and the nature of analytics, which pulls data insights from different sources (including customer information) to apply them in external efforts.

5. Machine learning. By incorporating machine learning functionality via semantic search capabilities, the catalog can serve up increasingly relevant data to users over time and offer an automated and efficient way to improve data searches to be used in analysis. This is the Amazon-like feature mentioned earlier, “consumerizing” the user’s data discovery experience and analytical capabilities.
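One simple way to picture that recommendation idea is item co-occurrence: suggest datasets that other users frequently accessed alongside the ones a user already consumes. The sketch below uses invented session data and is only a toy stand-in for the machine-learning approaches a real catalog might use:

```python
from collections import Counter
from itertools import combinations

# Hypothetical usage sessions: each set holds the datasets one user
# "purchased" (accessed) together. All dataset names are illustrative.
sessions = [
    {"sales.orders", "crm.customers"},
    {"sales.orders", "crm.customers", "web.clicks"},
    {"crm.customers", "support.tickets"},
]

# Count how often each pair of datasets appears in the same session.
co_counts = Counter()
for session in sessions:
    for a, b in combinations(sorted(session), 2):
        co_counts[(a, b)] += 1
        co_counts[(b, a)] += 1

def recommend(dataset: str, top_n: int = 2) -> list[str]:
    """Rank other datasets by how often they co-occur with `dataset`."""
    scores = {b: n for (a, b), n in co_counts.items() if a == dataset}
    return sorted(scores, key=scores.get, reverse=True)[:top_n]
```

Calling `recommend("sales.orders")` here surfaces `crm.customers` first, because two of the three sessions used both; a production catalog would refine this with semantic search and richer user signals.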

As organizations adopt more self-service tools for BI and expand their analytical capabilities, leveraging a data catalog with these capabilities tied to data governance will give them confidence in knowing their business insights are based on trusted data. This is when we’ll start to see the true value of SSA tools in helping to drive business forward.

Paul Brunet is vice president of product marketing at Collibra.
