Analytics Magazine

Dark data: The two sides of the same coin

Organizations need to understand that any data left unexplored is an opportunity lost and a potential security risk. Photo Courtesy of | © aleksanderdn

By Ganesh Moorthy

Today, we live in a digital society. We leave distinct footprints in every interaction we make. Data generation is a default – be it from enterprise operational systems, web server logs, other applications, social interactions and transactions, research initiatives or connected things (the Internet of Things). In fact, according to a Digital Universe study, 2.2 zettabytes of data were generated in 2012. This grew by 100 percent in 2013 and is slated to grow to 44 zettabytes worldwide by 2020.

The study further states that only 0.5 percent of the data generated is actually being analyzed. The study goes on to estimate that about 25 percent of the data, if properly managed, tagged and categorized, can be consumed for other purposes.
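To put these figures in perspective, the arithmetic can be worked through in a few lines. This is only a rough sketch: it assumes the 0.5 percent and 25 percent shares apply to the 2013 volume, which the figures quoted above do not state explicitly, and the variable names are illustrative.

```python
# Figures quoted from the study, in zettabytes (ZB).
ZB_2012 = 2.2
ZB_2013 = ZB_2012 * 2   # "grew by 100 percent in 2013"
ZB_2020 = 44.0          # projected worldwide volume for 2020

# Implied compound annual growth rate between 2013 and 2020.
years = 2020 - 2013
cagr = (ZB_2020 / ZB_2013) ** (1 / years) - 1

# Assumption: both shares are applied to the 2013 volume.
analyzed_zb = ZB_2013 * 0.005   # share actually analyzed
usable_zb = ZB_2013 * 0.25      # share usable if managed, tagged and categorized

print(f"2013 volume: {ZB_2013:.1f} ZB")
print(f"Implied annual growth, 2013-2020: {cagr:.1%}")
print(f"Analyzed: {analyzed_zb:.3f} ZB, potentially usable: {usable_zb:.2f} ZB")
```

Under that assumption, the gap between what is analyzed and what could be used is roughly a factor of 50.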

Enterprises have been collecting and storing data since the dawn of the computer age; dark data has always existed. But its close correlation to big data has made it a buzzword (or buzzkill, depending on your point of view) in current times. The challenge, though, is that we are simply not equipped to deal with this constant deluge of data. Compounding the problem is the fact that most of this unanalyzed data is unstructured, so it takes more pre-processing and transformation effort to make it ready for analytical consumption. So then, how do we manage dark data?

Organizations need to understand that any data left unexplored is an opportunity lost and a potential security risk. Based on an organization’s intent and investment appetite, dark data can either be tapped to generate more opportunities or remain in the dark, forever – the two sides of the same coin. We cannot, however, manage it like a coin toss, with a 50 percent probability of achieving heads or tails. Four best practices to keep in mind:

1. Make it a conscious investment. Tapping into the potential of dark data requires organizations to make strategic decisions and investments in information protection, retention and mining. These decisions need to be owned by a centralized team that can formulate information management policies and guidelines. If possible, federate the process of executing those guidelines to business functions or departments.

2. Fetch your information from data lakes. Set up centralized data lakes or reservoirs along with the required encryption and access controls. Employ automated data classification and categorization processes for information management.

3. Metadata-fication. Some enterprise units have started employing advanced machine learning to encrypt, tag and classify at the transport level – data in motion rather than data at source. Here, it is important to differentiate between raw source data and processed data and to store them separately, with different controls in place.

4. Deep diving and data mining. While data retention and management cater to information controls for compliance, data mining generates newer opportunities. There is no denying that data can be useful in one form or another. However, data mining must have a business case associated with it. For example, if I am to provide appropriate recommendations to a customer, I will need to consider the customer's past buying trends. Toward this end, I need the past three years of customer data to generate accurate models.

Rather than sifting through a vast repository, if I can combine prioritized business problems, automated advanced data classifications and workflow systems, I would be able to generate quick results. This awareness requires education, as well as business augmentation units that employ data mining to improve customer satisfaction, increase operational efficiency or create new growth channels.
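The idea of pairing automated classification with a prioritized business case can be sketched in code. The example below is a minimal illustration, not a production approach: the keyword rules, the `Record` type and the function names are all hypothetical, and simple keyword matching stands in for the advanced machine learning mentioned above.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta

# Hypothetical keyword rules: category -> trigger words.
RULES = {
    "customer": ("order", "purchase", "invoice"),
    "security": ("login", "password", "denied"),
    "operations": ("sensor", "latency", "error"),
}

@dataclass
class Record:
    text: str
    created: date
    tags: set = field(default_factory=set)

def classify(record):
    """Tag a record with every category whose keywords appear in its text."""
    lowered = record.text.lower()
    for category, keywords in RULES.items():
        if any(word in lowered for word in keywords):
            record.tags.add(category)
    if not record.tags:
        record.tags.add("unclassified")
    return record

def select_for_business_case(records, category, today, years=3):
    """Keep only records in the given category from roughly the last `years` years."""
    cutoff = today - timedelta(days=365 * years)
    return [r for r in records if category in r.tags and r.created >= cutoff]

# Illustrative data only; dates and texts are made up for the sketch.
today = date(2017, 9, 1)
records = [classify(r) for r in [
    Record("Purchase order #1 shipped", date(2016, 5, 2)),
    Record("Invoice archived", date(2012, 1, 15)),
    Record("Failed login from unknown host", date(2017, 8, 30)),
    Record("Meeting notes", date(2015, 3, 3)),
]]
recent_customer = select_for_business_case(records, "customer", today)
print([r.text for r in recent_customer])
```

Starting from the business question (here, customer recommendations over a three-year window) keeps the mining effort focused on a small, already-tagged subset instead of the whole repository.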

Well-rounded Consumption

Dark data can contain important information about an entity, be it an individual or an organization. From an intra-organizational point of view, this information can be used for management – information containment, fraud detection and threat prevention. From an external perspective, most of the information contained in dark data can be used to build a customer 360 view.

One point to keep in mind is that dark data does not need to be an elephant in the room. All it needs is a data-first mind-set that leads to an analytics-first and, finally, an AI-first mind-set. This cause is further propelled by an implementable approach toward solving the dark data problem. There is light at the end of the tunnel. Hopefully, you are in the right tunnel to start with!

Ganesh Moorthy has over 18 years of experience in the areas of solution and product development, solution architecting and innovations. He has built award-winning IoT platforms for Industrial Internet initiatives, designed and developed near real-time machine learning systems for unstructured data anonymization and text/voice analytics, and provided solutions on big data. He currently holds the position of head of engineering at Tredence, leading a variety of engineering functions along with product development and innovation.
