Analytics Magazine

Dark data: The two sides of the same coin

Organizations need to understand that any data left unexplored is an opportunity lost and a potential security risk. Photo Courtesy of 123rf.com | © aleksanderdn


By Ganesh Moorthy

Today, we live in a digital society. We leave distinct footprints in every interaction we make. Data generation is the default, whether from enterprise operational systems, web server logs, other applications, social interactions and transactions, research initiatives or connected things (the Internet of Things). In fact, according to a Digital Universe study, 2.2 zettabytes of data were generated in 2012. This grew by 100 percent in 2013 and is slated to grow to 44 zettabytes worldwide by 2020.

The study further states that only 0.5 percent of the data generated is actually being analyzed. The study goes on to estimate that about 25 percent of the data, if properly managed, tagged and categorized, can be consumed for other purposes.

Enterprises have been collecting and storing data since the age of computers; dark data has always existed. But its close correlation to big data has made it a buzzword (or buzzkill, depending on your point of view) in current times. The challenge, though, is that we are simply not equipped to deal with this constant deluge of data. Compounding the problem, most of this unanalyzed data is unstructured, so it takes more pre-processing and transformation effort to make it ready for analytical consumption. So then, how do we manage dark data?

Organizations need to understand that any data left unexplored is an opportunity lost and a potential security risk. Based on an organization’s intent and investment appetite, dark data can either be tapped to generate more opportunities or remain in the dark, forever – the two sides of the same coin. We cannot, however, manage it like a coin toss, with a 50 percent probability of achieving heads or tails. Four best practices to keep in mind:

1. Make it a conscious investment. Tapping into the potential of dark data requires organizations to make strategic decisions and investments in information protection, retention and mining. These decisions need to be owned by a centralized team that can formulate information management policies and guidelines. If possible, federate the execution of those guidelines to business functions or departments.

2. Fetch your information from data lakes. Set up centralized data lakes or reservoirs along with the required encryption and access controls. Employ automated data classification and categorization processes for information management.

3. Metadata-fication. Some enterprise units have started employing advanced machine learning to encrypt, tag and classify data at the transport level, that is, data in motion rather than data at source. Here, it is important to differentiate between raw source data and processed data, and to store them separately with different controls in place.

4. Deep diving and data mining. While data retention and management cater to information controls for compliance, data mining generates newer opportunities. There is no denying that data can be useful in one form or another. However, data mining must have a business case associated with it. For example, if I am to provide appropriate recommendations to a customer, I will need to consider that customer’s past buying trends. To this end, I need the past three years of customer data to generate accurate models.

Rather than sifting through a vast repository, if I can combine prioritized business problems, automated advanced data classification and workflow systems, I will be able to generate quick results. This awareness requires education, along with business augmentation units that employ data mining to improve customer satisfaction, increase operational efficiency or create new growth channels.
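To make the idea of combining automated classification with a prioritized business problem more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not a description of any particular product: the keyword rules, the category names, the `TaggedRecord` type and the sample records are all hypothetical, and a real system would more likely rely on a trained model than on hand-written patterns.

```python
import re
from dataclasses import dataclass

# Hypothetical keyword rules standing in for an automated classifier.
CATEGORY_RULES = {
    "customer": re.compile(r"\b(customer|order|purchase|invoice)\b", re.I),
    "security": re.compile(r"\b(login|password|denied|unauthorized)\b", re.I),
    "operations": re.compile(r"\b(error|timeout|latency|restart)\b", re.I),
}

# Very rough e-mail pattern used only to flag records that may need stricter controls.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


@dataclass
class TaggedRecord:
    text: str          # the raw record, kept unchanged
    categories: list   # metadata tags assigned by the rules
    sensitive: bool    # True if the record appears to contain personal data


def tag_record(text):
    """Attach category and sensitivity metadata to one raw record."""
    categories = [name for name, rule in CATEGORY_RULES.items() if rule.search(text)]
    return TaggedRecord(
        text=text,
        categories=categories or ["uncategorized"],
        sensitive=bool(EMAIL.search(text)),
    )


def select_for_business_case(records, category):
    """Pull only the records relevant to one prioritized business problem."""
    return [record for record in records if category in record.categories]


# Hypothetical sample of otherwise unexamined records.
raw_records = [
    "2017-03-01 login denied for user jane@example.com",
    "Customer 1042 placed an order for 3 items",
    "Service restart after timeout on node 7",
    "Meeting notes: lunch menu for Friday",
]

tagged = [tag_record(text) for text in raw_records]
customer_records = select_for_business_case(tagged, "customer")

for record in tagged:
    print(record.categories, record.sensitive, record.text)
```

The raw text is kept unchanged next to the derived tags, which mirrors the advice to treat raw source data and processed data separately. Selecting by category before any modeling is one way to avoid sifting through the whole repository for every question.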

Well-rounded Consumption

Dark data can contain important information about an entity, be it an individual or an organization. From an intra-organizational point of view, this information can be used for management: information containment, fraud detection and threat prevention. From an external perspective, most of the information contained in dark data can be used to build a 360-degree view of the customer.

One point to keep in mind is that dark data does not need to be an elephant in the room. All it needs is a data-first mind-set, leading to an analytics-first and finally an AI-first mind-set. This cause is further propelled by an implementable approach toward solving the dark data problem. There is light at the end of the tunnel. Hopefully, you are in the right tunnel to start with!

Ganesh Moorthy has over 18 years of experience in the areas of solution and product development, solution architecture and innovation. He has built award-winning IoT platforms for Industrial Internet initiatives, and designed and developed near real-time machine learning systems for unstructured data anonymization and text/voice analytics, along with providing solutions on big data. He currently holds the position of head of engineering at Tredence, leading a variety of engineering functions along with product development and innovation.
