Analytics Magazine

Executive Edge: Graph databases, journalists & the Panama Papers

Mining huge data sets: The powerful technology behind one of the biggest data leaks in history.

By Emil Eifrem

The Panama Papers, the unprecedented leak of 11.5 million files from the database of the global law firm Mossack Fonseca, opened up the offshore tax accounts of the rich, famous and powerful, laying bare how they have exploited secretive offshore tax regimes for decades. At 2.6 terabytes of data, the Panama Papers is the biggest data leak in history, towering over the U.S. diplomatic cables released by WikiLeaks in 2010 and, more recently, the intelligence documents handed over by Edward Snowden.

The investigation into the Panamanian law firm’s dealings and those of its elite clients was the direct result of work carried out by journalists at the International Consortium of Investigative Journalists (www.icij.org). Such was the scale of the data that more than 370 reporters from 80 countries worked on it for a year. As part of its endeavors, the ICIJ also released a searchable database of 300,000 entities harvested from the Panama Papers and its Offshore Leaks investigation.

Key Takeaways

The Panama Papers exposed the murky side of offshore accounts, identifying high-ranking government and public officials and pushing some out of office. But another major aspect stands out: the power of the data itself and how it was sifted. It wasn’t searched and analyzed by experienced data scientists, but by a team of journalists, many of whom would not describe themselves as particularly technical.

How did the journalists manage to pick out meaningful information from such huge, unstructured files? The answer is graph database technology, which enabled them to surface connections within the data, much like joining the dots to form a picture.

Mar Cabra, head of the data and research unit at the ICIJ, has described graph database technology as “a revolutionary discovery tool that’s transformed our investigative journalism process.”

The defining strength of graph databases is their ability to spot and understand relationships in data at huge scale. Unlike relational databases, which store information in rigid tables, graph databases store data as nodes, edges and properties, recording the links between entities directly so they can be traversed on demand.
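To make that concrete, here is a minimal sketch of the property-graph model using the official Neo4j Python driver. The connection details, node labels (Officer, Entity) and relationship type (OFFICER_OF) are illustrative assumptions loosely inspired by the offshore-leaks material, not the ICIJ’s actual schema.

```python
# Minimal sketch of the property-graph model with the official Neo4j
# Python driver (pip install neo4j). URI, credentials, labels and the
# relationship type are illustrative assumptions, not a real schema.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687",
                              auth=("neo4j", "password"))

def add_officer_of(tx, person, company):
    # One statement creates both nodes (if they don't already exist)
    # and the relationship between them; there is no join table to
    # maintain, because the relationship itself is stored as data.
    tx.run(
        "MERGE (p:Officer {name: $person}) "
        "MERGE (c:Entity {name: $company}) "
        "MERGE (p)-[:OFFICER_OF]->(c)",
        person=person, company=company,
    )

with driver.session() as session:
    # execute_write requires driver v5+; older drivers use write_transaction.
    session.execute_write(add_officer_of, "Jane Doe", "Example Holdings Ltd.")

driver.close()
```

A relational design would need separate person and company tables plus a join table for the link; here the link is a first-class record that queries can walk directly.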

This is a boon for investigative journalists, but it is also a powerful tool for any business looking to tackle big data and connected data issues.

Graph Connections

Graph databases are an efficient way to make sense of terabytes of connected data. Why? Because unlike relational databases, which break data down into tables, graph databases use a structure that mirrors the way humans intuitively look at information. Once the data model is implemented on a scalable architecture, a graph database is unbeatable at analyzing the connections in large, complex data sets, enabling any business to build and manipulate big, connected data structures with ease.
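As a rough illustration of what analyzing those connections looks like, the hypothetical query below, built on the same assumed Officer/Entity model sketched earlier, asks for every chain of relationships up to four hops long linking two people of interest; the relational equivalent would require a stack of self-joins whose depth had to be fixed in advance.

```python
# Sketch: a variable-length path query over the hypothetical
# Officer/Entity model above; connection details are assumptions.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687",
                              auth=("neo4j", "password"))

# Find up to 10 chains of 1-4 relationships linking two named officers.
query = """
MATCH path = (a:Officer {name: $name_a})-[*1..4]-(b:Officer {name: $name_b})
RETURN path
LIMIT 10
"""

with driver.session() as session:
    for record in session.run(query, name_a="Jane Doe", name_b="John Smith"):
        # Each record carries the full chain of nodes and relationships
        # joining the two people of interest.
        print(record["path"])

driver.close()
```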

Tech giants such as Google, Facebook and LinkedIn have recognized the power of graph databases for some time. For example, the Facebook and LinkedIn features that map networks and connections in real time, letting us walk through our social networks, are built on graph technology. Now that graph database technology has started to go mainstream, this highly scalable connected-data analysis is available to all organizations, from startups to blue chips and government.
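The textbook example is a “people you may know” feature. The sketch below, over an assumed (:Person)-[:KNOWS]->(:Person) social graph, finds friends-of-friends who are not yet direct connections, ranked by mutual acquaintances, in a single short traversal.

```python
# Sketch: friend-of-friend recommendations over a hypothetical
# (:Person)-[:KNOWS]->(:Person) graph; connection details assumed.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687",
                              auth=("neo4j", "password"))

fof_query = """
MATCH (me:Person {name: $name})-[:KNOWS]->(:Person)-[:KNOWS]->(candidate:Person)
WHERE NOT (me)-[:KNOWS]->(candidate) AND candidate <> me
RETURN candidate.name AS name, count(*) AS mutual_friends
ORDER BY mutual_friends DESC
LIMIT 5
"""

with driver.session() as session:
    for record in session.run(fof_query, name="Jane Doe"):
        # Candidates ranked by how many acquaintances they share with "me".
        print(record["name"], record["mutual_friends"])

driver.close()
```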

Graph databases are set to come into their own with the Internet of Things (IoT), where billions of connected devices mean dealing with petabytes of data. Graph databases will enable enterprises to mine data in ways that just aren’t possible using data warehouses and relational database technology. Graph technology is increasingly becoming the tool of choice for international agencies, governments, financial services companies and enterprises looking to make real-time connections between data and discover the patterns that make up their relationships.

We will undoubtedly be hearing more about the power of graph databases in the business world as more and more organizations latch on to the unique capabilities they offer.


Emil Eifrem is co-founder and CEO of Neo Technology (http://neo4j.com/), developers of the graph database Neo4j.
