Analytics Magazine

News & Trends: Threats and fake news, medical analysis gaps, study on deploying A.I.

Can social media data predict threats or identify fake news?

Conrad Tucker


Can publicly available data from large-scale social media networks be used to help predict catastrophic events within the country’s infrastructure, such as threats to national security, the energy system or even the economy?

Conrad Tucker, associate professor of engineering design and industrial engineering, has received funding from the U.S. Air Force to investigate whether crowd-sourced data from social media can be used to not only detect threats but also prevent catastrophic events from happening in the future.

Tucker received $342,995 for the three-year project titled, “Transforming Large Scale Social Media Networks into Data-Driven, Dynamic Sensing Systems for Modeling and Predicting Real World Threats.”

“The challenge with using data that comes from a group is that people – or algorithms – can be unreliable,” explains Tucker. “So the major thrust of this project is to create algorithms that increase the reliability of the information that you can acquire from these publicly available sources.”
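One common way to make unreliable crowd-sourced signals more trustworthy is to weight each source by its historical accuracy before aggregating. The sketch below is purely illustrative — it is not the project's actual algorithm, and the source names and weights are made up:

```python
from collections import defaultdict

def weighted_vote(reports, reliability):
    """Aggregate conflicting crowd reports about an event.

    reports:     list of (source_id, claim) pairs
    reliability: dict mapping source_id -> weight in (0, 1],
                 e.g. the source's historical accuracy
    Returns the claim with the highest total reliability weight.
    """
    scores = defaultdict(float)
    for source, claim in reports:
        scores[claim] += reliability.get(source, 0.1)  # unknown source: low trust
    return max(scores, key=scores.get)

reports = [("a", "outage"), ("b", "outage"), ("c", "no outage")]
reliability = {"a": 0.4, "b": 0.5, "c": 0.95}
print(weighted_vote(reports, reliability))  # "no outage" — one trusted source outweighs two weak ones
```

In practice the reliability weights would themselves be learned from how often each source's past reports were confirmed.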

One of the obstacles researchers face in advancing machine learning – using computers to predict outcomes – is acquiring high-quality data. Large data sets are typically costly and time-consuming to obtain. With the emergence of the Internet and social media networks, however, large sets of publicly sourced data are becoming far more readily available.

“We live in an increasingly digitally connected world, and this connectivity actually presents challenges, like volatility,” said Tucker. “If one CEO’s tweet can send a stock’s price down billions of dollars, that is a huge threat to the company and its stakeholders. That is just one example of what we are looking to model with new algorithms that can analyze and predict such chaos.”

There have been other industries in which researchers have used publicly available data as a decision-making tool, such as in healthcare. Researchers in the field have explored the concept for disease surveillance or capturing the spread of epidemics.

“Regardless of what domain you are looking at, the fundamental problem is when you go to sample data or acquire information, how do you know which pieces of data to include in your model and how do you know which ones to leave out,” said Tucker. “That’s the biggest problem and the area in which this study is seeking to make one of the more significant contributions.”

Because of the growing prevalence of connectivity worldwide, new threats continue to emerge. Predicting those threats is also a major part of this project.

“One of the major threats to society in the 21st century is the integrity of information … how do people decipher what’s real and what’s fake? How do you start preventing misinformation from being disseminated via social media?” asked Tucker. “I think that’s going to be a very difficult notion to combat, especially as algorithms become better at generating human-readable text and images. I don’t have the answer yet, but hopefully this is a good start to finding out how we can get there.”

Life sciences research technology is in the midst of a digital arms race. Photo courtesy © Kheng Ho Toh

Analysis gap puts medical R&D at risk

Life sciences research technology is in the midst of a digital arms race. Although research is leading to important discoveries, the pace of data creation is far outstripping the capacity to store and analyze it. A heavy contributor to this phenomenon is next-generation sequencing of DNA: human whole-genome data sets typically run to hundreds of gigabytes. While current figures indicate that sequence data is doubling every seven to nine months, sequencing is still in its infancy. In 2014, an estimated 228,000 genomes were sequenced; that figure is now estimated at more than 1.6 million.
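As a back-of-the-envelope check, the doubling rates above can be turned into a growth projection. The only input taken from the article is the 2014 baseline of 228,000 genomes; the 48-month horizon is an arbitrary illustration:

```python
# Rough projection of sequence-data growth under a fixed doubling period.
def projected_genomes(baseline, months_elapsed, doubling_months):
    """Exponential growth: baseline doubled every `doubling_months` months."""
    return baseline * 2 ** (months_elapsed / doubling_months)

for doubling in (7, 9):  # the article's seven-to-nine-month range
    est = projected_genomes(228_000, 48, doubling)  # four years out
    print(f"doubling every {doubling} months -> ~{est:,.0f} genomes")
```

Even at the slower nine-month doubling rate, four years of growth pushes the count into the millions.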

“This analysis gap threatens to slow the pace of important discoveries by forcing research organizations to allot an increasing portion of budgets to sharing and preserving research data,” says Jim D’Arezzo, CEO of Condusiv Technologies.

Genomics is only part of the problem. The field of connectomics maps neural connections and pathways in the brain using nanometer-resolution electron microscopy. Connectomics data sets, the largest of which are in the 100-terabyte range, are expected to reach petabyte scale soon, driven largely by faster, higher-resolution electron microscopes. Dr. Dorit Hanein of the Sanford Burnham Prebys Medical Discovery Institute, for example, says a soon-to-be-installed microscope will produce high-resolution images at a rate of 400 frames per second, 10 times the speed of her current equipment.
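To see why 400 frames per second lands in petabyte territory, a quick data-rate estimate helps. The article gives no frame size, so the sensor geometry below (4096 × 4096 pixels at 16 bits per pixel) is an assumption chosen only for illustration:

```python
# Back-of-the-envelope data rate for a 400-frame-per-second microscope.
# Frame size is an assumed 4096 x 4096 sensor at 16 bits (2 bytes) per pixel.
frame_bytes = 4096 * 4096 * 2            # ~33.6 MB per raw frame
rate_bps = frame_bytes * 400             # bytes per second at 400 fps
per_day_tb = rate_bps * 3600 * 24 / 1e12
print(f"~{per_day_tb:,.0f} TB per day of raw imagery")
```

Under those assumptions, a single instrument running continuously would emit on the order of a petabyte of raw imagery per day — before any downstream analysis multiplies it.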

Other large-scale undertakings, such as the Blue Brain Project, the 100K Genomes Project, the National Institutes of Health Human Microbiome Project, the BRAIN Initiative and the Cancer Moonshot will generate a total of hundreds of petabytes of data, and downstream analysis will generate even more. The burden of discovery in the life sciences is shifting from scientific methodologies to analytical frameworks and bioinformatics. To help facilitate this transition, universities such as Harvard have begun to offer data analytics courses, including programming, for career biologists.

The gap between data acquisition and analysis affects not only foundational research organizations but also pharmaceutical and healthcare systems developers. Life sciences companies are launching products at a more rapid rate and in a greater number of therapy areas. As manufacturers focus on these innovative launches, operating budgets remain strained by the simultaneous investments required, and IT departments face an ongoing need to do more with limited resources. At the same time, the rapid, continuing evolution of technology puts additional pressure on IT organizations to deliver both innovation and efficiency.

In a sense, notes D’Arezzo, medical research is simply coming up against a classic computing problem – the I/O bound state, in which the time it takes to complete a computation is determined principally by the period spent waiting for input/output operations to be completed.
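A minimal illustration of the I/O-bound state: in the loop below, wall-clock time is dominated by reading from disk, while the per-chunk "computation" is trivial. (The 64 MB scratch file is just a stand-in for a large research data set.)

```python
import os, tempfile, time

def io_bound_sum(path, chunk_size=1 << 20):
    """Sum the byte count of a file in 1 MB chunks.

    Runtime is I/O-bound: almost all time is spent in f.read(),
    waiting on storage, not in the arithmetic on each chunk.
    """
    total = 0
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):   # wall time is spent mostly here
            total += len(chunk)              # the "compute" step is trivial
    return total

# Write a 64 MB scratch file and time one pass over it.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(os.urandom(64 * 1024 * 1024))
    path = f.name
start = time.perf_counter()
size = io_bound_sum(path)
print(size, f"bytes in {time.perf_counter() - start:.3f}s")
os.unlink(path)
```

Speeding up such a task means speeding up the I/O path — faster or better-utilized storage — since optimizing the computation changes almost nothing.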

D’Arezzo points out: “Data analysis is inherently slower than data acquisition. It can be made a great deal faster, however, not by throwing money at new hardware, but by optimizing the performance of existing servers and storage. We are the world leaders in this area, and we have seen users of our software solutions more than double the I/O capability of storage and servers in their current configurations. For life sciences researchers grappling with rapidly expanding data sets, I/O optimization technology represents a safe, reasonably priced, highly effective way to increase their ability to perform analytics.”

As with most emerging or unfamiliar technologies, early adopters are facing many obstacles to the progress of A.I. in their organizations. Photo courtesy © marinv

Nearly half of CIOs planning to deploy artificial intelligence

Meaningful artificial intelligence (A.I.) deployments are just beginning to take place, according to Gartner, Inc. Gartner’s 2018 CIO Agenda Survey shows that 4 percent of CIOs have implemented A.I., while a further 46 percent have developed plans to do so.

“Despite huge levels of interest in A.I. technologies, current implementations remain at quite low levels,” says Whit Andrews, research vice president and distinguished analyst at Gartner. “However, there is potential for strong growth as CIOs begin piloting A.I. programs through a combination of buy, build and outsource efforts.”

As with most emerging or unfamiliar technologies, early adopters are facing many obstacles to the progress of A.I. in their organizations. Gartner analysts have identified the following four lessons that have emerged from these early A.I. projects.

1. Aim low at first. “Don’t fall into the trap of primarily seeking hard outcomes, such as direct financial gains, with A.I. projects,” says Andrews. “In general, it’s best to start A.I. projects with a small scope and aim for ‘soft’ outcomes, such as process improvements, customer satisfaction or financial benchmarking.” Expect A.I. projects to produce, at best, lessons that will help with subsequent, larger experiments, pilots and implementations. In some organizations, a financial target will be a requirement to start the project.

2. Focus on augmenting people, not replacing them. Big technological advances are often historically associated with a reduction in staff head count. While reducing labor costs is attractive to business executives, it is likely to create resistance from those whose jobs appear to be at risk. In pursuing this way of thinking, organizations can miss out on real opportunities to use the technology effectively. “We advise our clients that the most transformational benefits of A.I. in the near term will arise from using it to enable employees to pursue higher-value activities,” Andrews added. Gartner predicts that by 2020, 20 percent of organizations will dedicate workers to monitoring and guiding neural networks.

3. Plan for knowledge transfer. Conversations with Gartner clients reveal that most organizations aren’t well-prepared for implementing A.I. Specifically, they lack internal skills in data science and plan to rely to a high degree on external providers to fill the gap. Fifty-three percent of organizations in the CIO survey rated their own ability to mine and exploit data as “limited” – the lowest level. Gartner predicts that through 2022, 85 percent of A.I. projects will deliver erroneous outcomes due to bias in data, algorithms or the teams responsible for managing them. “Data is the fuel for A.I., so organizations need to prepare now to store and manage even larger amounts of data for A.I. initiatives,” says Jim Hare, research vice president at Gartner.

4. Choose transparent A.I. solutions. A.I. projects will often involve software or systems from external service providers. It’s important that some insight into how decisions are reached is built into any service agreement. “Whether an A.I. system produces the right answer is not the only concern,” says Andrews. “Executives need to understand why it is effective and offer insights into its reasoning when it’s not.” Although it may not always be possible to explain all the details of an advanced analytical model, such as a deep neural network, it’s important to at least offer some kind of visualization of the potential choices. In fact, in situations where decisions are subject to regulation and auditing, it may be a legal requirement to provide this kind of transparency.



Using machine learning and optimization to improve refugee integration

Andrew C. Trapp, a professor at the Foisie Business School at Worcester Polytechnic Institute (WPI), received a $320,000 National Science Foundation (NSF) grant to develop a computational tool to help humanitarian aid organizations significantly improve refugees’ chances of successfully resettling and integrating into a new country. Built upon ongoing work with an international team of computer scientists and economists, the tool integrates machine learning and optimization algorithms, along with complex computation of data, to match refugees to communities where they will find appropriate resources, including employment opportunities. Read more →
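At its core, matching refugees to communities is an assignment problem: pick a one-to-one placement that maximizes total predicted benefit. The toy sketch below is not the WPI team's model — the scores are invented, and a brute-force search stands in for the real optimization solvers:

```python
from itertools import permutations

# Toy matching problem: scores[i][j] is a predicted integration/employment
# score (in a real system, supplied by a machine-learning model) for
# placing family i in community j. These numbers are made up.
scores = [
    [0.8, 0.3, 0.5],   # family 0
    [0.2, 0.9, 0.4],   # family 1
    [0.6, 0.1, 0.7],   # family 2
]

def best_assignment(scores):
    """Brute-force the one-to-one assignment maximizing total score.

    Fine for a toy example; real systems use solvers such as the
    Hungarian algorithm or integer programming to scale.
    """
    n = len(scores)
    return max(
        permutations(range(n)),
        key=lambda perm: sum(scores[i][perm[i]] for i in range(n)),
    )

print(best_assignment(scores))  # (0, 1, 2): each family gets its best-fit community here
```

The interesting part in practice is not the search but the scores themselves — predicting, from data, where each family is most likely to find employment and support.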

Gartner releases Healthcare Supply Chain Top 25 rankings

Gartner, Inc. has released its 10th annual Healthcare Supply Chain Top 25 ranking. The rankings recognize organizations across the healthcare value chain that demonstrate leadership in improving human life at sustainable costs. “Healthcare supply chains today face a multitude of challenges: increasing cost pressures and patient expectations, as well as the need to keep up with rapid technology advancement, to name just a few,” says Stephen Meyer, senior director at Gartner. Read more →

Meet CIMON, the first AI-powered astronaut assistant

CIMON, the world’s first artificial intelligence-enabled astronaut assistant, made its debut aboard the International Space Station. The ISS’s newest crew member, developed and built in Germany, was called into action on Nov. 15 with the command, “Wake up, CIMON!,” by German ESA astronaut Alexander Gerst, who has been living and working on the ISS since June 8. Read more →



INFORMS Computing Society Conference
Jan. 6-8, 2019; Knoxville, Tenn.

INFORMS Conference on Business Analytics & Operations Research
April 14-16, 2019; Austin, Texas

INFORMS International Conference
June 9-12, 2019; Cancun, Mexico

INFORMS Marketing Science Conference
June 20-22, 2019; Rome, Italy

INFORMS Applied Probability Conference
July 2-4, 2019; Brisbane, Australia

INFORMS Healthcare Conference
July 27-29, 2019; Boston, Mass.

2019 INFORMS Annual Meeting
Oct. 20-23, 2019; Seattle, Wash.

Winter Simulation Conference
Dec. 8-11, 2019; National Harbor, Md.


Advancing the Analytics-Driven Organization
Jan. 28–31, 2019, 1 p.m.– 5 p.m. (live online)


CAP® Exam computer-based testing sites are available in 700 locations worldwide. Take the exam close to home and on your schedule:

For more information, go to