
Analytics Magazine

Bridging the data science gap

Given the shortage of qualified data scientists, companies are tapping the power of domain expertise from engineers.

By Seth DeLand

Data science is one of the most in-demand fields today, and there is no shortage of opportunity for those with the requisite computer science and statistics skills. Since 2012, demand for data scientists has risen 650 percent, yet there are still only 35,000 qualified data scientists working in the United States, according to data from LinkedIn. Because many companies lack the time or resources to properly onboard and train new data scientists, some are getting creative by tapping the domain expertise of their incumbent engineers.

Engineers Unlock Data Science Tools

Domain expertise is the differentiator that will allow businesses to harness the power of data science by building on trusted relationships with existing engineers. After all, these highly skilled employees have spent countless hours designing the tools their organizations rely on every day, and their familiarity with the DNA of the business makes them suitable to develop data science programs that fit the unique anatomy of their organization.

To fully tap into the power of an engineer’s domain expertise, some companies have found that the use of big data, machine learning and software tools such as MATLAB can help reduce training times and optimize the onboarding process. Through these tools, engineers can explore different data environments, overlay different methodologies to identify inefficiencies or errors, and maximize productivity within a recurring system.

The integration of skill sets also enables engineers to establish more effective data analysis strategies. Rather than sorting through superfluous data sets, or establishing machine learning processes on inconclusive data, they can develop new algorithms that repeatedly analyze sourced data, saving hours of time and, potentially, hundreds of thousands of dollars.

New Frontier for Domain Experts

Several cases can be made for how best to leverage big data to maximize profitability within a specific industry. For example, the power of the Internet and the rise of the always-on consumer are allowing informed engineers to extract data from thousands of new sources and develop even more sophisticated demographics profiles.

The integration of skill sets enables engineers to establish more effective data analysis strategies.

Source: ThinkStock

The combination of domain expertise and data science can be applied to industries such as pharmaceuticals, where it can help companies decide between developing a more cost-effective version of a lifesaving drug or investing in new medicines. By being more adept at analyzing the data made available by the Internet, domain experts can determine outcomes such as whether patients are purchasing appropriate prescriptions, or if the cost is displacing sales. In turn, businesses can better understand at what quantities and what price points medications are most profitable, while delivering them to a greater number of people.

Perhaps one of the most beneficial areas for an engineer to focus their domain expertise is on the manufacturing and production processes at their own company. While a data scientist can come in and examine the process for inefficiencies and mathematically pinpoint extraneous expenses, the engineer’s added business knowledge can help pinpoint program inefficiencies at a faster rate, diagnose why these problems exist, and even build new systems to improve the overall system design.

Technologies and Tools

The question now becomes, what technologies are best suited for implementing and optimizing domain expertise?

Hadoop and Spark, both powerful open-source projects of the Apache Software Foundation, allow data scientists to run complicated sets of algorithms that analyze large data sets, more commonly referred to as “big data.” Yet for engineers who are accustomed to working with data in files on desktop machines, on network drives or in traditional databases, these tools require a different way of accessing the data before analysis can even be considered. What’s more, big data work often involves disk reads and writes, as well as data transfers across networks, which can slow down computations for engineers who are unfamiliar with the network.

Advanced modeling tools help to address these challenges by establishing connections to big data systems like Hadoop. Engineers can operate within a familiar environment when writing code that runs both on the data sample locally and on the full dataset in the big data system. With localized development of new algorithms, engineers can also apply the algorithm to a data analysis process that is common to their workflow, which is the key to reducing the time it takes for analysis of a full dataset.
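The idea of writing code once and running it on both a local sample and the full dataset can be illustrated with a short sketch. This is not how any particular modeling tool does it; it only assumes that the full data arrives in chunks, and it uses Welford's one-pass algorithm so the analysis never needs the whole dataset in memory. The function names and the `read_in_chunks` helper are hypothetical, and the "full" dataset here is synthetic.

```python
from typing import Iterable, Iterator, List, Tuple


def running_mean_and_variance(chunks: Iterable[List[float]]) -> Tuple[int, float, float]:
    """Welford's online algorithm: one pass over the data, constant memory.

    Returns (count, mean, sample variance).
    """
    count, mean, m2 = 0, 0.0, 0.0
    for chunk in chunks:
        for x in chunk:
            count += 1
            delta = x - mean
            mean += delta / count
            m2 += delta * (x - mean)
    variance = m2 / (count - 1) if count > 1 else 0.0
    return count, mean, variance


def read_in_chunks(values: List[float], chunk_size: int) -> Iterator[List[float]]:
    """Hypothetical stand-in for a reader that streams a large dataset piece by piece."""
    for start in range(0, len(values), chunk_size):
        yield values[start:start + chunk_size]


# Develop and check the analysis against a small local sample held in memory...
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
sample_stats = running_mean_and_variance([sample])

# ...then run the unchanged function over the "full" data, streamed in chunks.
# (Synthetic data for illustration; a real job would read from the big data system.)
full_data = [float(i % 10) for i in range(10_000)]
full_stats = running_mean_and_variance(read_in_chunks(full_data, chunk_size=1_000))

print("sample:", sample_stats)
print("full:  ", full_stats)
```

The design point is that the analysis function depends only on "something that yields chunks," so swapping the local sample for a streamed source requires no change to the algorithm itself.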

Big data is a key enabler for implementing domain expertise in an existing workforce. In parallel, machine learning has become an essential part of the data collection process due to new algorithms that are needed to analyze the massive amount of data that’s readily available to engineers.

Whether the goal is to uncover relationships in data (unsupervised) or train a model that can predict new outcomes (supervised), there are hundreds of algorithms that can be used to develop specific models. Often the best way to figure out which algorithm will work for a problem is to simply test several and compare the results. Yet this can prove extremely challenging and time-consuming for engineers who aren’t familiar with a wide variety of interfaces.

As with big data, software modeling tools address this problem by providing point-and-click apps that train and compare multiple machine learning models. This is critical for engineers transitioning into a data science role because it enables them to identify “quick wins” where machine learning is an improvement over the traditional iterative process. This approach also prevents them from spending days or weeks tuning a model to a dataset that is not well-suited to machine learning.
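The "train several models and compare" workflow described above can be sketched in a few lines. This is a toy illustration under stated assumptions, not a reproduction of any point-and-click app: the three candidate models are deliberately simple stand-ins, the data is synthetic, and all names (`fit_mean`, `fit_line`, `fit_nearest_neighbor`, `compare_models`) are hypothetical.

```python
import random
from typing import Callable, Dict, List

Model = Callable[[float], float]


def fit_mean(xs: List[float], ys: List[float]) -> Model:
    """Baseline: always predict the average of the training targets."""
    avg = sum(ys) / len(ys)
    return lambda x: avg


def fit_line(xs: List[float], ys: List[float]) -> Model:
    """Ordinary least squares for one input (assumes xs are not all identical)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return lambda x: intercept + slope * x


def fit_nearest_neighbor(xs: List[float], ys: List[float]) -> Model:
    """1-nearest-neighbor: predict the target of the closest training point."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


def rmse(model: Model, xs: List[float], ys: List[float]) -> float:
    """Root-mean-square error of a model on the given points."""
    return (sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)) ** 0.5


def compare_models(candidates, train, test) -> Dict[str, float]:
    """Fit every candidate on the training split and score it on the held-out split."""
    return {name: rmse(fit(*train), *test) for name, fit in candidates.items()}


# Synthetic data for illustration only: a noisy linear relationship.
rng = random.Random(0)
xs = [rng.uniform(0.0, 10.0) for _ in range(300)]
ys = [3.0 * x + 2.0 + rng.gauss(0.0, 1.0) for x in xs]
train, test = (xs[:200], ys[:200]), (xs[200:], ys[200:])

candidates = {
    "mean baseline": fit_mean,
    "linear fit": fit_line,
    "nearest neighbor": fit_nearest_neighbor,
}
results = compare_models(candidates, train, test)
ranking = sorted(results.items(), key=lambda item: item[1])

for name, error in ranking:
    print(f"{name:>16}: held-out RMSE = {error:.3f}")
```

Scoring on data the models have not seen is what makes the comparison meaningful. If no candidate clearly beats the simple baseline, that is an early signal the dataset may not be well suited to machine learning.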

Big data and machine learning have always been poised to bring new solutions to long-standing business problems. Now, by coupling these technologies with the domain knowledge that engineers bring to the table, companies can take these tools far beyond their traditional uses in web and marketing applications. Engineers will find they’re well-positioned to tackle problems that improve business outcomes at a broader level. A task once reserved for data scientists is now in the capable hands of domain experts.

Seth DeLand is the product marketing manager, data analytics, MathWorks.
