
Analytics Magazine

Bridging the data science gap

Given the shortage of qualified data scientists, companies are tapping the domain expertise of existing employees such as engineers.

By Seth DeLand

Data science is one of the most in-demand fields today, and there is no shortage of opportunity for those with the requisite computer science and statistics skills. Since 2012, demand for data scientists has risen 650 percent, yet there are still only 35,000 qualified data scientists working in the United States, according to data from LinkedIn. Because many companies do not have the time or resources to properly onboard and train new data scientists, some are getting creative by tapping the domain expertise of their incumbent engineers.

Engineers Unlock Data Science Tools

Domain expertise is the differentiator that will allow businesses to harness the power of data science by building on trusted relationships with existing engineers. After all, these highly skilled employees have spent countless hours designing the tools their organizations rely on every day, and their familiarity with the DNA of the business makes them suitable to develop data science programs that fit the unique anatomy of their organization.

To fully tap into the power of an engineer’s domain expertise, some companies have found that the use of big data, machine learning and software tools such as MATLAB can help reduce training times and optimize the onboarding process. Through these tools, engineers can explore different data environments, overlay different methodologies to identify inefficiencies or errors, and maximize productivity within a recurring system.

The integration of skill sets also enables engineers to establish more effective data analysis strategies. Rather than sorting through superfluous data sets, or establishing machine learning processes on inconclusive data, they can develop new algorithms that repeatedly analyze sourced data, saving hours of time and, potentially, hundreds of thousands of dollars.

New Frontier for Domain Experts

Several cases can be made for how best to leverage big data to maximize profitability within a specific industry. For example, the power of the Internet and the rise of the always-on consumer are allowing informed engineers to extract data from thousands of new sources and develop even more sophisticated demographics profiles.

The combination of domain expertise and data science can be applied to industries such as pharmaceuticals, where it can help companies decide between developing a more cost-effective version of a lifesaving drug or investing in new medicines. By being more adept at analyzing the data made available by the Internet, domain experts can determine outcomes such as whether patients are purchasing appropriate prescriptions, or if the cost is displacing sales. In turn, businesses can better understand at what quantities and what price points medications are most profitable, while delivering them to a greater number of people.

Perhaps one of the most beneficial places for engineers to apply their domain expertise is the manufacturing and production processes at their own company. While a data scientist can come in, examine the process for inefficiencies and mathematically pinpoint extraneous expenses, the engineer’s added business knowledge can help find those inefficiencies faster, diagnose why the problems exist, and even build new systems that improve the overall design.

Technologies and Tools

The question now becomes, what technologies are best suited for implementing and optimizing domain expertise?

Hadoop and Spark, both powerful open-source projects of the Apache Software Foundation, allow data scientists to run complicated sets of algorithms that analyze large data sets, more commonly referred to as “big data.” Yet for engineers who are accustomed to working with data in files on desktop machines, on network drives or in traditional databases, these tools require a different way of accessing the data before analysis can even be considered. What’s more, big data often involves disk reads and writes, as well as data transfers across networks, which can slow down computations, particularly for engineers who are unfamiliar with the network.

Advanced modeling tools help to address these challenges by establishing connections to big data systems like Hadoop. Engineers can operate within a familiar environment when writing code that runs both on the data sample locally and on the full dataset in the big data system. With localized development of new algorithms, engineers can also apply the algorithm to a data analysis process that is common to their workflow, which is the key to reducing the time it takes for analysis of a full dataset.
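As a rough illustration of this workflow, the sketch below writes one analysis function and runs it first on a small local sample, then unchanged on a larger dataset streamed in chunks. It uses only the Python standard library. The names `running_stats` and `read_in_chunks` and the synthetic sensor readings are hypothetical stand-ins, not part of any tool named in this article; a real big data system would supply its own reader.

```python
import math
import random
from typing import Iterable, Iterator, List, Tuple


def running_stats(values: Iterable[float]) -> Tuple[int, float, float]:
    """Return (count, mean, sample std dev) in one pass using Welford's method.

    Works on any iterable, so the same code handles a small list
    or a stream that is never held in memory all at once.
    """
    count = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the mean
    for x in values:
        count += 1
        delta = x - mean
        mean += delta / count
        m2 += delta * (x - mean)
    std = math.sqrt(m2 / (count - 1)) if count > 1 else 0.0
    return count, mean, std


def read_in_chunks(data: List[float], chunk_size: int) -> Iterator[float]:
    """Hypothetical stand-in for a big data reader: yields values chunk by chunk."""
    for start in range(0, len(data), chunk_size):
        for x in data[start:start + chunk_size]:
            yield x


# Synthetic "full dataset" (assumption: sensor readings centered on 50.0).
rng = random.Random(0)
full_dataset = [rng.gauss(50.0, 5.0) for _ in range(100_000)]

# Step 1: develop and check the algorithm on a small local sample.
local_sample = full_dataset[:1_000]
sample_result = running_stats(local_sample)

# Step 2: run the unchanged function over the full dataset, streamed in chunks.
full_result = running_stats(read_in_chunks(full_dataset, chunk_size=10_000))

print("sample:", sample_result)
print("full:  ", full_result)
```

Because `running_stats` accepts any iterable and keeps only three running values, nothing in the algorithm has to change when the input grows from a list on a desktop machine to a stream from a remote system; only the reader is swapped.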

Big data is a key enabler for implementing domain expertise in an existing workforce. In parallel, machine learning has become an essential part of the data collection process due to new algorithms that are needed to analyze the massive amount of data that’s readily available to engineers.

Whether the goal is to uncover relationships in data (unsupervised learning) or to train a model that can predict new outcomes (supervised learning), there are hundreds of algorithms that can be used to develop specific models. Often the best way to figure out which algorithm will work for a problem is simply to test several and compare the results. Yet this can prove extremely challenging and time-consuming for engineers who aren’t familiar with a wide variety of interfaces.
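The "test them out and compare" approach can be sketched in a few lines. The example below is a minimal illustration, assuming a synthetic two-class dataset and two deliberately simple classifiers (nearest centroid and 3-nearest neighbors) written with the Python standard library only. The function names and the data are hypothetical; a real project would more likely use an established machine learning library.

```python
import math
import random
from collections import Counter


def train_nearest_centroid(X, y):
    """Return a model that predicts the class whose mean point is closest."""
    centroids = {}
    for label in set(y):
        members = [p for p, l in zip(X, y) if l == label]
        centroids[label] = (
            sum(p[0] for p in members) / len(members),
            sum(p[1] for p in members) / len(members),
        )
    return lambda p: min(centroids, key=lambda c: math.dist(p, centroids[c]))


def train_knn(X, y, k=3):
    """Return a model that takes a majority vote among the k nearest points."""
    def predict(p):
        nearest = sorted(zip(X, y), key=lambda pair: math.dist(p, pair[0]))[:k]
        return Counter(label for _, label in nearest).most_common(1)[0][0]
    return predict


def cross_val_accuracy(trainer, X, y, folds=5):
    """Average accuracy over k folds; each point is held out exactly once."""
    n = len(X)
    correct = 0
    for f in range(folds):
        test_idx = set(range(f, n, folds))
        train_X = [X[i] for i in range(n) if i not in test_idx]
        train_y = [y[i] for i in range(n) if i not in test_idx]
        model = trainer(train_X, train_y)
        correct += sum(model(X[i]) == y[i] for i in test_idx)
    return correct / n


# Synthetic data (assumption): two Gaussian clusters, one per class.
rng = random.Random(42)
points = [((rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)), 0) for _ in range(100)]
points += [((rng.gauss(3.0, 1.0), rng.gauss(3.0, 1.0)), 1) for _ in range(100)]
rng.shuffle(points)
X = [p for p, _ in points]
y = [label for _, label in points]

# Evaluate every candidate the same way, then compare.
candidates = {
    "nearest centroid": train_nearest_centroid,
    "3-nearest neighbors": train_knn,
}
results = {name: cross_val_accuracy(t, X, y) for name, t in candidates.items()}
best = max(results, key=results.get)

for name, score in results.items():
    print(f"{name}: {score:.3f}")
print("best:", best)
```

Scoring every candidate with the same held-out folds is what makes the comparison fair: each accuracy figure comes from data the model did not see during training.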

As with big data, software modeling tools address this problem by providing point-and-click apps that train and compare multiple machine learning models. This is critical for engineers transitioning into a data science role because it enables them to identify “quick wins” where machine learning is an improvement over the traditional iterative process. This approach also prevents them from spending days or weeks tuning a model to a dataset that is not well-suited to machine learning.

Big data and machine learning have always been poised to bring new solutions to long-standing business problems. Now, coupled with the domain knowledge that engineers bring to the table, these technologies can be taken far beyond their traditional uses in web and marketing applications. Engineers will find they’re well positioned to tackle problems that improve business outcomes at a broader level. A task once reserved for data scientists can now be placed in the capable hands of domain experts.

Seth DeLand is the product marketing manager for data analytics at MathWorks.
