Analytics Magazine

Bridging the data science gap

Given the shortage of qualified data scientists, companies are tapping the power of domain expertise from engineers.

By Seth DeLand

Data science is one of the most in-demand fields today, and there is no shortage of opportunity for those with the requisite computer science and statistics skills. Since 2012, demand for data scientists has risen 650 percent, yet there are still only 35,000 qualified data scientists working in the United States today, according to data from LinkedIn. Because many companies lack the time or resources to properly onboard and train new data scientists, some are getting creative by tapping the domain expertise of their incumbent engineers.

Engineers Unlock Data Science Tools

Domain expertise is the differentiator that will allow businesses to harness the power of data science by building on trusted relationships with existing engineers. After all, these highly skilled employees have spent countless hours designing the tools their organizations rely on every day, and their familiarity with the DNA of the business makes them suitable to develop data science programs that fit the unique anatomy of their organization.

To fully tap into the power of an engineer’s domain expertise, some companies have found that the use of big data, machine learning and software tools such as MATLAB can help reduce training times and optimize the onboarding process. Through these tools, engineers can explore different data environments, overlay different methodologies to identify inefficiencies or errors, and maximize productivity within a recurring system.

The integration of skill sets also enables engineers to establish more effective data analysis strategies. Rather than sorting through superfluous data sets, or establishing machine learning processes on inconclusive data, they can develop new algorithms that repeatedly analyze sourced data, saving hours of time and, potentially, hundreds of thousands of dollars.

New Frontier for Domain Experts

Several cases can be made for how best to leverage big data to maximize profitability within a specific industry. For example, the power of the Internet and the rise of the always-on consumer are allowing informed engineers to extract data from thousands of new sources and develop even more sophisticated demographic profiles.

The integration of skill sets enables engineers to establish more effective data analysis strategies.
Source: ThinkStock

The combination of domain expertise and data science can be applied to industries such as pharmaceuticals, where it can help companies decide between developing a more cost-effective version of a lifesaving drug or investing in new medicines. By being more adept at analyzing the data made available by the Internet, domain experts can determine outcomes such as whether patients are purchasing appropriate prescriptions, or if the cost is displacing sales. In turn, businesses can better understand at what quantities and what price points medications are most profitable, while delivering them to a greater number of people.

Perhaps one of the most beneficial areas on which engineers can focus their domain expertise is the manufacturing and production processes at their own company. While a data scientist can come in, examine the process for inefficiencies and mathematically pinpoint extraneous expenses, the engineer’s added business knowledge can help identify those inefficiencies faster, diagnose why they exist, and even build new systems that improve the overall design.

Technologies and Tools

The question now becomes, what technologies are best suited for implementing and optimizing domain expertise?

Hadoop and Spark, both powerful Apache Software Foundation projects, allow data scientists to launch a complicated set of algorithms that perform analysis on large data sets, more commonly referred to as “big data.” Yet for engineers who are accustomed to working with data in files on desktop machines, on network drives or in traditional databases, these new tools require a different way of accessing the data before analysis can even be considered. What’s more, big data often involves disk reads/writes as well as data transfers across networks, which can slow down computations, particularly for engineers who are unfamiliar with the network.

Advanced modeling tools help to address these challenges by establishing connections to big data systems like Hadoop. Engineers can operate within a familiar environment when writing code that runs both on the data sample locally and on the full dataset in the big data system. With localized development of new algorithms, engineers can also apply the algorithm to a data analysis process that is common to their workflow, which is the key to reducing the time it takes for analysis of a full dataset.
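As a rough illustration of the "develop on a local sample, then run on the full dataset" idea, the Python sketch below writes the analysis once against an iterable of chunks. It is not how any particular tool or Hadoop connector works; `RunningStats`, `summarize` and `in_chunks` are hypothetical names, and `in_chunks` merely stands in for whatever reader would stream the full dataset.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass
class RunningStats:
    """Count, mean and maximum accumulated one chunk at a time."""
    count: int = 0
    mean: float = 0.0
    maximum: float = float("-inf")

    def update(self, chunk: Iterable[float]) -> None:
        for value in chunk:
            self.count += 1
            # Incremental mean: no need to hold all values in memory.
            self.mean += (value - self.mean) / self.count
            self.maximum = max(self.maximum, value)


def summarize(chunks: Iterable[Iterable[float]]) -> RunningStats:
    """Same analysis code for a small local sample or a chunked full dataset."""
    stats = RunningStats()
    for chunk in chunks:
        stats.update(chunk)
    return stats


def in_chunks(values: List[float], size: int) -> Iterator[List[float]]:
    """Hypothetical stand-in for a reader that streams a large dataset."""
    for start in range(0, len(values), size):
        yield values[start:start + size]


# Develop against a small in-memory sample...
sample = [2.0, 4.0, 6.0]
sample_stats = summarize([sample])

# ...then run the unchanged function over the full data, chunk by chunk.
full_data = [float(v) for v in range(1, 101)]
full_stats = summarize(in_chunks(full_data, size=10))
```

Because `summarize` only ever sees one chunk at a time, the same function can be checked quickly on the sample and then pointed at a much larger source without being rewritten.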

Big data is a key enabler for implementing domain expertise in an existing workforce. In parallel, machine learning has become an essential part of the data collection process due to new algorithms that are needed to analyze the massive amount of data that’s readily available to engineers.

Whether the goal is to uncover relationships in data (unsupervised) or train a model that can predict new outcomes (supervised), there are hundreds of algorithms that can be used to develop specific models. Often the best way to figure out which algorithm will work for a problem is simply to test several and compare the results. Yet this can prove extremely challenging and time-consuming for engineers who aren’t familiar with a wide variety of interfaces.
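The "try several algorithms and compare" approach can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a recommended workflow: the data are synthetic (a noisy straight line made up for the example), the three candidate models are deliberately simple, and all function names are hypothetical.

```python
import random
from statistics import mean
from typing import Callable, List, Tuple

Data = List[Tuple[float, float]]
Model = Callable[[float], float]


def fit_mean(train: Data) -> Model:
    """Baseline: always predict the average training target."""
    avg = mean(y for _, y in train)
    return lambda x: avg


def fit_line(train: Data) -> Model:
    """Ordinary least squares with a single input feature."""
    x_bar = mean(x for x, _ in train)
    y_bar = mean(y for _, y in train)
    sxx = sum((x - x_bar) ** 2 for x, _ in train)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in train)
    slope = sxy / sxx
    return lambda x: y_bar + slope * (x - x_bar)


def fit_nearest(train: Data) -> Model:
    """1-nearest-neighbor: copy the target of the closest training input."""
    return lambda x: min(train, key=lambda row: abs(row[0] - x))[1]


def mse(model: Model, test: Data) -> float:
    """Mean squared error on held-out data."""
    return mean((model(x) - y) ** 2 for x, y in test)


# Synthetic data (an assumption for illustration): y = 3x + 2 plus noise.
rng = random.Random(0)
data = [(x, 3.0 * x + 2.0 + rng.gauss(0.0, 1.0))
        for x in (i / 10 for i in range(200))]
rng.shuffle(data)
train, test = data[:150], data[150:]

# Train every candidate on the same split and score it the same way.
candidates = {
    "mean baseline": fit_mean,
    "least squares": fit_line,
    "nearest neighbor": fit_nearest,
}
scores = {name: mse(fit(train), test) for name, fit in candidates.items()}
best = min(scores, key=scores.get)
```

Because the data are linear by construction, the least-squares fit should beat the mean baseline by a wide margin here. On real data the ranking is not known in advance, which is why scoring every candidate on the same held-out split is useful.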

As with big data, software modeling tools address this problem by providing point-and-click apps that train and compare multiple machine learning models. This is critical for engineers transitioning into a data science role because it enables them to identify “quick wins” where machine learning is an improvement over the traditional iterative process. This approach also prevents them from spending days or weeks tuning a model to a dataset that is not well-suited to machine learning.

Big data and machine learning have always been poised to bring new solutions to long-standing business problems. Now, by coupling these technologies with the domain knowledge that engineers bring to the table, these tools can be taken far beyond traditional uses for web and marketing applications. Engineers will find they’re well-positioned to tackle problems that improve business outcomes at a broader level. A task once reserved for data scientists is now in the capable hands of domain experts.

Seth DeLand is the product marketing manager, data analytics, MathWorks.
