Analytics Magazine

Five-Minute Analyst: The force is strong with correspondence analysis

By Harrison Schramm and Matt Powers

“I am one with the data and the data is with me”
– Chirrut Imwe

This article is going to do two things I’ve never done before: the first is to include a co-author, and the second is to write about the same topic using (almost) the same data. To recap, in “The Force Awakens,” Kylo Ren fears that he will succumb to the light because he is not as dark as his hero, Darth Vader. We considered this problem in July 2016 using “Darkside Envelopment Analysis.” We repeat the data as Table 1 (spoiler alert), slightly updated to reflect the events of “Rogue One.”

Our previous work “shot first” by using data envelopment analysis implemented in MS Excel’s standard Simplex LP solver to maximize the ratio of “goods” to “bads” for each force practitioner’s achievements. To complete our training, we must unlearn, and move from mathematical optimization to correspondence analysis (CA), in this case wielding R package “ca,” an elegant weapon for a more civilized age. In this, we will create a biplot of achievements and failures, with Vader as the reference (Figure 1).

Figure 1: Correspondence analysis biplot featuring blue achievement/failure points and red force practitioner points. The black lines are Euclidean distances between non-Vader practitioners and Vader (red, near center). Increased distance implies increased dissimilarity. Ren’s Vader-distance (2.08) is the greatest of the non-Vader candidates.


By this metric, Luke is the most Vader-like. The metric also suggests that Ren’s journey to the dark side is not yet complete. CA indicator score analysis of the data, separated into achievements and failures, suggests that Vader is not necessarily the dark standard to which Ren should aspire. There is another.

“Make ten lines of code feel like a hundred!”
– Cassian Andor

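| Achievements | Vader | Ren | Luke | Palpatine |
| --- | --- | --- | --- | --- |
| Planet-sized objects destroyed | 1 | 4 | 1 | 0 |
| Force Choking / Lightning Lifting | 5 | 2 | 1 | 2 |
| Aerial Victories | 3 | 0 | 4 | 0 |
| Planets Conquered | 2 (Hoth, Cloud City) | 0 | 1 | 10 (Chancellor) |

| Failures | Vader | Ren | Luke | Palpatine |
| --- | --- | --- | --- | --- |
| Major Stations Lost | 2 | 1 | 1 | 1 |
| Temper-tantrums | 1 | 2 | 1 | 0 |
| Computer Drives Unrecovered | 2 | 1 | 0 | 0 |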

Table 1: Achievements and failures contingency table of Vader, Ren, Luke and Palpatine.
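Before any correspondence analysis, the counts in Table 1 can be held as a plain contingency table and its margins computed. The following is a minimal sketch in Python rather than the R workflow used for this article; the variable names are illustrative. The masses it computes (each total divided by the grand total) are the weights that correspondence analysis attaches to each row and column.

```python
# Table 1 as a contingency table: rows are achievements/failures,
# columns are practitioners. Python sketch; names are illustrative.
practitioners = ["Vader", "Ren", "Luke", "Palpatine"]
table = {
    "Planet-sized objects destroyed": [1, 4, 1, 0],
    "Force choking / lightning lifting": [5, 2, 1, 2],
    "Aerial victories": [3, 0, 4, 0],
    "Planets conquered": [2, 0, 1, 10],
    "Major stations lost": [2, 1, 1, 1],
    "Temper-tantrums": [1, 2, 1, 0],
    "Computer drives unrecovered": [2, 1, 0, 0],
}

# Marginal totals of the contingency table.
row_totals = {name: sum(counts) for name, counts in table.items()}
col_totals = {
    p: sum(counts[j] for counts in table.values())
    for j, p in enumerate(practitioners)
}
grand_total = sum(row_totals.values())

# Masses: each margin as a share of the grand total.
row_masses = {name: t / grand_total for name, t in row_totals.items()}
col_masses = {p: t / grand_total for p, t in col_totals.items()}

print(col_totals)   # {'Vader': 16, 'Ren': 10, 'Luke': 9, 'Palpatine': 13}
print(grand_total)  # 48
```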

These indicator scores are calculated in three steps:

  1. Transform data into a contingency table.
  2. Use R’s ca package to create biplot row/column coordinates.
  3. Perpendicularly project column points onto row point lines and measure the point-intercept distances to and from the segment endpoints, using a custom R script that performs the calculations on the coordinates made available by the ca package.
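The geometry of the third step can be sketched as follows. This is a minimal Python illustration of a perpendicular projection, not the custom R script itself; the function name is hypothetical, and the coordinates in the usage lines are made up for the example rather than taken from any ca output.

```python
import math


def project_onto_segment(point, start, end):
    """Perpendicularly project `point` onto the line through `start` and `end`.

    Returns (foot, t, dist_to_start, dist_to_end), where `foot` is the
    intercept point and `t` is its position along the segment
    (0 at `start`, 1 at `end`; outside [0, 1] if the foot falls beyond
    an endpoint).
    """
    px, py = point
    ax, ay = start
    bx, by = end
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        raise ValueError("segment endpoints must differ")
    # Scalar projection of (point - start) onto the segment direction.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    foot = (ax + t * dx, ay + t * dy)
    return foot, t, math.dist(foot, start), math.dist(foot, end)


# Made-up coordinates, for illustration only.
foot, t, d_start, d_end = project_onto_segment((1.0, 2.0), (0.0, 0.0), (4.0, 0.0))
print(foot, t, d_start, d_end)  # (1.0, 0.0) 0.25 1.0 3.0
```

The two returned distances are the point-intercept distances to the segment endpoints; how they are combined with weights into a score is not reproduced here.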

This problem has the interesting, and surprisingly common, characteristic that the data fields are not inherently ordinal. We might all agree that destroying a planet (if you’re a Sith) or a Death Star (for Jedi) is really good and that losing a Death Star is really bad, but how do aerial victories compare to force choking and/or lightning lifting? Aerial victories are achievable by half-witted, scruffy-looking nerf herders, while force choking can punish a disturbing lack of faith.

We can create a more nuanced analysis by considering the CA indicator score analysis of achievements with multiple perpendicular projections. We will start by calculating Vader’s achievement CA indicator score set (see Figure 2).

Figure 2: Vader’s projections onto all six possible achievement lines. The ratio of point intercept distances to achievement line distances combines with weight differences to compute an overall CA indicator score for each practitioner.
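The count of six follows if each achievement line joins one pair of the four achievement points in Table 1 (4 choose 2 = 6). A quick Python check under that assumption, with labels taken from Table 1:

```python
from itertools import combinations

# The four achievement categories from Table 1.
achievements = [
    "Planet-sized objects destroyed",
    "Force choking / lightning lifting",
    "Aerial victories",
    "Planets conquered",
]

# Every unordered pair of achievement points defines one candidate line.
achievement_lines = list(combinations(achievements, 2))
for a, b in achievement_lines:
    print(f"{a} <-> {b}")
print(len(achievement_lines))  # 6
```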


The general formula for calculating a single score S via projection onto line (i,j) is:

[Equation: score S for projection onto line (i, j), expressed in terms of R, w_i and w_j]

where R is the ratio of the intercept distance d* to the projection space (the length of the achievement line), and w_i and w_j are the assigned achievement weights. Applying this to our previous data, we get Table 2. Table 3 compares three final indicator score calculation methods.

| Practitioner | Achievement Score | Failure Score |
| --- | --- | --- |
| Vader | 12.44 | 5.61 |
| Luke | 9.82 | 2.38 |
| Ren | 6.70 | 5.87 |
| Palpatine | 5.60 | 1.57 |

Table 2: Force practitioner CA achievement and failure scores, sorted by achievement scores.

| Practitioner | Achievement/Failure Ratio | Normalized Difference | CA Score Difference |
| --- | --- | --- | --- |
| Luke | 4.13 | 0.84 | 7.44 |
| Palpatine | 3.57 | 0.41 | 4.03 |
| Vader | 2.22 | 0.20 | 6.83 |
| Ren | 1.14 | -0.84 | 0.83 |

Table 3: Force practitioner indicator score comparisons, sorted by achievement/failure ratios.
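Two of the three columns in Table 3 follow directly from Table 2: the ratio is the achievement score divided by the failure score, and the CA score difference is the achievement score minus the failure score. A short Python sketch of that arithmetic (variable names are illustrative); the normalized difference is left out because its normalization is not spelled out here.

```python
# Scores from Table 2: (achievement, failure).
scores = {
    "Vader": (12.44, 5.61),
    "Luke": (9.82, 2.38),
    "Ren": (6.70, 5.87),
    "Palpatine": (5.60, 1.57),
}

ratio = {name: a / f for name, (a, f) in scores.items()}
difference = {name: a - f for name, (a, f) in scores.items()}

# Sort by achievement/failure ratio, highest first, as in Table 3.
ranking = sorted(scores, key=ratio.get, reverse=True)
for name in ranking:
    print(f"{name:<10} ratio={ratio[name]:.2f} difference={difference[name]:.2f}")
# Luke       ratio=4.13 difference=7.44
# Palpatine  ratio=3.57 difference=4.03
# Vader      ratio=2.22 difference=6.83
# Ren        ratio=1.14 difference=0.83
```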

This analysis agrees broadly with our previous work, but introduces a different way to consider these types of data sets.

Harrison Schramm (Harrison.schramm@gmail.com), CAP, PStat, is a principal operations research analyst at CANA Advisors, LLC, and a member of INFORMS. Matt Powers is an operations research analyst working in the Tidewater, Va., area. In addition to Star Wars, his research interests focus on international cooperation.

A technical note: Exploratory factor analysis of the failures loads the same latent variable onto unrecovered computer drives and major stations lost, thereby confirming the relationship between increased station vulnerability and computer drive security while adding quantitative context as to why many Bothans (and others) died to retrieve the information on those drives.

A personal note: In the coming year, I don’t plan to have any regular co-authors, but would like to start bringing in some of the many padawans I’ve met along the way. It is my sincerest hope that eventually the students will become the masters.
