Analytics Magazine

Five-Minute Analyst: The force is strong with correspondence analysis

By Harrison Schramm and Matt Powers

“I am one with the data and the data is with me”
– Chirrut Imwe

This article is going to do two things I’ve never done before: the first is to include a co-author, and the second is to write about the same topic using (almost) the same data. To recap, in “The Force Awakens,” Kylo Ren fears that he will succumb to the light because he is not as dark as his hero, Darth Vader. We considered this problem in July 2016 using “Darkside Envelopment Analysis.” We repeat that data here as Table 1 (spoiler alert), slightly updated to reflect the events of “Rogue One.”

Our previous work “shot first” by using data envelopment analysis implemented in MS Excel’s standard Simplex LP solver to maximize the ratio of “goods” to “bads” for each force practitioner’s achievements. To complete our training, we must unlearn, and move from mathematical optimization to correspondence analysis (CA), in this case wielding the R package “ca,” an elegant weapon for a more civilized age. With it, we create a biplot of achievements and failures, with Vader as the reference (Figure 1).

Figure 1: Correspondence analysis biplot featuring blue achievement/failure points and red force practitioner points. The black lines are Euclidean distances between non-Vader practitioners and Vader (red, near center). Increased distance implies increased dissimilarity. Ren’s Vader-distance (2.08) is the greatest of the non-Vader candidates.

By this metric, Luke is the most Vader-like. It also suggests that Ren’s journey to the dark side is not yet complete. CA indicator score analysis of the data, separated into achievements and failures, suggests that Vader is not necessarily the dark standard to which Ren should aspire. There is another.

“Make ten lines of code feel like a hundred!”
– Cassian Andor

Achievements                         Vader                  Ren   Luke   Palpatine
Planet-sized objects destroyed       1                      4     1      0
Force Choking / Lightning Lifting    5                      2     1      2
Aerial Victories                     3                      0     4      0
Planets Conquered                    2 (Hoth, Cloud City)   0     1      10 (Chancellor)

Failures                             Vader                  Ren   Luke   Palpatine
Major Stations Lost                  2                      1     1      1
Temper-tantrums                      1                      2     1      0
Computer Drives Unrecovered          2                      1     0      0

Table 1: Achievements and failures contingency table of Vader, Ren, Luke and Palpatine.
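
Because correspondence analysis works from the margins and profiles of a contingency table, it can help to see Table 1 in that form. The following is a minimal Python sketch, not the authors' code (their analysis used R's ca package). The names `TABLE_1`, `column_totals` and `column_profiles` are illustrative, and the annotations "(Hoth, Cloud City)" and "(Chancellor)" are dropped so that each cell is a plain count.

```python
# Table 1 as a contingency table: rows are achievements/failures,
# columns are force practitioners. Counts are copied from Table 1.
PRACTITIONERS = ["Vader", "Ren", "Luke", "Palpatine"]

TABLE_1 = {
    "Planet-sized objects destroyed":    [1, 4, 1, 0],
    "Force Choking / Lightning Lifting": [5, 2, 1, 2],
    "Aerial Victories":                  [3, 0, 4, 0],
    "Planets Conquered":                 [2, 0, 1, 10],
    "Major Stations Lost":               [2, 1, 1, 1],
    "Temper-tantrums":                   [1, 2, 1, 0],
    "Computer Drives Unrecovered":       [2, 1, 0, 0],
}


def column_totals(table):
    """Sum each practitioner's column of counts."""
    return [sum(column) for column in zip(*table.values())]


def column_profiles(table):
    """Divide each column by its total so that every profile sums to 1."""
    totals = column_totals(table)
    return [
        [count / total for count in column]
        for column, total in zip(zip(*table.values()), totals)
    ]


if __name__ == "__main__":
    totals = column_totals(TABLE_1)
    profiles = column_profiles(TABLE_1)
    for name, total, profile in zip(PRACTITIONERS, totals, profiles):
        shares = ", ".join(f"{share:.2f}" for share in profile)
        print(f"{name:<10} total={total:>2}  profile=[{shares}]")
```

CA compares profiles rather than raw counts, which is why Palpatine's ten conquered planets dominate his profile even though Vader has the larger column total.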

These indicator scores are calculated in three steps:

  1. Transform data into a contingency table.
  2. Use R’s ca package to create biplot row/column coordinates.
  3. Perpendicularly project column points onto row point lines and measure point-intercept distances to/from segment endpoints, using a custom R script that performs the calculations on the coordinates produced by the ca package.
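
The custom R script is not reproduced in this article, so the following is only a hedged Python sketch of the geometry in step 3: a standard perpendicular projection of one point onto the line through two others, with the distances from the intercept to each endpoint. The function name `project_onto_segment` and the example coordinates are hypothetical and are not taken from the biplot.

```python
import math


def project_onto_segment(point, end_i, end_j):
    """Perpendicularly project `point` onto the line through end_i and end_j.

    Returns the intercept (foot of the perpendicular), its distances to each
    endpoint, and `ratio`: the signed distance from end_i to the intercept,
    measured toward end_j and divided by the segment length
    (0 at end_i, 1 at end_j, outside [0, 1] if the intercept falls beyond
    either endpoint).
    """
    (px, py), (ix, iy), (jx, jy) = point, end_i, end_j
    dx, dy = jx - ix, jy - iy
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        raise ValueError("segment endpoints must be distinct")
    ratio = ((px - ix) * dx + (py - iy) * dy) / length_sq
    foot = (ix + ratio * dx, iy + ratio * dy)
    return {
        "foot": foot,
        "dist_to_i": math.dist(foot, end_i),
        "dist_to_j": math.dist(foot, end_j),
        "ratio": ratio,
    }


if __name__ == "__main__":
    # Hypothetical coordinates for illustration only.
    result = project_onto_segment(
        point=(1.0, 2.0), end_i=(0.0, 0.0), end_j=(4.0, 0.0)
    )
    print(result)
```

In the article's setting, `point` would be a practitioner's column coordinate and `end_i`, `end_j` the row coordinates of two achievements (or two failures).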

This problem has the interesting, and surprisingly common, characteristic that the data fields are not inherently ordinal. We might all agree that “destroying a planet (if you’re a Sith) or Death Star (for Jedi) is really good and that losing a Death Star is really bad,” but how do aerial victories compare to force choking and/or lightning lifting? Aerial victories are achievable by half-witted, scruffy-looking nerf herders, while force choking can punish a disturbing lack of faith.

We can create a more nuanced analysis by considering the CA indicator score analysis of achievements with multiple perpendicular projections. We will start by calculating Vader’s achievement CA indicator score set (see Figure 2).

Figure 2: Vader’s projections onto all six possible achievement lines. The ratio of point intercept distances to achievement line distances combines with weight differences to compute an overall CA indicator score for each practitioner.

The general formula for calculating a single score S via projection onto line (i,j) is:

equation

where R is the intercept distance d* divided by the length of the projection line, and w_i and w_j are the assigned achievement weights. Applying this to our previous data, we get Table 2. Table 3 compares three final indicator score calculation methods.

Practitioner   Achievement Score   Failure Score
Vader          12.44               5.61
Luke           9.82                2.38
Ren            6.70                5.87
Palpatine      5.60                1.57

Table 2: Force practitioner CA achievement and failure scores, sorted by achievement scores.

Practitioner   Achievement/Failure Ratio   Normalized Difference   CA Score Difference
Luke           4.13                        0.84                    7.44
Palpatine      3.57                        0.41                    4.03
Vader          2.22                        0.20                    6.83
Ren            1.14                        -0.84                   0.83

Table 3: Force practitioner indicator score comparisons, sorted by achievement/failure ratios.
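
Two of the three columns in Table 3 follow directly from Table 2: the ratio is the achievement score divided by the failure score, and the CA score difference is the achievement score minus the failure score. The Python sketch below recomputes them as a check. The name `ratio_and_difference` is illustrative, and the Normalized Difference column is left out because the article does not state how it is normalized.

```python
# Scores copied from Table 2: (achievement score, failure score).
TABLE_2 = {
    "Vader": (12.44, 5.61),
    "Luke": (9.82, 2.38),
    "Ren": (6.70, 5.87),
    "Palpatine": (5.60, 1.57),
}


def ratio_and_difference(scores):
    """Return {name: (achievement / failure, achievement - failure)},
    with both values rounded to two decimal places."""
    return {
        name: (round(ach / fail, 2), round(ach - fail, 2))
        for name, (ach, fail) in scores.items()
    }


if __name__ == "__main__":
    derived = ratio_and_difference(TABLE_2)
    # Sort by ratio, highest first, matching the ordering of Table 3.
    ranked = sorted(derived.items(), key=lambda item: item[1][0], reverse=True)
    for name, (ratio, difference) in ranked:
        print(f"{name:<10} ratio={ratio:.2f}  difference={difference:.2f}")
```

Note that the two measures rank the practitioners differently: by ratio Palpatine sits above Vader, while by difference Vader sits above Palpatine.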

This analysis agrees broadly with our previous work, but introduces a different way to consider these types of data sets.

Harrison Schramm (Harrison.schramm@gmail.com), CAP, PStat, is a principal operations research analyst at CANA Advisors, LLC, and a member of INFORMS. Matt Powers is an operations research analyst working in the Tidewater, Va., area. In addition to Star Wars, his research interests focus on international cooperation.

A technical note: Exploratory factor analysis of failures loads the same latent variable onto unrecovered computer drives and major stations lost, thereby confirming the relationship between increased station vulnerability and computer drive security while adding quantitative context as to why many Bothans (and others) died to retrieve the information on those drives.

A personal note: In the coming year, I don’t plan to have any regular co-authors, but would like to start bringing in some of the many Padawans I’ve met along the way. It is my sincerest hope that eventually the students will become the masters.
