Analytics Magazine

Five-Minute Analyst: Army-Navy football game

November/December 2013

Is the Army-Navy game a random contest between two well-matched opponents, and thus similar to a random walk?

By Harrison Schramm, CAP

I usually write fun, lighthearted articles, but now I’m going to write something controversial about an important topic: Army-Navy football. As a service academy graduate (Navy), I spent a considerable portion of my life wishing for a Navy victory, a wish that was denied for the four years I was at the U.S. Naval Academy but one that has been granted in spades over the past 11 years: Navy is on an 11-game winning streak that began in 2002.

The Army-Navy football game is interesting mathematically in that it is difficult to imagine two long-standing rivals who are as evenly matched. Discussions between alumni (and current students) about the differences between a Navy midshipman and Army cadet really only highlight their similarities: Both met the academies’ rigorous physical, academic and honor standards. Both intend to serve in their respective military branches after graduation. Although there are some notable exceptions, the service academies do not produce many professional athletes.

While both schools boast that they have a superior football team, do they? Analytically speaking, are Army and Navy evenly matched? Consider Figure 1.

Figure 1: Cumulative results of Army-Navy football games, 1890-2004 [1]. Navy’s current winning streak is truncated to help obfuscate the result. One of these lines is the actual data; the other two are realizations of a random walk whose steps take the value +1 or -1 with equal probability. Which line shows the football results? Check my Twitter feed at @5MinuteAnalyst to find out!

When considering a long-term sum of random variables like Army-Navy game outcomes, it is not sufficient to just ask if the sum of wins and losses is “near” zero. After all, a strictly alternating sequence such as (-1, +1, -1, +1, ...) also sums to zero, yet it is anything but random. If the Army-Navy game were truly a random contest between two well-matched opponents, then we would expect its cumulative record to behave “like” the Random Walk.
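To make the distinction concrete, here is a minimal sketch in Python (not the R used for the analysis in this article). The seed, the sequence length of 112 and the helper name `longest_run` are illustrative assumptions, and none of the values are the actual game results. It compares a strictly alternating sequence with a simulated coin-flip sequence: both can finish near zero, but their paths look nothing alike.

```python
import random
from itertools import accumulate, groupby


def longest_run(steps):
    """Length of the longest block of identical consecutive steps."""
    return max(len(list(group)) for _, group in groupby(steps))


# Deterministic sequence: -1, +1, -1, +1, ... (112 steps, an assumed length).
alternating = [-1, +1] * 56

# Simulated fair coin flips; the seed is arbitrary and only makes the run repeatable.
rng = random.Random(1890)
coin_flips = [rng.choice((-1, 1)) for _ in range(112)]

# Cumulative sums give the "path" of each sequence.
alt_path = list(accumulate(alternating))
flip_path = list(accumulate(coin_flips))

for name, steps, path in (
    ("alternating", alternating, alt_path),
    ("coin flips", coin_flips, flip_path),
):
    print(
        f"{name}: final sum = {path[-1]}, "
        f"max |deviation| = {max(abs(p) for p in path)}, "
        f"longest run = {longest_run(steps)}"
    )
```

The alternating sequence never strays more than one step from zero and never repeats an outcome, while the coin-flip path wanders and contains streaks. The final sum alone cannot tell the two apart.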

 Table 1: Navy and Army conditional wins, truncated history (neglecting current Navy winning streak).

The Random Walk is a rich and interesting statistical model. Entire books are devoted to its study, and I will not spoil your fun by trying to go through it in detail here [2]. The idea of the symmetric random walk is a simple one: Imagine flipping a coin; if it comes up “heads,” take one step to the right, and if it comes up “tails,” take one step to the left. The experiment I just described is equally interesting to both elementary school children and graduate students. One of our assumptions is that the outcome of the next step is independent of the previous one. For our (truncated) historical dataset, this assumption holds up well:

Figure 2: Autocorrelation plot of Navy football wins. This plot shows no significant seasonal effect, supporting the hypothesis that the Army-Navy game results are a random process.

The probability of the next win being Army given the previous win was Army is 49 percent; the probability that the next win will be Navy given the previous win was Navy is 46 percent. Hypothesis testing shows that these values are not inconsistent with an assumed “even” match [3]. However, demonstrating the evenness of the match in this manner is not sufficient to demonstrate randomness; I could construct a deterministic sequence that also had these properties. What I’d really like to do is to consider the probability of the next game’s outcome given all the previous knowledge up to that point across the entire population. I’d also like to do this in an automated fashion with easily interpreted results. In a word, what I want to do is check the data’s autocorrelation. While this can be done in Excel, I have opted to shift to the statistical language R.
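Both checks described here can be sketched in a few lines. The example below is in Python rather than R, and it runs on a simulated ±1 sequence, not the actual game data. The function names, the seed and the convention that +1 means a Navy win are assumptions made for illustration. The autocorrelation uses the standard sample estimator: the lagged covariance divided by the overall variance.

```python
import random


def conditional_repeat_rates(outcomes):
    """Observed frequency that a +1 follows a +1, and that a -1 follows a -1."""
    repeats = {1: 0, -1: 0}
    totals = {1: 0, -1: 0}
    for prev, nxt in zip(outcomes, outcomes[1:]):
        totals[prev] += 1
        if nxt == prev:
            repeats[prev] += 1
    return tuple(
        repeats[k] / totals[k] if totals[k] else float("nan") for k in (1, -1)
    )


def autocorrelation(outcomes, lag):
    """Sample autocorrelation at the given lag (requires a non-constant sequence)."""
    n = len(outcomes)
    mean = sum(outcomes) / n
    variance_sum = sum((x - mean) ** 2 for x in outcomes)
    covariance_sum = sum(
        (outcomes[i] - mean) * (outcomes[i + lag] - mean) for i in range(n - lag)
    )
    return covariance_sum / variance_sum


# Stand-in data: 102 simulated fair coin flips (+1 = "Navy", -1 = "Army").
rng = random.Random(2002)  # arbitrary seed for repeatability
simulated = [rng.choice((-1, 1)) for _ in range(102)]

navy_after_navy, army_after_army = conditional_repeat_rates(simulated)
print(f"P(+1 | previous +1) = {navy_after_navy:.2f}")
print(f"P(-1 | previous -1) = {army_after_army:.2f}")
for lag in range(1, 6):
    print(f"lag {lag}: autocorrelation = {autocorrelation(simulated, lag):+.3f}")
```

For a sequence with no memory, the repeat rates should sit near one-half and the autocorrelations near zero at every lag. A strictly alternating sequence, by contrast, gives repeat rates of zero and a lag-1 autocorrelation close to -1.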

Figures 3 (above) and 4 (below): Probability of a Navy win estimated from the entire history of the match to that date (Figure 3) and from a 10-game moving average (Figure 4). While the current Navy winning streak leads to a moving average of 1 for Navy, there have been substantial excursions from 50-50 previously. Note that the cumulative history tends to dilute the deviations from .5 because of the larger sample size.

We might also consider how the runs of wins and losses affect our estimates of the win probabilities for Navy. We present this for both cumulative history and a 10-game moving average as Figures 3 and 4.
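The two estimators are easy to sketch. The Python below (again, not the R used for the figures) works on a made-up list of results encoded as 1 for a Navy win and 0 otherwise. The function names and the toy data are assumptions for illustration and do not reproduce Figures 3 and 4.

```python
def cumulative_win_rate(wins):
    """Fraction of wins over the whole history up to and including each game."""
    rates = []
    total = 0
    for games_played, win in enumerate(wins, start=1):
        total += win
        rates.append(total / games_played)
    return rates


def moving_win_rate(wins, window=10):
    """Fraction of wins over the most recent `window` games (one value per full window)."""
    return [
        sum(wins[end - window:end]) / window
        for end in range(window, len(wins) + 1)
    ]


# Toy history: ten losses followed by an 11-game winning streak (not real data).
toy_history = [0] * 10 + [1] * 11

cumulative = cumulative_win_rate(toy_history)
moving = moving_win_rate(toy_history, window=10)

print(f"cumulative estimate after the streak: {cumulative[-1]:.3f}")
print(f"10-game moving average after the streak: {moving[-1]:.1f}")
```

In this toy history the moving average reaches 1 at the end of the streak, while the cumulative estimate is only 11/21, or about 0.52. This is the dilution effect noted in the caption: the longer the history, the less any one streak moves the cumulative estimate.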

This is all very interesting, but it doesn’t quite answer our original question, which is: “Are the excursions we see excessive?” This depends greatly on your point of view. There have now been 11 Navy victories in a row; the probability of a fair coin coming up “heads” this many times consecutively is $(1/2)^{11} \approx 0.0005 \approx 1{:}2{,}000$. So from that point of view, the current run is excessive in favor of Navy.

However, we could ask this question from a different point of view: What sort of deviation from “even” would be considered excessive? There are many beautiful theoretical results that I might consider if this were written for a different audience. However, this is Analytics, and most of us have to inform decision-makers at the end of our work. This brings up an important optimization problem: weighing the difficulty of briefing an elegant but extraordinarily technical result against simply simulating the problem and providing empirical evidence. For this article, I simulated 10,000 random walks of length 113 and recorded the maximum absolute deviation from zero for each. The results are shown in Figure 5. The current deviation of 12 is near the midpoint of the distribution, right at the 50th percentile, supporting the claim that over a 113-game history, it is not unusual to see excursions of this size.
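For readers who would like to repeat the experiment, the sketch below mirrors the R one-liner in reference [4], but in Python. The function names and the seed are assumptions for illustration, and because the random draws differ from the original run, the resulting percentile will not match Figure 5 exactly. It also evaluates the 11-game streak probability directly.

```python
import random


def max_excursion(n_steps, rng):
    """Largest absolute distance from zero reached by one symmetric random walk."""
    position = 0
    farthest = 0
    for _ in range(n_steps):
        # Same rule as the R command: a uniform draw above .5 is +1, otherwise -1.
        position += 1 if rng.random() > 0.5 else -1
        farthest = max(farthest, abs(position))
    return farthest


def simulate_maxima(n_walks=10_000, n_steps=113, seed=2013):
    """Maximum excursion of each of `n_walks` independent walks (seed is arbitrary)."""
    rng = random.Random(seed)
    return [max_excursion(n_steps, rng) for _ in range(n_walks)]


def fraction_at_most(maxima, threshold):
    """Empirical share of walks whose maximum excursion is <= threshold."""
    return sum(m <= threshold for m in maxima) / len(maxima)


# Probability that a fair coin shows the same named face 11 times in a row.
streak_probability = 0.5 ** 11  # exactly 1/2048

if __name__ == "__main__":
    maxima = simulate_maxima()
    print(f"P(11 straight wins) = {streak_probability:.6f}")
    print(f"share of walks with max deviation <= 12: {fraction_at_most(maxima, 12):.2f}")
```

The streak probability is exactly 1/2048, which rounds to the 1:2,000 quoted above. The share of simulated walks whose maximum excursion is at most 12 is the empirical percentile of the current deviation. Changing `seed` or `n_walks` shows how stable that estimate is.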

Figure 5: Simulated maxima of 10,000 random walks. This simulation was executed by looping over a single command [4] in R.

So, while the current run of 11 wins may be rare, the fact that Navy is currently up by 12 is not, statistically speaking, a rare event in a rivalry of this length.

Figure 6: Cumulative score of Army-Navy football game.

For those who are interested, I have also included a plot of the cumulative score of the football game itself (Figure 6). This plot does not show nearly as much variability as the game outcome (as measured by zero crossings). I include this graph as an interesting picture to think about (and possibly a topic for next November’s FMA).

As far as this year’s football game is concerned, I will be happy with either result. While I can hardly claim neutrality (and my classmates will never let me live it down), I think an Army win this year wouldn’t be such a bad thing – mostly because it would make beating Army more fun the following year.

Harrison Schramm (harrison.schramm@gmail.com) is an operations research professional in the Washington, D.C., area. He is a member of INFORMS and a Certified Analytics Professional (CAP).

Editor’s note: The 2013 Army-Navy game will be played Dec. 14 at Lincoln Financial Field in Philadelphia and broadcast live on CBS with a 3 p.m. (ET) kickoff.

References

1. All data is from: http://www.history.navy.mil/special%20highlights/football/army-navy-scores.htm
2. See Brzeźniak and Zastawniak, Basic Stochastic Processes, 2003 or similar.
3. p = .57 (2-sided test)
4. max(abs(cumsum(ifelse(runif(n) > .5, 1, -1)))), where n = 113 is the length of each walk
