The Emergence of Sport Analytics
From baseball to football to basketball and hockey, analytics gets a year-round workout in the athletic arena.
By James J. Cochran
In a recent interview for an upcoming INFORMS podcast on sports and operations research (www.scienceofbetter.org/podcast/cochran.html), INFORMS Director of Communications Barry List asked me how the analysis of sports data had changed since he became a fan in the early 1960s. Although I did give Barry a partial response during the interview, I have continued to consider this question in the days since. I have concluded that the analysis of sports data has changed in four fundamental ways over the past half century.
First and foremost, we now have an incredible amount of data available. During my early days as a sports fan, the only sources for statistics were trading cards, sports sections of local newspapers and the weekly Sporting News. Baseball cards offered a position player’s defensive position(s); number of at-bats; batting average; and total hits, doubles, triples, home runs, runs scored and runs batted in for each of the past several seasons. For pitchers we learned the number of games and innings in which they pitched; their earned run averages; and the total number of wins, losses and saves they earned over the past several seasons. From basketball cards we learned a player’s defensive positions; his shooting percentage for field goals and free throws; and his total points, rebounds, assists, blocked shots, turnovers and fouls committed for recent seasons. Football and hockey cards provided similar information relevant to their respective sports. Sunday sports pages and the Sporting News offered weekly updates on these measures of performance for the current season (one of the reasons I became a newspaper delivery boy when I was a kid was so that I would have access to these statistics before anyone else in my neighborhood!).
Not Trivial Pursuits
Note that each of the statistics provided on the back of the old trading cards, in the Sunday sports sections or in the Sporting News was either an absolute or a relative frequency with which a player achieved some feat over a season or career. While these measures of player performance did yield some insight, for the thoughtful fan they frequently generated more questions than they answered. Was John Riggins a better runner than Eric Dickerson in poor playing conditions? Were Jack Morris’ high win totals an artifact of pitching for a very good team, or was he a truly great pitcher? At what point should an NHL team that is behind late in a game pull its goalie? How much was the Boston Celtics’ home court advantage worth? Why did the Philadelphia Flyers win almost every home game that Kate Smith opened by singing God Bless America (77 wins, 21 losses and 4 ties in games that Ms. Smith opened with her rendition of this Irving Berlin classic)? Should Ron Santo be inducted into the National Baseball Hall of Fame?
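To make the distinction between absolute and relative frequencies concrete, here is a minimal sketch in Python that uses only the Flyers record quoted above. The function name relative_frequency and the variable names are illustrative assumptions, not part of any established library, and the calculation simply counts ties as games played that were not won.

```python
def relative_frequency(successes, trials):
    """Return the share of trials that were successes (a relative frequency)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    return successes / trials


# Absolute frequencies quoted in the text for games Kate Smith opened.
wins, losses, ties = 77, 21, 4

# Total games is the sum of the three absolute frequencies.
games = wins + losses + ties

# Relative frequency of wins; ties count as games played but not won.
win_rate = relative_frequency(wins, games)

print(f"Games: {games}, win rate: {win_rate:.3f}")
```

A figure like this summarizes the record neatly, but, as with the card statistics, it says nothing about why the record came about.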
While a newspaper’s daily sports page could provide some relief in the form of summary statistics for individual games, these summaries did not provide the data at the individual play level that would support in-depth analyses of these and other issues. Furthermore, collecting and recording data from daily sports pages for future analyses was a practical impossibility. Fortunately, we now have access to incredibly detailed data that are collected at the individual play level.
This leads us to the second profound change in the analysis of sports data that has occurred over the past half century. While the dramatic growth in electronic storage capabilities has facilitated the collection of these elaborate data sets, the growth in readily available computing power has enabled individuals with an interest in sports problems to perform very sophisticated analyses of these problems. Sports analysts have come a long way from calculation and comparison of simple batting averages and free throw percentages; they now make use of complex analytic methods such as Bayesian statistics, mathematical programming, neural networks, heuristic algorithms, simulation, decision analysis and multivariate statistics to gain insight — often in a highly integrated manner.
These increases in the amount and quality of available data, and in the storage capacity and computational power of computers, have encouraged the third profound change in the analysis of sports data that has occurred over the past half century. Students of analytics and quantitative methods, many of whom first realized they were interested in this area of study through their fascination with sports statistics, began applying what they had learned to sports problems. Initially these were simply labors of love, and the producers of these analyses considered sports analytics to be a hobby. Eventually a few of these sports analytics enthusiasts banded together to form the Society for American Baseball Research (SABR) in August 1971 in Cooperstown, N.Y. This group, which has grown to more than 6,700 members worldwide, gave its members opportunities to publish their findings in journals and books and present their work at annual national and regional conferences.
Late in the same decade, Bill James began publishing his annual series of Baseball Abstracts; these abstracts, which eventually became enormously popular, consisted of several articles summarizing the results of analyses James had executed on data from the most recent major league baseball season. In publishing his baseball abstracts James coined the term sabermetrics (derived from the acronym SABR); this term is now recognized to mean the mathematical and statistical analysis of baseball records.
Ultimately some forward-thinking sports executives began reading about sabermetrics and became curious; they began wondering if these results could have real applications that would give them a competitive advantage. Their interest led to the fourth profound change in the analysis of sports data that has occurred over the past half century. While introduction and acceptance of sports analytics by the management of professional sports franchises was slow, a few teams did eventually begin to integrate sports analytics into their decision-making processes. (This phenomenon was captured most famously by Michael M. Lewis in the 2003 best-seller, “Moneyball: The Art of Winning an Unfair Game.” The book chronicles Oakland Athletics General Manager Billy Beane’s unconventional approach to running a Major League Baseball organization, specifically Beane’s reliance on “real statistical analysis” that often flies in the face of conventional baseball wisdom in everything from evaluating players to managing game situations.)
Of course, as some of these teams succeeded, other teams followed suit, and now most major sports franchises in North America attempt to incorporate some sports analytics into their decision-making processes. Some franchises, such as the Dallas Mavericks of the National Basketball Association, the Philadelphia Eagles of the National Football League and the Boston Red Sox of Major League Baseball, are considered to be among the heaviest and most savvy users of sports analytics. (Is it a coincidence that the Boston Red Sox, a franchise that had not won a World Series since 1918 and that sold Babe Ruth to the New York Yankees after the 1919 season, finally won world championships in 2004 and 2007 after heavily integrating sports analytics into its decision-making processes?)
These events have moved sports analytics out of the introduction stage and squarely into the growth stage of its life cycle. Evidence of growth is all around us. Many instructors of quantitative methods use sports examples in the classroom; for several examples, see the special issue on “Sports in the O.R. classroom” published by INFORMS Transactions on Education in 2004. Several instructors (including me) have designed and taught entire courses on sports analytics. These courses can be incredibly multidisciplinary; in addition to operations research and statistics, I drew from physics, marketing, economics, finance, human anatomy and physiology, sociology, kinesiology and chemistry when initially designing a sports analytics course. Some universities offer majors or degree programs in sports management that include sports analytics coursework. The Institute for Operations Research and the Management Sciences (INFORMS) boasts an online section on Operations Research in Sports (SpORts), and the American Statistical Association has a section on Statistics in Sports (SiS). Both of these sections are very active, sponsoring sports analytics sessions at annual conferences and organizing entire symposia on sports analytics. At least two peer-reviewed academic journals also focus exclusively on publishing papers on applications of sports analytics: the Journal of Quantitative Analysis in Sports (JQAS) and the International Journal of Sports Science and Engineering (IJSSE). Finally, anecdotal evidence suggests that, after decades of resistance, the number of professional sports franchises that utilize sports analytics in their decision-making processes is growing rapidly.
What lessons can one take from the recent success of sports analytics? I believe the primary lesson is to make a strong effort to promote success stories that are truly relevant to and can be easily understood by the target audience. Until recently, most upper- and mid-level management in professional sports saw sports analytics as a highly technical set of mathematical techniques that are difficult to understand and apply. Sports analytics overcame this resistance by promoting and showcasing successful applications that are relatively simple (as well as those that are relatively complex); this allowed potential users to understand, appreciate and see sports analytics as relevant to their careers.
Additionally, sports analysts have done an excellent job of getting teens and young adults interested in their efforts; while they admittedly have an advantage (many teens and young adults have strong interests in sports), they have largely been able to overcome the intimidation that many of these young individuals feel with regard to mathematics-related applications. Finally, sports analysts have been extraordinarily patient and persistent; almost every individual who has been employed as a sports analyst has a compelling story of the forbearance that was required to find employment and establish her/his career in this area.
Perhaps by accident (or at least not in a coordinated effort), sports analytics found that showing the relevance of its successes to its target audience, finding and mentoring new talent, and being persistent together form an effective strategy for moving from the introduction stage to the growth stage of the product life cycle.
James J. Cochran (firstname.lastname@example.org) is an associate professor in the Department of Marketing and Analysis, College of Business, Louisiana Tech University.
4. www3.informs.org/site/ITE/ (volume 5, issue 1)