Analytics Magazine

Beyond ‘Moneyball’: The rapidly evolving world of sports analytics, Part I

September/October 2011


The first of a three-part series defines and describes the current state of the field.

Professional baseball and basketball teams are leading the pack of sports organizations that are embracing analytics.

By Benjamin Alamar (Left) and Vijay Mehrotra (Right)

Over the past few years, the world of sports has experienced an explosion in the use of analytics. In this three-part series, we reflect on the current state of sports analytics and consider what the future of sports analytics may look like.

We define sports analytics as “the management of structured historical data, the application of predictive analytic models that utilize that data, and the use of information systems to inform decision makers and enable them to help their organizations in gaining a competitive advantage on the field of play.” Our definition is both expansive (in the sense that it includes not only statistical models but also the broader information value chain that surrounds these models) and restrictive (because it excludes traditional analytics applications such as demand forecasting, revenue management and financial modeling, all of which are certainly relevant in the business of professional sports). Our framework for sports analytics is presented in Figure 1.

Figure 1: A framework of sports analytics.

Data management includes any and all processes associated with acquiring, verifying and storing data in an efficient manner. In a sports organization, data can come from a variety of sources and may be presented in many different forms.

As shown in Figure 1, the data management function will feed both the predictive analytics function and the information systems that support decision-makers. Given this crucial role, good data management is essential, and therefore missing, incomplete and/or inaccessible data inherently reduces the value of any other investments in analytics.

In many organizations, data is stored in isolated silos, so getting data is rarely a smooth process. Different groups within an organization, such as scouting or training, may have extensive data on players that other groups either cannot access or do not even know exists.

For example, the personnel group at one NFL team had been collecting extensive performance data on various groups of both opposing players and their own players. The coaching staff had no idea that the data existed, but when they did discover it, they had difficulty accessing it. The data resided in spreadsheets on the computers of the personnel group instead of being integrated into a common data archive. This is a common situation within professional sports organizations.
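The kind of integration that was missing in this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the group names, field names and values below are invented, and `build_common_archive` is a hypothetical helper, not a description of any team's actual system.

```python
from collections import defaultdict


def build_common_archive(silos):
    """Combine per-group player records into one record per player.

    `silos` maps a group name to a list of dict records, each carrying a
    'player_id' key. Fields are prefixed with the group name so that one
    group's data never overwrites another's.
    """
    archive = defaultdict(dict)
    for group, records in silos.items():
        for record in records:
            player_id = record["player_id"]
            for field, value in record.items():
                if field != "player_id":
                    archive[player_id][f"{group}.{field}"] = value
    return dict(archive)


# Invented sample data standing in for spreadsheets kept by separate groups.
silos = {
    "personnel": [{"player_id": 7, "snaps": 512}],
    "scouting": [
        {"player_id": 7, "grade": "B+"},
        {"player_id": 12, "grade": "A-"},
    ],
}

archive = build_common_archive(silos)
print(archive[7])
```

Keying every record by a shared player identifier is the design choice that matters here: once each group's data is indexed the same way, coaches and personnel staff can look up a player in one place instead of hunting through separate spreadsheets.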

Predictive analysis, the next piece of the framework, is the process of applying statistical tools to data to gain insight into what is likely to happen in the future. In sports, this can involve projecting the pro careers of amateur players, identifying how the strengths and weaknesses of an opponent will play out against your own team’s strengths and weaknesses, or assessing whether a free agent would fill a need on a team at an appropriate cost. Depending on the importance of the problem, the time until an answer is needed and the data available, these analyses can range from simple comparisons to extremely complicated and cutting-edge statistical analysis. The results of these analyses may feed directly into an intelligent information system that provides decision-makers with standardized results. Alternatively, such results may be reported directly to a decision-maker for special projects that may be outside of any standard systems.

Information systems, the next component in the framework, are increasingly common in the world of sports. When designed and implemented correctly, such information systems typically allow for visualization and interactive analysis of relevant information from multiple sources in one place, organized in a meaningful way to provide insights for decision makers. For example, a cutting-edge sports information system might combine unstructured information from scouting reports, summary reports from multiple data sources and results from predictive models. Such a system not only provides a data-driven decision support platform and integrates data from multiple sources, but (as we will discuss in Part 2 of this series) also has the potential to fundamentally alter and enhance the way a decision-maker does his or her job.

Decision-makers are the ultimate customers for all components in the sports analytics framework. However, the modern professional sports organization typically has many different decision-makers, including the general manager, coaches, scouts, trainers, salary cap managers and other personnel executives. Decision-makers in different functional areas may utilize different data and models to tackle different types of questions. Yet, as mentioned above, one key problem today is that decision-makers in one functional area (such as scouts) rarely have easy access to information generated by personnel in other areas (such as assistant coaches or salary cap managers).

To summarize, our definition and framework for sports analytics encompasses several different and related aspects associated with turning raw data into information that is valued by – and has an impact on – decision-makers in the world of sports.

An Explosion of Interest in Sports Analytics

Though sports analytics is still a nascent, unstructured field (as we will discuss in more detail in Part 3), interest and activity in it have been exploding in recent years.

While studies applying mathematical models to professional sports data can be traced back more than 50 years [1], it is important to remember what the world looked like as recently as 2005, when the first issue of the Journal of Quantitative Analysis in Sports was published. At the time this journal was launched, only two or three NBA teams thought about using advanced statistics in connection with players and strategy. Michael Lewis’ seminal book [2], “Moneyball: The Art of Winning an Unfair Game,” about the Oakland A’s use of data and models had recently been published, and no one had yet thought seriously about the application of motion capture technology in the context of professional sports. Just six short years later, more than half of NBA teams now utilize the tools of analytics on the team side of their operation, most MLB teams now consider analytics a normal part of baseball operations, and companies such as STATS LLC are installing cameras in NBA arenas and NFL stadiums to capture more and more data.

On a broader scale, the annual Sloan Sports Analytics Conference serves as a vivid symbol of the growth of sports analytics. The first Sloan conference took place in 2006 in a few classrooms on the MIT campus with fewer than 300 attendees. The 2011 conference was held at the Boston Convention Center and attracted more than 2,000 attendees.

An Explosion of Data

Data within a sports organization used to consist of individual box scores, player and team summary statistics, text-based scouting reports and raw game films. However, the data available to decision-makers has grown exponentially over the last 15 years.

Several factors have contributed to this explosion in data. Innovations in sports science, ranging from training routines to nutritional regimens, coupled with improved reporting from medical staffs and trainers, have all come with their own data sets that are gathered and tracked somewhere within an organization. With improved communications via the Internet, the frequency and amount of information captured, stored and distributed by scouts and coaches at all levels have grown significantly. Thanks to increased computing power and reduced storage costs, historical data about the games themselves is now packaged into many different formats, with companies such as STATS LLC, StatDNA and Sports Data Hub emerging to provide organizations with high-quality historical data presented with unique summaries and indexes.

Finally, the advent of motion capture technology has expanded the data collected from each game. This technology tracks everything that moves on a field every 100th of a second. The impact is staggering: it transforms the amount of information captured for a single game from a few hundred rows of data to well over one million. Major League Baseball, the NBA and pro soccer teams have implemented this type of technology.
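A rough back-of-the-envelope calculation shows how quickly tracking data passes the one-million-row mark. The sketch below takes the sampling rate of 100 readings per second from the text; everything else is an assumption made for illustration: an NBA regulation game of 48 minutes, 11 tracked objects (10 players plus the ball), one row per object per reading, and only game-clock time counted.

```python
SAMPLES_PER_SECOND = 100  # "every 100th of a second", as stated in the text


def tracking_rows(game_minutes, tracked_objects,
                  samples_per_second=SAMPLES_PER_SECOND):
    """Estimate rows of tracking data for one game.

    Assumes one row per tracked object per sample and counts only
    game-clock time (no stoppages, warm-ups or overtime).
    """
    return game_minutes * 60 * samples_per_second * tracked_objects


# Assumed scenario: 48-minute NBA game, 10 players plus the ball.
nba_rows = tracking_rows(game_minutes=48, tracked_objects=11)
print(f"{nba_rows:,} rows")  # prints "3,168,000 rows"
```

Under these assumptions a single game yields roughly three million rows, in line with the "well over one million" figure above; a different sampling rate or a different set of tracked objects would change the total proportionally.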

The result of all of this is clear: The world of sports generates far, far more data today than could have been imagined just a few short years ago. Dean Oliver, director of Production Analytics at ESPN, has spoken of finding “data that can win championships.”

However, as the computer scientist Clifford Stoll has said, “Data is not information, information is not knowledge, knowledge is not understanding, understanding is not wisdom.” Analysts still spend too much time using their skills to try to answer questions that are not meaningful to decision-makers in pro sports. For example, very little is interesting about the next new statistic that ranks all NBA players. General managers do not lose sleep trying to figure out who the best player in the game is, as that information is neither accurate nor actionable. Conversely, Mark Cuban, owner and general manager of the Dallas Mavericks, has often cited studies that either predict or examine the effects of injuries as delivering useful and actionable information to his coaching staff and team.

In other words, despite the remarkable growth in the amount and variety of data available for examination and analysis, the world of sports analytics still faces the same ubiquitous challenge: how to get meaningful information into the hands – and minds – of the people who are in a position to make effective use of it.

In Part 2, we examine some of the predictive models that are being used to create actionable information in the world of sports today and the information systems that effectively deliver valuable information to decision-makers.

Benjamin Alamar is the founding editor of the Journal of Quantitative Analysis in Sports, a professor of sports management at Menlo College and the director of Basketball Analytics and Research for the Oklahoma City Thunder of the NBA. He is co-author of the annual “Football Outsiders Almanac” and a regular contributor to the Wall Street Journal.

Vijay Mehrotra is an associate professor, Department of Finance and Quantitative Analytics, School of Business and Professional Studies, University of San Francisco. He is also an experienced analytics consultant and entrepreneur, an angel investor in several successful analytics companies and a San Francisco Giants season-ticket holder.


  1. Lindsey, G. R. “Statistical Data Useful for the Operation of a Baseball Team,” Operations Research, Vol. 7, No. 2, March-April 1959, pp. 197-207.


