Five-Minute Analyst: California climate

Much has been said about climate and data in the popular press over the past few years.
What does the data tell us about climate change based on three California cities over a long period of time?

By Harrison Schramm

This time on the Five-Minute Analyst, we take a look at climate data from three cities in California, our home state. A lot has been said about climate and data in the popular press over the past few years. We decided to take a look at three California cities – Bakersfield, Los Angeles and Fresno – over a long period of time (Figure 1). These cities were chosen because of their geographic diversity and length of record.

Figure 1: Annual averages for three California cities.

Historical climate data can be downloaded for free from the National Oceanic and Atmospheric Administration (NOAA). While the download is free, there is a limit to the file size, and for this project multiple downloads were required. Like many data analysis tasks, the real challenge is obtaining and – as the kids say – “munging” the data. If you aren’t familiar, “munging” is the act of taking raw data and putting it in a useful form for analysis. Specifically, each sub-component of the historical record was approximately 100,000 rows x 90 columns. Even with efficient tools in the R language, some “slimming” had to be performed. After processing, the climate data set had 1.8 million rows and eight columns.
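To make the “slimming” step concrete, here is a minimal sketch in Python (the original analysis was done in R, so this is an illustration, not the author's code). The column names `STATION`, `DATE`, `TAVG` and `TMAX` and the sample rows are assumptions made up for the example; a real NOAA export may label its columns differently.

```python
import csv
import io
from collections import defaultdict

# Hypothetical column names; a real NOAA export has many more columns
# and may label them differently.
KEEP = ("STATION", "DATE", "TAVG", "TMAX")


def slim_rows(csv_file, keep=KEEP):
    """Yield dicts holding only the `keep` columns, skipping rows with no TAVG."""
    for row in csv.DictReader(csv_file):
        if row.get("TAVG", "") == "":
            continue  # no average temperature recorded for this day
        yield {name: row[name] for name in keep}


def annual_average(rows):
    """Return {(station, year): mean TAVG} computed from slimmed rows."""
    sums = defaultdict(lambda: [0.0, 0])  # running [total, count] per key
    for row in rows:
        key = (row["STATION"], int(row["DATE"][:4]))
        sums[key][0] += float(row["TAVG"])
        sums[key][1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


# Tiny made-up sample standing in for one downloaded file.
SAMPLE = """STATION,DATE,TAVG,TMAX,WIND,NOTES
FRESNO,1990-01-01,50,60,3,a
FRESNO,1990-07-01,80,100,5,b
FRESNO,1991-01-01,,58,2,c
LA,1990-01-01,58,66,4,d
"""

slimmed = list(slim_rows(io.StringIO(SAMPLE)))
averages = annual_average(slimmed)
```

Reading row by row and keeping only the needed columns means a wide file never has to sit in memory all at once, which is the point of slimming before analysis.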

Observations on the Climate of California from Data

Figure 2: Average temperature (by month) for three California cities.

The first task here, as in most analyses, is to plot the data. It is particularly important in a task like this to keep an “open mind” when looking at data, especially, as we shall see below, when the effects are small. Averages by year may tend to “wash out” seasonal effects; with this in mind, we have included two plots showing the average (Figure 2) and maximum (Figure 3) temperature by month over a series of years. Upon looking at these plots, there is not an obvious “smoking gun” implying either the presence or absence of climate change. To look at this more precisely, we will perform a standard linear regression of average temperature vs. year.
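For readers who want to see what a regression of average temperature vs. year involves, the following is a minimal ordinary least squares sketch in Python (the original work used R). The function name `fit_line` and the five data points are invented for illustration and are not the article's data.

```python
def fit_line(years, temps):
    """Ordinary least squares fit of temps = intercept + slope * year."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(temps) / n
    # Sum of squared deviations in x, and cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, temps))
    slope = sxy / sxx  # degrees per year
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up annual averages, rising exactly half a degree per year.
years = [2000, 2001, 2002, 2003, 2004]
temps = [60.0, 60.5, 61.0, 61.5, 62.0]

slope, intercept = fit_line(years, temps)
```

The slope is the quantity of interest here: it estimates how many degrees the annual average changes per year.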

Figure 3: Maximum temperature (by month) for three California cities.

From these regressions, we see that there is only one case where the trend in temperature is upward without question: Fresno. In Fresno, the evidence that the temperature is rising at approximately .039 degrees per year is pretty resounding (p-value of 3.9 x 10^-7) (Figure 4). Bakersfield’s regression shows a rate of .06 degrees per year (p-value of .027), which most practitioners still consider to be significant (against an α of .05). In Los Angeles, there is not sufficient evidence (with a linear model) to support temperature rise with this data (p-value of .15).
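The p-values quoted above come from testing whether the regression slope differs from zero. The sketch below shows the idea in Python under loud assumptions: the data are synthetic (not the article's), the function name `slope_test` is hypothetical, and the two-sided p-value uses a normal approximation to the t distribution, which is reasonable only for long records. A statistics package such as R's `lm` reports the exact t-based value.

```python
import math
from statistics import NormalDist

ALPHA = 0.05  # significance level used in the article


def slope_test(years, temps):
    """Return (slope, t_stat, approx_p) for an OLS fit of temps on years.

    approx_p is a two-sided p-value from a normal approximation to the
    t distribution with n - 2 degrees of freedom (an assumption of this
    sketch). Requires a non-perfect fit, so the residual sum is nonzero.
    """
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(temps) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, temps))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = sum((y - (intercept + slope * x)) ** 2
                      for x, y in zip(years, temps))
    std_err = math.sqrt(residual_ss / (n - 2) / sxx)  # std. error of slope
    t_stat = slope / std_err
    approx_p = 2 * (1 - NormalDist().cdf(abs(t_stat)))
    return slope, t_stat, approx_p


# Synthetic 60-year records: alternating +/-0.5 degree "noise",
# with and without a built-in trend of 0.04 degrees per year.
years = list(range(1950, 2010))
noise = [0.5 if i % 2 == 0 else -0.5 for i in range(len(years))]
trend_temps = [60 + 0.04 * (y - 1950) + e for y, e in zip(years, noise)]
flat_temps = [60 + e for e in noise]

trend_slope, trend_t, trend_p = slope_test(years, trend_temps)
flat_slope, flat_t, flat_p = slope_test(years, flat_temps)

trend_significant = trend_p < ALPHA
flat_significant = flat_p < ALPHA
```

A small slope can still be highly significant when the record is long and the scatter around the line is modest, which is why a rate as small as a few hundredths of a degree per year can produce a very small p-value.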


Figure 4: Trend and regression line for average temperature (by year) in Fresno, Calif.

I hope that this little bit of data analysis will encourage our readers to think about this problem for themselves – specifically by obtaining their own data and repeating (or expanding upon) the work we do here. In the interests of scientific exploration, the code to this analysis is posted here. Given the upgrades in computing and availability of data, concerned citizens can simply do their own homework now and in the future.

Harrison Schramm, CAP, PStat, is a principal operations research analyst at CANA Advisors, LLC, and a member of INFORMS. The author would like to thank his intern, Jesse Ruediger, for drawing attention to this problem, and also for collecting the data used in the analysis. The author would also like to thank his colleague Cara Albright for introducing him to NOAA data.
