Analytics Magazine

Five-Minute Analyst: California climate

Much has been said about climate and data in the popular press over the past few years. Source: ThinkStock

What does the data tell us about climate change based on three California cities over a long period of time?

By Harrison Schramm

This time on the Five-Minute Analyst, we take a look at climate data from three cities in California, our home state. A lot has been said about climate and data in the popular press over the past few years. We decided to take a look at three California cities – Bakersfield, Los Angeles and Fresno – over a long period of time (Figure 1). These cities were chosen because of their geographic diversity and length of record.

Figure 1: Annual averages for three California cities.

Historical climate data can be downloaded for free from the National Oceanic and Atmospheric Administration (NOAA). While the download is free, there is a limit on file size, and this project required multiple downloads. As with many data analysis tasks, the real challenge is obtaining and – as the kids say – “munging” the data. If you aren’t familiar, “munging” is the act of taking raw data and putting it into a form useful for analysis. Specifically, each sub-component of the historical record was approximately 100,000 rows x 90 columns. Even with efficient tools in the R language, some “slimming” had to be performed. After processing, the climate data set had 1.8 million rows and eight columns.
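
The “slimming” step can be sketched in a few lines. The author worked in R; the version below is a Python stand-in using only the standard library, not the author's code. The column names (STATION, NAME, DATE, TAVG, TMAX, TMIN) and the choice to drop rows with no average temperature are assumptions made for illustration, and the sample rows are made up.

```python
import csv
import io

# Hypothetical column names; the real export has roughly 90 columns.
KEEP = ["STATION", "NAME", "DATE", "TAVG", "TMAX", "TMIN"]


def slim_rows(csv_file, keep=KEEP):
    """Yield one dict per row containing only the columns in `keep`."""
    reader = csv.DictReader(csv_file)
    for row in reader:
        # Assumption: rows without an average temperature are dropped.
        if not row.get("TAVG"):
            continue
        yield {col: row.get(col, "") for col in keep}


# Tiny made-up sample standing in for one downloaded file.
SAMPLE = """STATION,NAME,DATE,TAVG,TMAX,TMIN,PRCP,SNOW
S1,FRESNO,1950-01-01,45,55,36,0.00,0
S1,FRESNO,1950-01-02,,54,35,0.10,0
S2,BAKERSFIELD,1950-01-01,47,58,38,0.00,0
"""

slimmed = list(slim_rows(io.StringIO(SAMPLE)))
print(len(slimmed), list(slimmed[0].keys()))
```

Because `slim_rows` is a generator, each downloaded file can be streamed through it in turn without holding all 90 columns in memory at once.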

Observations on the Climate of California from Data

Figure 2: Average temperature (by month) for three California cities.

The first task here, as in most analyses, is to plot the data. It is particularly important in a task like this to keep an “open mind” when looking at the data, especially – as we shall see below – when the effects are small. Averages by year may tend to “wash out” seasonal effects; with this in mind, we have included two plots showing the average (Figure 2) and maximum (Figure 3) temperature by month over a series of years. Looking at these plots, there is no obvious “smoking gun” implying either the presence or absence of climate change. To look at this more precisely, we will perform a standard linear regression of average temperature vs. year.
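
The monthly averages and maximums behind plots like these can be computed with a simple grouping step. The following is a minimal Python sketch, not the author's R code. It assumes each record is a (city, date, temperature) tuple with an ISO-style date string, and the readings shown are made up.

```python
from collections import defaultdict
from statistics import mean


def monthly_summary(records):
    """Group (city, 'YYYY-MM-DD', temp) records by (city, year, month).

    Returns {(city, year, month): (average, maximum)}.
    """
    groups = defaultdict(list)
    for city, date, temp in records:
        year, month = int(date[:4]), int(date[5:7])
        groups[(city, year, month)].append(temp)
    return {key: (mean(vals), max(vals)) for key, vals in groups.items()}


# Made-up readings for illustration only.
records = [
    ("FRESNO", "1950-01-01", 44.0),
    ("FRESNO", "1950-01-02", 46.0),
    ("FRESNO", "1950-07-01", 82.0),
    ("FRESNO", "1950-07-02", 86.0),
]
summary = monthly_summary(records)
for key in sorted(summary):
    print(key, summary[key])
```

Grouping by (city, year, month) rather than by (city, year) is what keeps the seasonal pattern visible instead of washing it out in an annual average.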

Figure 3: Maximum temperature (by month) for three California cities.

From these regressions, we see that there is only one case where the trend in temperature is upward without question: Fresno. In Fresno, the evidence that the temperature is rising at approximately 0.039 degrees per year is pretty resounding (p-value 3.9 × 10⁻⁷) (Figure 4). Bakersfield’s regression shows a rate of 0.06 degrees per year (p-value 0.027), which most practitioners still consider significant (against an α of 0.05). In Los Angeles, there is not sufficient evidence (with a linear model) to support a temperature rise with this data (p-value 0.15).
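
For readers who want to repeat this kind of fit, here is a minimal Python sketch of an ordinary least squares regression of temperature on year. It is not the author's R code, and the data are synthetic: a 0.04 degree-per-year trend plus alternating noise, chosen only for illustration.

```python
import math


def ols_trend(years, temps):
    """Ordinary least squares fit of temps = intercept + slope * year.

    Returns (slope, intercept, t_statistic), where t_statistic is the
    slope divided by its standard error (a test of slope == 0).
    """
    n = len(years)
    if n < 3:
        raise ValueError("need at least three observations")
    x_bar = sum(years) / n
    y_bar = sum(temps) / n
    sxx = sum((x - x_bar) ** 2 for x in years)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(years, temps))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Residual variance uses n - 2 degrees of freedom.
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(years, temps))
    se_slope = math.sqrt(sse / (n - 2) / sxx)
    t_stat = slope / se_slope if se_slope > 0 else math.inf
    return slope, intercept, t_stat


# Synthetic data for illustration only.
years = list(range(2000, 2010))
temps = [60 + 0.04 * i + (0.1 if i % 2 == 0 else -0.1) for i in range(10)]

slope, intercept, t_stat = ols_trend(years, temps)
print(f"slope={slope:.4f} degrees/year, t={t_stat:.2f}")
```

Turning the t statistic into a p-value requires the Student t distribution with n − 2 degrees of freedom, which the Python standard library does not provide; `scipy.stats.linregress` is one function that reports the p-value directly.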


Figure 4: Trend and regression line for average temperature (by year) in Fresno, Calif.

I hope that this little bit of data analysis will encourage our readers to think about this problem for themselves – specifically by obtaining their own data and repeating (or expanding upon) the work we do here. In the interests of scientific exploration, the code for this analysis is posted here. Given the upgrades in computing and the availability of data, concerned citizens can do their own homework, now and in the future.

Harrison Schramm, CAP, PStat, is a principal operations research analyst at CANA Advisors, LLC, and a member of INFORMS. The author would like to thank his intern, Jesse Ruediger, for drawing attention to this problem, and also for collecting the data used in the analysis. The author would also like to thank his colleague Cara Albright for introducing him to NOAA data.
