Analytics Magazine

Five-Minute Analyst: State of the Union

September/October 2014


By Harrison Schramm, CAP

Lately I’ve been interested in textual data, which has opened a whole new world of things to think – and write – about. One interesting thing about text data is that the entire world of the written word becomes your analytic garden. While exploring this garden, I thought it would be interesting to take a look at the presidents’ State of the Union addresses through the years.

The State of the Union is an annual report from the president of the United States to Congress. It can be a venue for rolling out new policies and strategies. We can safely assume that each administration takes the preparation and delivery of this speech very seriously and puts the best resources it has into it. Therefore, the addresses may be considered a “snapshot” of the writing style of their time. The speeches for all the presidents can be found in a number of places; I used the American Presidency Project [1]. For this analysis, we consider the first-term speeches of the following presidents: Madison, Lincoln, Kennedy, Clinton, Bush (George W.) and Obama.

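Calculating “readability” via machine methods seems difficult at first. Fortunately, there are a number of methods available. The one that I decided to use is the “Flesch-Kincaid Grade Level” [2], given by:

$$\text{FK grade level} = 0.39\left(\frac{\text{total words}}{\text{total sentences}}\right) + 11.8\left(\frac{\text{total syllables}}{\text{total words}}\right) - 15.59$$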

This test has two desirable properties. First, it is straightforward to calculate because words, sentences and syllables are easily counted by machine. Second, it is invariant to the meaning of the specific passages, so it can be applied equally to samples written in different styles or time periods. The second desirable property is also its main drawback. Specifically, the term “grade level” can be misleading, because it applies only to structure and not meaning. For example:

“It is a far, far better thing that I do than I have ever done; it is a far, far better rest that I go to than I have ever known” [3], and “I went to the grocery store, bought some rye bread and ate it all up” [4].

The statements are clearly written at different intellectual levels, but both score 3.6 on the Flesch-Kincaid scale.
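As a rough check, the grade level of the grocery store sentence can be reproduced by hand. The sketch below is only an illustration and not the koRpus code used for this column: the function names are hypothetical, the word count comes from a simple regular expression, and the syllable count (17, treating “grocery” as three syllables) is my own hand count, since counting syllables reliably by machine needs a dictionary or a heuristic.

```python
import re


def fk_grade(n_words, n_sentences, n_syllables):
    """Flesch-Kincaid grade level from raw counts."""
    return 0.39 * (n_words / n_sentences) + 11.8 * (n_syllables / n_words) - 15.59


def count_words(text):
    """Count runs of letters and apostrophes as words (a simplifying assumption)."""
    return len(re.findall(r"[A-Za-z']+", text))


sentence = "I went to the grocery store, bought some rye bread and ate it all up"

n_words = count_words(sentence)  # 15 words
n_sentences = 1                  # a single sentence
n_syllables = 17                 # hand count (assumption): "grocery" = 3 syllables

grade = fk_grade(n_words, n_sentences, n_syllables)
print(round(grade, 1))  # 3.6
```

A dedicated package may tokenize sentences and count syllables differently, so its scores can deviate somewhat from a hand calculation like this one.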

Many packages are available to do this type of analysis; the analysis that follows was done using the “koRpus” [5] package for R. Our exploration of some presidents’ addresses is presented in Figure 1.

Figure 1: Flesch-Kincaid (FK) grade level of State of the Union addresses for selected presidents. The data suggest a decline in complexity from Madison and Lincoln to the present. The most recent three presidents (Clinton, G.W. Bush and Obama) are statistically indistinguishable as measured by FK (ANOVA p = .66). This is a reflection of the writing style of the time, more than the education level of the various presidents (and their staffs). Lincoln and Kennedy are similar (p = .25), while Madison was writing at a different grade level (p = .001).

For sake of comparison, Lincoln’s Gettysburg Address [6] has a grade level of 11.5, The Enchiridion by Epictetus [7] has a grade level of 7.8, and my June Five-Minute Analyst column [8] has a grade level of 8.6. Conversely, “The Cat in the Hat” by Dr. Seuss, consisting of short sentences and monosyllabic words, scores -.36.
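A negative grade level is nothing more than arithmetic on the Flesch-Kincaid formula. As an illustration, take a hypothetical passage (not drawn from the book) in which every sentence has five words of one syllable each:

$$0.39\left(\frac{5}{1}\right) + 11.8\left(\frac{5}{5}\right) - 15.59 = 1.95 + 11.8 - 15.59 = -1.84$$

The constant of 15.59 outweighs the other two terms whenever sentences and words are both short enough.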

Does This Matter?

The writing style of the presidents is not only a reflection of themselves, but also of the times in which they live and the audience to which they are speaking. It should not surprise us that in the modern era, presidents speaking to the entire electorate in real time via TV and radio have a lower grade level than Madison, who was speaking to a smaller audience. Those who wish for a more “intellectual” discourse with our leaders should consider the opening paragraph of Madison’s 1809 address:

“At the period of our last meeting I had the satisfaction of communicating an adjustment with one of the principal belligerent nations, highly important in itself, and still more so as presaging a more extended accommodation. It is with deep concern I am now to inform you that the favorable prospect has been over-clouded by a refusal of the British government to abide by the act of its minister plenipotentiary, and by its ensuing policy toward the United States as seen through the communications of the minister sent to replace him.”

It will be interesting in future years to see if the apparent difficulty of texts stabilizes, increases or decreases. At this moment, I could believe any of the three outcomes.

Harrison Schramm is an operations research professional in the Washington, D.C., area. He is a member of INFORMS and a Certified Analytics Professional (CAP).

Notes & References


2. Kincaid, J. P., 1975, “Derivation of New Readability Formulas for Navy Enlisted Personnel,” Research Branch Report 8-75, Millington, Tenn.

3. The closing of “A Tale of Two Cities” by Charles Dickens.

4. A sentence I just made up for this purpose.



7. As translated by George Long:


