Analytics Magazine

Five-Minute Analyst: State of the Union

September/October 2014

By Harrison Schramm, CAP

Lately I’ve been interested in textual data, which has opened a whole new world of things to think – and write – about. One interesting thing about text data is that the entire world of the written word becomes your analytic garden. While exploring this garden, I thought it would be interesting to take a look at the presidents’ State of the Union addresses through the years.

The State of the Union is an annual report from the president of the United States to Congress. It can be a venue for rolling out new policies and strategies. We can safely assume that each administration takes the preparation and delivery of this speech very seriously, and puts the best resources it has into it. Therefore, the addresses may be considered a “snapshot” of the writing style of their time. Speeches for all the presidents can be found in a number of places; I used the American Presidency Project [1]. For this analysis, we consider the first-term speeches by the following presidents: Madison, Lincoln, Kennedy, Clinton, Bush (George W.) and Obama.

Calculating “readability” via machine methods seems difficult at first. Fortunately, there are a number of methods available. The one that I decided to use is the “Flesch-Kincaid Grade Level” [2], given by:

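FK grade level = 0.39 × (total words / total sentences) + 11.8 × (total syllables / total words) − 15.59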

This test has two desirable properties. First, it is straightforward to calculate because words and syllables are easily counted by machine. Second, it is invariant to the meaning of the specific passages, so it can be applied equally to samples written in different styles or time periods. This second property is also its main drawback: the term “grade level” can be misleading, because it reflects only structure and not meaning. For example:

“It is a far, far better thing that I do than I have ever done; it is a far, far better rest that I go to than I have ever known” [3], and “I went to the grocery store, bought some rye bread and ate it all up” [4].

The statements are clearly written at different intellectual levels, but both score 3.6 on the Flesch-Kincaid scale.
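
As a rough illustration of how such a score is computed, here is a minimal Python sketch of the Flesch-Kincaid grade level. It is not the koRpus implementation: the syllable counter is a crude vowel-group heuristic, and treating a semicolon as a sentence break is an assumption made only for this sketch. All function names are hypothetical.

```python
import re


def fk_grade(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid grade level computed from raw counts."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def count_syllables(word: str) -> int:
    """Crude heuristic (an assumption): one syllable per vowel group,
    minus one for a silent final 'e', never fewer than one."""
    word = word.lower()
    n = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and n > 1:
        n -= 1
    return max(n, 1)


def fk_from_text(text: str) -> float:
    """Score a passage; '.', '!', '?' and ';' all end a sentence here
    (the semicolon rule is an assumption of this sketch)."""
    sentences = [s for s in re.split(r"[.!?;]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return fk_grade(len(words), len(sentences), syllables)


DICKENS = ("It is a far, far better thing that I do than I have ever done; "
           "it is a far, far better rest that I go to than I have ever known.")
GROCERY = "I went to the grocery store, bought some rye bread and ate it all up."

if __name__ == "__main__":
    for label, text in (("Dickens", DICKENS), ("Grocery", GROCERY)):
        print(label, round(fk_from_text(text), 2))
```

Under these assumptions the two sentences come out near 3.8 and 3.6. Small differences from a package such as koRpus are to be expected, because tokenization and syllable-counting rules differ between implementations.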

Many packages are available to do this type of analysis; the analysis that follows was done using the “koRpus” [5] package for R. Our exploration of some presidents’ addresses is presented in Figure 1.

Figure 1: Flesch-Kincaid (FK) grade level of State of the Union addresses for selected presidents. The data suggests a decline in complexity from Madison and Lincoln to the present. The most recent three presidents (Clinton, G.W. Bush and Obama) are statistically indistinguishable as measured by FK (ANOVA p = .66). This is a reflection of the writing style of the time, more than the education level of the various presidents (and their staffs). Lincoln and Kennedy are similar (p = .25), while Madison was writing at a different grade level (p = .001).
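
The caption reports an ANOVA p-value. As a sketch of what that comparison involves, the following Python computes the one-way ANOVA F statistic from scratch. The grade-level numbers are made up for illustration and are not the data behind Figure 1; the function name is hypothetical.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric samples."""
    k = len(groups)                       # number of groups
    n = sum(len(g) for g in groups)       # total number of observations
    grand = mean([x for g in groups for x in g])
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    # Variation of observations around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


if __name__ == "__main__":
    # Hypothetical FK grade levels for four speeches each (NOT real data)
    president_a = [9.1, 8.7, 9.4, 8.9]
    president_b = [8.8, 9.3, 9.0, 9.2]
    president_c = [9.0, 8.6, 9.5, 9.1]
    print(one_way_anova_f([president_a, president_b, president_c]))
```

A small F means the group means differ little relative to the spread within each group, which is what a large p-value such as .66 reflects. Converting F to a p-value requires the F distribution with (k - 1, n - k) degrees of freedom; in Python, scipy.stats.f_oneway performs the whole test in one call.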

For the sake of comparison, Lincoln’s Gettysburg Address [6] has a grade level of 11.5, The Enchiridion by Epictetus [7] has a grade level of 7.8, and my June Five-Minute Analyst column [8] has a grade level of 8.6. At the other extreme, “The Cat in the Hat” by Dr. Seuss, consisting of short sentences and monosyllabic words, scores -0.36.

Does This Matter?

The writing style of the presidents is not only a reflection of themselves, but also of the times in which they live and the audience to which they are speaking. It should not surprise us that in the modern era, presidents speaking to the entire electorate in real time via TV and radio have a lower grade level than Madison, who was speaking to a smaller audience. Those who wish for a more “intellectual” discourse with our leaders should consider the opening paragraph of Madison’s 1809 address:

“At the period of our last meeting I had the satisfaction of communicating an adjustment with one of the principal belligerent nations, highly important in itself, and still more so as presaging a more extended accommodation. It is with deep concern I am now to inform you that the favorable prospect has been over-clouded by a refusal of the British government to abide by the act of its minister plenipotentiary, and by its ensuing policy toward the United States as seen through the communications of the minister sent to replace him.”

It will be interesting in future years to see whether the apparent difficulty of texts stabilizes, increases or decreases. At this moment, I would believe any of the three outcomes.


Harrison Schramm (harrison.schramm@gmail.com) is an operations research professional in the Washington, D.C., area. He is a member of INFORMS and a Certified Analytics Professional (CAP).

Notes & References

1. http://www.presidency.ucsb.edu/sou.php#menu

2. Kincaid, J. P., 1975, “Derivation of New Readability Formulas for Navy Enlisted Personnel,” Research Branch Report 8-75, Millington, Tenn.

3. The closing of “A Tale of Two Cities” by Charles Dickens.

4. A sentence I just made up for this purpose.

5. http://cran.r-project.org/web/packages/koRpus/index.html

6. http://www.abrahamlincolnonline.org/lincoln/speeches/gettysburg.htm

7. As translated by George Long: http://www.ptypes.com/enchiridion.html

8. http://www.analytics-magazine.org/july-august-2014/1080-five-minute-analyst-probabilistic-parking-problems
