Simply put, Textual Analysis involves quantitatively studying the use of language in a given text, looking at the frequency with which words are used, rather than qualitatively studying the text itself. My initial reaction to this type of analysis was that it was counterintuitive, even counterproductive, with regard to studying history, since history is based almost entirely on qualitatively studying documents and texts. But as the article in Science Magazine, “Quantitative Analysis of Culture Using Millions of Digitized Books”, the Mining the Dispatch project, and the Google Ngram Viewer all demonstrate, taking a new and different approach to dealing with text can radically alter how we think about studying the past.

The article in Science Magazine deals with a project to digitize roughly 5 million books, about 4% of all books ever printed, in order to better understand changes in language, collective memory, culture, and censorship, among other things. By looking at the frequency with which a particular word is used, the authors were able to learn a great deal about how changes in language reflect changes in culture caused by historical events. One of their examples involves looking at how frequently the terms “The Great War”, “World War I”, and “World War II” are used. The Google Ngram Viewer is a Google project which applies textual analysis tools to the vast number of books digitized in the Google Books database. Looking at how often these terms are used between 1914 and 2013 in the Ngram Viewer, you can clearly see how popular usage of the term “Great War” dropped off significantly after the 1940s, replaced by “World War I”, alongside a significant increase in the use of “World War II”.
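To make the idea of "frequency" concrete, here is a minimal sketch in Python of how the share of a phrase per year could be computed. It is not the method used by the article or by the Google Ngram Viewer. The function names, the tiny sample corpus, and the choice to divide by the number of same-length word sequences in each year are all assumptions made for illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def phrase_frequency_by_year(documents, phrase):
    """Return {year: share of that year's n-grams that equal `phrase`}.

    `documents` is an iterable of (year, text) pairs. This is a hypothetical
    helper for illustration, not an API of any real tool.
    """
    target = tuple(tokenize(phrase))
    n = len(target)
    matches = defaultdict(int)
    totals = defaultdict(int)
    for year, text in documents:
        tokens = tokenize(text)
        # Slide a window of n words across the text to build the n-grams.
        for gram in zip(*(tokens[i:] for i in range(n))):
            totals[year] += 1
            if gram == target:
                matches[year] += 1
    return {year: matches[year] / totals[year] for year in sorted(totals)}


# Invented sample sentences, standing in for a real digitized corpus.
sample_documents = [
    (1920, "The Great War changed Europe"),
    (1950, "World War I and World War II reshaped the century"),
]

for term in ("Great War", "World War I"):
    print(term, phrase_frequency_by_year(sample_documents, term))
```

Dividing by the yearly total matters because far more books were printed in later years, so raw counts alone would make almost every term look like it was on the rise.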

There are some problems with this kind of approach to studying history. Simply looking at how often a term is used gives the reader no context at all. It gives no reason for the decline in “Great War”, or for the increase in “World War I” or “World War II”. As people studying history, we of course know that the term “World War I” did not come into common use until around the start of the Second World War, because until then there was no need to differentiate between the Great War which began in 1914 and the war which began in 1939.

However, this lack of context can provide very interesting opportunities for historians. One example is the Mining the Dispatch project, which applies Textual Analysis tools to digitized copies of the Richmond Daily Dispatch newspaper from between 1860 and 1865, in an attempt to better understand the city which played such a significant role in the American Civil War. By searching for and graphing various terms, the project was able to find specific trends in their use in the newspaper, giving a deeper look into life in the city.

Tools such as the Google Ngram Viewer definitely provide some rather fun and interesting opportunities, both for historians and for people who are generally interested in cultural trends. I personally had quite a bit of fun searching terms related to my own interests, and then trying to figure out why each term is graphed the way it is. For example, I indulged the somewhat geekier side of me and searched “Godzilla”, looking specifically at English-language texts. The graph shows little-to-no use of the term until the early 1980s, then a steady increase until its use spiked in 1993, followed by a sharp decline after 2000 that leveled off in the mid-2000s at a relatively high rate of usage. With a bit of reading, I figured that since the search was limited to English-language (i.e., Western) sources, the term wasn’t popularized until the early 1980s, when some of the Japanese films began to be edited and re-released for a Western audience. The release of Jurassic Park in 1993 accounts for the spike in its use in that year, due in part to the “giant lizard-monster” theme of the movie and to talks of an American remake around that same time. Use spiked in 1998 with the release of that remake and then declined sharply, but it settled at a steady rate significantly higher than it had been prior to 1993, as the character and series became integrated into Western culture.

Some of the other interesting terms I searched, and then felt compelled to research, include medical terms such as “Amputation” and “Amputate”, which show a high rate of use until the 1790s, when the words’ use declined as the procedure became less necessary. I also searched terms such as “Communism” and “Communist”, which show a slow and steady increase, peaking during the American Red Scares, and then slowly declining after 1990 with the fall of the Berlin Wall and the Soviet Bloc.