Text Analysis

Working with large collections of historical records such as letters, transcribed interviews, and newspapers can be challenging and time-consuming. Text analysis, sometimes called text mining, is a method of “reading” that treats words as data and allows us to read texts in new ways. It covers a range of techniques for statistically analyzing a corpus, that is, a collection of texts.

With computing technology, we can ask questions like “Do Charles Dickens’s novels use more words related to smell than most other novels?”, “Did George Washington’s letters to his wife have different sentiments (emotional words) than letters to his generals?”, or “How did the topics in State of the Union addresses change over the last century?” To answer these questions, text analysis helps us find patterns of word usage. We can count the most common words (term frequency) or identify groups of words that tend to appear together and suggest themes (topic modeling). We can use this information to make graphs showing how patterns change over time. We can even get a bird’s-eye view of a text with a “word cloud,” which displays the most common words in large type and less common words in small type.
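To make term frequency concrete, here is a minimal sketch in Python of how the most common words in a text might be counted. It is only an illustration, not the method used in any particular tool: the function name term_frequency, the short stop-word list, and the sample sentence (the opening of Dickens’s A Tale of Two Cities) are all assumptions chosen for this example.

```python
import re
from collections import Counter

# Hypothetical stop-word list for illustration only; real projects
# typically use a much longer list of very common words to ignore.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "was", "it", "is"}

def term_frequency(text, top_n=5):
    """Return the top_n most common words in text, skipping stop words."""
    # Lowercase the text and pull out runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return counts.most_common(top_n)

sample = "It was the best of times, it was the worst of times."
top_words = term_frequency(sample, top_n=3)
print(top_words)
```

A word cloud is essentially a picture of these counts: the higher a word’s count, the larger it is drawn. Which words go on the stop-word list is a judgment call, and changing it can change which words appear most prominent.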

In our History Harvest project, we used Voyant to build word clouds and other interactive visualizations that show readers how often words appear in the oral histories gathered about harvested objects.