Working with Voyant: A Review – Digital Public Humanities

Voyant is a Digital Humanities tool that allows a user to “see through your text,” as voyant-tools.org puts it. The software performs text analysis on sources in various formats, such as PDF files and plain text files. Voyant is a great digital tool for text analysis, but I found it difficult to generate results that helped describe how the interviews contrasted across states. First I had to choose two words from the Cirrus word cloud. I selected slavery and master because slavery appeared consistently across all states, while master was not in the top fifty for every state. Selecting words became difficult because certain words did not appear frequently within individual states or within the corpus as a whole. I could have selected different words, but other Voyant functions such as Trends and Reader did not produce a graph when I selected less frequently used words such as freedmen. The software and its functions are mostly straightforward, with a few exceptions such as stopwords. Voyant successfully analyzed the WPA Slave Narratives as plain text files and produced descriptive charts, graphs, and analysis of the interviews after I added stopwords to account for punctuation and for misspellings introduced by the OCR software.
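For readers who want to check outside of Voyant whether a word is a good candidate for comparison, here is a minimal Python sketch of the "top fifty in every state" test I describe above. It is not how Voyant works internally. The function names, the simple tokenizer, and the two tiny sample texts are all invented for illustration, and the cutoff is set to 2 instead of 50 so the example stays small.

```python
import re
from collections import Counter


def top_terms(text, n):
    """Return the n most frequent lowercase word tokens in a text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [term for term, _ in Counter(tokens).most_common(n)]


def terms_in_every_top_n(docs, n):
    """Return the terms that appear in the top n of every document."""
    top_sets = [set(top_terms(text, n)) for text in docs.values()]
    return set.intersection(*top_sets) if top_sets else set()


# Invented sample texts standing in for per-state interview files.
docs = {
    "state_a": "slavery slavery master master freedom",
    "state_b": "slavery slavery slavery work work master",
}

# With a cutoff of 2, only "slavery" is near the top in both samples.
shared = terms_in_every_top_n(docs, 2)
print(shared)
```

A word that lands in the shared set, like slavery in my case, is a safer choice for cross-state comparison than one that ranks high in only some states.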

My general review for new users begins with the idea and practice of trial and error. When the user selects words to begin their analysis, they will first have to eliminate stopwords and OCR misspellings such as em, dat, dey, and ol. I recommend keeping a collection of stopwords in a Google Doc. My first criticism of Voyant is that it does not carry over the stopwords I add from one Voyant file to the next. For example, if I add stopwords to a Voyant text analysis of the WPA Slave Narratives, those stopwords are not saved into a database. If I then open a new tab to analyze A Study in Scarlet by Arthur Conan Doyle, a public domain text obtained through Project Gutenberg, I have to add my earlier stopwords back into Voyant.
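As an alternative to a Google Doc, the same collection of stopwords can be kept as a plain list with one word per line and reused for any text. The Python sketch below shows that idea under my own assumptions. The function names and the sample sentence are invented, the four stopwords are the ones mentioned above, and nothing here calls Voyant itself.

```python
import re
from collections import Counter

# The custom stopwords mentioned in this post.
CUSTOM_STOPWORDS = {"em", "dat", "dey", "ol"}


def format_stopword_list(stopwords):
    """Return the stopwords sorted, one per line, ready to save or paste."""
    return "\n".join(sorted(stopwords))


def parse_stopword_list(text):
    """Read a one-word-per-line list back into a set, skipping blank lines."""
    return {line.strip().lower() for line in text.splitlines() if line.strip()}


def count_terms(text, stopwords):
    """Count lowercase word tokens, leaving out anything in the stopword set."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(t for t in tokens if t not in stopwords)


saved = format_stopword_list(CUSTOM_STOPWORDS)  # text you could keep in a file
restored = parse_stopword_list(saved)           # the same list, loaded again

# Invented sample sentence for illustration only.
counts = count_terms("Dey told em dat the master was gone", restored)
print(counts.most_common(3))
```

Keeping the list as plain text means it can be pasted into a new analysis each time instead of being retyped from memory.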

Next, the user should acknowledge and remain aware of biases within the WPA Slave Narratives. The interviews were conducted by the WPA Federal Writers’ Project from 1936 to 1938. One problem with the interviews is memory recall: former slaves may have been very young before, during, or after the Civil War and Reconstruction. Freedmen may not have remembered everything that occurred during their childhood, which could result in inaccurate narratives and missing data. Another source of bias is that freedmen may have been uncomfortable revealing information to an interviewer of a different race. Nevertheless, the WPA Slave Narratives are an essential primary source for learning what slavery was like for those who lived through the 19th century.

After you add a PDF or plain text file to Voyant, you are directed to five tools to begin your text analysis: Cirrus, Reader, Trends, Summary, and Contexts. The Cirrus window on the top left contains the Cirrus word cloud, Terms, which tracks frequently used words, and Links, which creates a network of frequent words. A slider can be moved to include more or fewer words in the word cloud. The second window is Reader, which contains the specific text file you are studying. The third window is Trends, which contains a graph of frequent terms within the corpus. You can hover your cursor over the upper right of that window and select the square with the arrow pointing to the top right to export the graph shown as a visualization, then select “a URL for this view (tools and data),” which will open in a new window. A user is free to create the visualization or cancel at any time, and can reset the graph with the reset button located under the graph in the Trends window.

The blue Voyant Tools banner at the top of the screen has a rectangular, Microsoft-like logo on the top left that lets you select different tools, such as corpus tools, document tools, visualization tools, grid tools, and other tools. The key to effective text analysis and data visualization is to eliminate redundant stopwords and, if the source is a file such as one from Project Gutenberg, common words and any resulting misspellings.

I believe that using Voyant effectively requires trial and error, as well as the patience to try new ideas and methods, from uploading a plain text file to experimenting with the Cirrus word cloud, working with the graph, and exporting a visualization. The user should consult the question marks at the top right of each window for more information and guidance. The question mark at the top right of the webpage will take the user to this webpage through Voyant, where they can find answers on how to use the software beyond the general approach explained in this post.
