In December I spent two days at the Folger’s Visualizing English Print seminar. It brought together people from the Folger, the University of Wisconsin, and the University of Strathclyde in Glasgow; about half of us were literature people, half computer science; a third of us were tenure-track faculty, a third grad students, and a third in other types of research positions (e.g., librarians and DH directors).
Over those two days, we worked our way through a set of custom data visualization tools that can be found here. Before we could visualize, we needed and were given data: a huge corpus of nearly 33,000 EEBO-TCP-derived simple text files that had been cleaned up and spit through a regularizing procedure so that they would be machine-readable (with loss, obviously, of lots of cool, irregular features; the grad students who wanted to do big data studies of prosody were bummed to learn that all contractions and elisions had been scrubbed out). They also gave us a few smaller, curated corpora of texts, two specifically of dramatic texts, two others of scientific texts. If anyone wants a copy of this data, I’d be happy to hook you up.
From there, we did (or were shown) a lot of data visualization. Some of this was based on word-frequency counts, but the real novel thing was using a dictionary of sorts called DocuScope—basically a program that sorts 40 million different linguistic patterns into one of about 100 specific rhetorical/verbal categories (DocuScope was developed at CMU as a rhet/comp tool—turned out not to be good at teaching rhet/comp, but it is good at things like picking stocks). DocuScope might make a hash of some words or phrases (and you can revise or modify it; Michael Witmore tailored a DocuScope dictionary to early modern English), but it does so consistently and you’re counting on the law of averages to wash everything out.
After drinking the DocuScope Kool-Aid, we learned how to visualize the results of DocuScoped data analysis. Again, there were a few other cool features and possibilities, and I only comprehended the tip of the data-analysis iceberg, but basically this involved one of two things.
- Using something called the MetaData Builder, we derived DocuScope data for individual texts or groups of texts within a large corpus of texts. So, for example, we could find out which of the approximately 500 plays in our subcorpus of dramatic texts is the angriest (i.e., has the greatest proportion of words/phrases DocuScope tags as relating to anger). Or, in an example we discussed at length, within the texts in our science subcorpus, we could ask who used more first-person references, Boyle or Hobbes (i.e., which had the greater proportion of words/phrases DocuScope tags as first-person references). The CS people were quite skilled at slicing, dicing, and graphing all this data in cool combinations. Here are some examples. A more polished essay using this kind of data analysis is here. So this is the distribution of DocuScope traits in texts in large and small corpora.
- We visualized the distribution of DocuScope tags within a single text using something called VEP Slim TV. Using Slim TV, you can track the rise and fall of each trait within a given text AND (and this is the key part) link directly to the text itself. So, for example, this is an image of Margaret Cavendish’s Blazing-World (1667).
The red line charts lexical patterns that DocuScope tags as “Positive Standards.” You’ll see there is lots of blue (compared to red) at the beginning of Cavendish’s novel (when the Lady is interviewing various Bird-Men and Bear-Men about their scientific experiments), but one stretch in the novel where there is more red than blue (when the Lady is conversing with Immaterial Spirits about the traits of nobility). A really cool thing about Slim TV that could make it useful in the classroom: you can move through and link directly to the text itself (that horizontal yellow bar on the right shows which section of the text is currently being displayed).
So 1) regularized EEBO-TCP texts turned into spreadsheets using 2) the DocuScope dictionary; then use that data to visualize either 3) individual texts as data points within a larger corpus of texts or 4) the distribution of DocuScope tags within a single text.
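For anyone who wants to see the basic arithmetic behind steps 1 through 3, here is a minimal sketch in Python. It is not DocuScope or the MetaData Builder: the tiny dictionary, the category names, and the function names are all hypothetical stand-ins I made up for illustration, and the real dictionary is vastly larger. The idea is just that each text becomes one spreadsheet row giving the proportion of its words that fall under each category.

```python
import csv
import io
import re
from collections import Counter

# Hypothetical toy dictionary (NOT the real DocuScope dictionary):
# each phrase, stored as a tuple of lowercase words, maps to one category.
TOY_DICTIONARY = {
    ("i",): "FirstPerson",
    ("my",): "FirstPerson",
    ("i", "think"): "FirstPerson",
    ("rage",): "Anger",
    ("furious",): "Anger",
}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def category_proportions(text, dictionary):
    """Return {category: share of the text's words tagged with that category}.

    Matching is greedy: at each position the longest dictionary phrase wins.
    """
    tokens = tokenize(text)
    max_len = max(len(phrase) for phrase in dictionary)
    counts = Counter()
    i = 0
    while i < len(tokens):
        for n in range(max_len, 0, -1):
            phrase = tuple(tokens[i:i + n])
            if phrase in dictionary:
                counts[dictionary[phrase]] += len(phrase)
                i += len(phrase)
                break
        else:
            i += 1  # no phrase starts here; move on to the next word
    total = len(tokens) or 1  # avoid dividing by zero on an empty text
    categories = sorted(set(dictionary.values()))
    return {cat: counts[cat] / total for cat in categories}


def corpus_table(texts, dictionary):
    """Build CSV text with one row per text and one column per category."""
    categories = sorted(set(dictionary.values()))
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["text_id"] + categories)
    for text_id, text in texts.items():
        props = category_proportions(text, dictionary)
        writer.writerow([text_id] + [f"{props[c]:.3f}" for c in categories])
    return buffer.getvalue()
```

Calling `corpus_table({"play_a": "...", "play_b": "..."}, TOY_DICTIONARY)` returns CSV text that could be opened in a spreadsheet or handed to whatever graphing software you (or your CS collaborator) prefer; sorting on a column such as `Anger` is the toy version of asking which play is the angriest.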
Again, the seminar leaders showed some nice examples of where this kind of research can lead and a lot of cool-looking graphs. Ultimately, some of the findings were, if not underwhelming, at least just whelming: we had fun discussing the finding that, relatively speaking, Shakespeare’s comedies tend to use “a” and his tragedies tend to use “the.” Do we want to live in a world where that is interesting? As we experimented with the tools they gave us, at times it felt a little like playing with a Magic 8 Ball: no matter what texts you fed it, DocuScope would give you lots of possible answers, but you just couldn’t tell if the original question was important or figure out if the answers had anything to do with the question. So formulating good research questions remains, to no one’s surprise, the real trick.
A few other key takeaways for me:
1) Learn to love csv files or, better, learn to love someone from the CS world who digs graphing software;
2) Curated data corpora might be the new graduate/honors thesis. Create a corpus (e.g., sermons, epics, travel narratives, court reports, romances), add some good metadata, and you’ve got yourself a lasting contribution to knowledge (again, the examples here are the drama corpora or the science corpora). A few weeks ago, Alan Liu told me that he requires his dissertation advisees to have at least one chapter that gets off the printed page and has some kind of digital component. A curated data collection, which could be spun through DocuScope or any other kind of textual analysis program, could be just that kind of thing.
3) For classroom use, the coolest thing was VEP Slim TV, which tracks the prominence of certain verbal/rhetorical features within a specific text and links directly to the text under consideration. It’s colorful and customizable, something students might find enjoyable.
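To give a rough sense of what "tracking the prominence" of a feature within one text means, here is a small Python sketch. It is only an illustration under my own assumptions, not how VEP Slim TV is actually built: the function name, the window sizes, and the little word set standing in for a DocuScope category are all hypothetical. It slides a fixed-size window across the text and records what share of each window's words belong to the category.

```python
import re


def windowed_share(text, category_words, window=4, step=2):
    """Return a list of (start_position, share) pairs for one category.

    Each pair gives the index of the first word in a window and the
    fraction of that window's words found in category_words.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    profile = []
    # Always visit at least one window, even when the text is very short.
    last_start = max(len(tokens) - window + 1, 1)
    for start in range(0, last_start, step):
        chunk = tokens[start:start + window]
        hits = sum(1 for word in chunk if word in category_words)
        share = hits / len(chunk) if chunk else 0.0
        profile.append((start, share))
    return profile


# Hypothetical stand-in for one category's vocabulary.
positive_words = {"noble", "good"}
sample = "noble good noble good plain plain plain plain"
profile = windowed_share(sample, positive_words)
```

Plotting the shares against the start positions gives a rising-and-falling line for the category, and because each point keeps its word position, it can be traced back to the passage that produced it.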
All this stuff is publicly available as well. I’d be happy to demo what we did (or what I can do of what we did) to anyone who is interested.
Associate Professor. Hartford Campus.
Specialties: Renaissance (poetry and prose), law and literature, textual editing.
Hilary Bogert-Winkler (Ph.D. candidate, History) and J. Asia Rowe (Ph.D. ’16, English) are participating in the fall symposium, “Political Thought in Times of Crisis, 1640-1660,” sponsored by the Folger Institute Center for the History of British Political Thought. This symposium will examine the British crisis of the mid-seventeenth century as a global phenomenon, as well as explore the ways political thought interacted with other means of expressing change and instability during this period.
Nathan Braccio (Ph.D. candidate, History) is participating in the year-long “Researching the Archives” seminar, in which he will have the opportunity to use the Folger’s rich archival collections monthly as he works on his dissertation, “Clashing New Englands: Identity and the Parallel Geographies of Algonquian and English New England, 1600-1730.”
Professor Greg Kneidel (English) is participating in the seminar, “Visualizing English Print.” Funded by the Mellon Foundation, this seminar will introduce participants to ways of creating “scalable scholarship” that will, in combination with traditional methods of literary study, assist participants in developing approaches to large corpora such as the EEBO-TCP transcriptions.
Those interested in participating in the Folger Institute’s spring offerings (a list of which may be found here) should note that the application deadline for many of these is January 17.
Since it started in September, I have been attending the Folger Institute’s Year-Long Dissertation Seminar: Researching the Archive. While attending the seminar once a month, I have spent time using the collections and beautiful reading room. The reading room experience is one of the best, including stained glass and tapestries, tea time in the afternoon, and complimentary coffee in the cloak room. Friendly scholars populate each of these spaces, and afternoon tea in particular provides visitors with the opportunity to discuss their work with other scholars.
While the Folger’s collections focus on English published works, they are still extremely useful for an Americanist like me. I have spent most of my time looking at atlases, maps, and texts on surveying between 1570 and 1650. Christopher Saxton’s atlas of England has been especially interesting. Its beautifully colored and extremely detailed maps are a joy to look at and represent the cutting edge of English cartography in their time. They form the beginning of a cartographic genealogy that lasted for decades. But Saxton’s atlas and other English publications do not only inform the reader about English culture: they are the cultural texts that informed how English colonists understood North America.
16th and early 17th-century English texts are invaluable to Americanists who study the first few decades after colonization. It is important for us to remember that the ideas of the first settlers did not come from a void, but from a rich cultural and literary tradition in England. This tradition included not only religious texts and philosophical discussions, but technical manuals for skills like surveying as well. When the English began to survey and map America, it was from these texts that they drew their information. When they encountered moral dilemmas, they drew from English religious texts. One glance at the books held in the extensive libraries of important colonists like the Mather family confirms the importance of English literature for America.
The seminar itself is a two-and-a-half-hour discussion followed by a presentation from a visiting scholar. This year’s seminar is run by a historian, Keith Wrightson, and a literary scholar, James Siemon. The guest speakers have been great, and have included Andy Wood and Lena Orlin. The combination of historians and literary scholars provides a variety in the readings and discussions that is rare to find. Being the only Americanist in the seminar has been a great boon for me. The knowledge and perspectives of English historians and literary scholars have helped me rethink elements of my project or fill in gaps in my knowledge.
If you have the opportunity to attend the Dissertation Seminar at the Folger, I would highly recommend it. Washington is a great city to visit at any time of year, and the Folger is one of the most charming archives around. While its holdings are mostly rare books, it also has numerous manuscript collections and several fascinating maps and atlases. The seminar is a great way to meet and engage with interesting scholars from around the country, and I would highly recommend it to Americanist grad students.
Nathan Braccio is a Ph.D. candidate in the UConn History Department. He received his B.A. and M.A. in history from American University. His research focuses on the conflux of geography and identity in 17th and 18th century New England. More information on his research can be found on his webpage nathanbraccio.com. Contact him at firstname.lastname@example.org.
 Image from Luna online database, courtesy of Folger Library.
Prof. Kenneth Gouwens (UConn History) writing about his research trip to the Folger Shakespeare Library
It’s always a rich opportunity to visit the Folger. At the peak of the August heat wave, I spent two days in air-conditioned comfort working through rare books that I’d identified on an earlier trip as meriting more attention. Seated in the beautiful older wing, I first returned to Hieronymus Fabricius ab Aquapendente, one of the foremost anatomists in the initial generations after Vesalius’s On the Fabric of the Human Body (1543). As part of a larger project on the simian/human boundary in the Renaissance, I’ve analyzed just how Vesalius criticized the ancient physician Galen for dissecting Barbary apes in lieu of human cadavers. Following the lead of Aristotle, Fabricius devoted attention not just to the human but to a variety of animals to assess how they propel themselves, to what extent they are capable of vocalizing, etc. My interest had been piqued by his pointing out how both Galen and Vesalius had erred, the latter, for example, in describing the musculature of the feet: clearly Fabricius was not one to shy away from going toe-to-toe with the greats. It turns out, though, that he invokes simians little if at all in his corrections of Vesalius. In short, my hunch didn’t pan out, but I was able to find that out efficiently and now know better how Fabricius fits into the story I’m telling.
More productive was directly comparing two books on prodigies: one by the Alsatian humanist Conrad Lycosthenes and the other by the English cleric Stephan Batman. Only when going through my notes and photos (for study purposes) of images had I noticed how closely Batman’s English resembled the Latin of Lycosthenes’s text (I’d looked at them months apart, two years ago). Sure enough, Batman’s The Doome warning all men to Iudgemente (1581), which he had “gathered out of sundrie approved authors,” turns out to be mostly a close translation of Lycosthenes’ Prodigiorum ac ostentorum chronicon (1557). Examining the books side by side enabled me to see just how closely the illustrations in Batman’s book also mimicked those of its antecedent. For example, there’s a strong family resemblance between their portrayals of a baboon (pauyon), a hairy animal of India that enjoys fruit and lusts after human females. In both cases we are told about a specimen of this beast on display in Germany in 1551.
Batman’s image of the tailed ape (cercopithecus), by contrast, is modeled more loosely upon that in Lycosthenes, which in turn is obviously based on the highly influential image in Breydenbach’s 1486 Latin book on a pilgrimage to the Holy Land. So, in a brief time at the Folger, I was able to see the distinctions and similarities, both literary and artistic, in how knowledge was being transmitted among these authors.
Rare books were of course central to the trip, but I’d be remiss not to mention afternoon tea in the Folger’s basement. Rather like the coffee bar at the Vatican Library, it provides a locus for shop-talk with others working in the collection. I highly recommend to all researchers that they carve out time for the tea. In fact, that’s where I got some key tips on questions to ask about my favorite image in the Folger, an engraving of a monkey wearing a ruff. But that’s a subject for another time. Warm thanks to UConn’s Folger Committee for making this trip possible!
We are pleased to announce the launch of the UConn Early Modern Studies Working Group, a program designed to foster community and collaboration among scholars and students of the early modern period. The Working Group will feature lectures and works-in-progress talks by UConn scholars and outside guest speakers, as well as other events related to early modern studies. The series is funded by the Humanities Institute in an effort to build upon the momentum created by UConn’s recent association with the Folger Shakespeare Library.
It is our hope that this program will have broad interdisciplinary appeal to anyone interested in the early modern period, including undergraduates, graduate students, and faculty. We will update this blog with news on upcoming events.
Hilary Bogert-Winkler, Ph.D. candidate, History
George Moore, Ph.D. candidate, English