Datamining The Classics, Ctd

In the growing field of “digital humanities,” researchers apply the datamining capabilities of computers to the history of literature. Dana Mackenzie addresses “perhaps the most frequently heard refrain in the criticisms of digital humanities: Where’s the beef? Where are the great insights?”:

Supporters argue that the digital humanities have produced new insights, but that the constellations of meaning it generates are not the kinds of insights humanists are used to. For example, when Ted Underwood, an English professor at the University of Illinois, topic-modeled 4,275 books written between 1700 and 1900, he noticed that changes in literature happen more gradually than we give them credit for.

For the first hundred years of that period, for example, the proportion of “old” Anglo-Saxon words in use declined. But over the century that followed, literature trifurcated. In poetry, the use of “old” words increased markedly. In fiction, “old” words also became more popular, but less dramatically. In nonfiction, however, the frequency of “old” words remained unchanged from the previous century. The data reflected a complex set of historical processes—the emergence of fiction and poetry that self-consciously broke from classical themes and instead treated the experiences of common people.

Such a change had often been attributed to the romantic school, but the data showed it playing out over a much longer period of time and continuing long after the romantics were supposedly passé. “Our vocabulary is all schools, movements, periods, cultural turns,” Underwood says. “If you have a trend that lasts a century or more, it’s really hard to grapple with.”

Digital humanities technologies can help us see gradual changes, whether in literature or elsewhere. Humans have difficulty comprehending change that happens on the time scale of a human life, or longer. If Underwood’s hypothesis is correct, we need computers to help fill in our blind spot. Topic modeling does not overturn or replace our previous ways of seeing; it enhances them. “It is not a substitute for human reading, but a prosthetic extension of our capacity,” says Johanna Drucker, a professor of information studies at UCLA.
