“Today’s student of literature must be adept at gathering evidence from individual texts and equally adept at mining digital text repositories,” according to English professor Matthew Jockers. John Sunyer reports on Jockers’s controversial “big data” approach to literature:
[Jockers] holds the distinction of being the first English professor to assign more than 1,200 novels in one class. “Luckily for the students, they didn’t have to read them,” he says.
In his recent book Macroanalysis: Digital Methods & Literary History (2013), Jockers publishes a list of the most influential writers of the 19th century. The study is based on an analysis of 3,592 works published from 1780 to 1900, he explains. It took a lot of digging, and a computer did it by cross-checking about 700 variables across the sample, including, for example, word frequencies and the absence or presence of themes such as death.
“Literary history would tell you to expect Charles Dickens, Thomas Hardy and Mark Twain to be at the top of the list,” says Jockers. But the data revealed that Sir Walter Scott and Jane Austen had the greatest effect on other authors, in terms of writing style and themes.
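To give a rough sense of what comparing texts by word frequency can look like, here is a minimal sketch in Python. It is not Jockers's actual method, which the report says drew on about 700 variables, including themes. It only computes each word's share of a text and compares two texts by the cosine similarity of those shares. The function names (`word_frequencies`, `cosine_similarity`, `rank_by_similarity`) and the toy texts are hypothetical, chosen for illustration only.

```python
import math
import re
from collections import Counter


def word_frequencies(text):
    """Return each word's share of the total word count in `text`."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {word: count / total for word, count in counts.items()}


def cosine_similarity(freq_a, freq_b):
    """Cosine similarity between two word-frequency dictionaries."""
    shared = set(freq_a) & set(freq_b)
    dot = sum(freq_a[w] * freq_b[w] for w in shared)
    norm_a = math.sqrt(sum(v * v for v in freq_a.values()))
    norm_b = math.sqrt(sum(v * v for v in freq_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(target_text, corpus):
    """Rank the titles in `corpus` (title -> text) by similarity to `target_text`."""
    target = word_frequencies(target_text)
    scores = [
        (title, cosine_similarity(target, word_frequencies(text)))
        for title, text in corpus.items()
    ]
    # Highest similarity first.
    return sorted(scores, key=lambda pair: pair[1], reverse=True)
```

Calling `rank_by_similarity(some_text, {"Title A": text_a, "Title B": text_b})` returns the titles ordered from most to least similar in word use. A real study of influence would need far more than this, such as publication dates to establish who came first and many additional stylistic and thematic features.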