While pontificating about the dangers of digitizing the world's literature, Stephen Marche drops a great anecdote about how Google Books corrects its scanned text:
[Larry Page's engineers] crowd-sourced textual correction at a minimal cost through a brilliant program called reCAPTCHA, which employs an anti-bot service to get users to read and type in words the Optical Character Recognition software can’t recognize. (A miracle of cleverness: everyone who has entered a security identification has also, without knowing it, aided the perfection of the world’s texts.)
But he doesn't trust the instinct to treat literature like data:
Algorithms are inherently fascistic, because they give the comforting illusion of an alterity to human affairs. "You don’t like this music? The algorithms have worked it out" is not so far from "You don’t like this law? It works objectively." Algorithms have replaced laws of human nature, the vital distinction being that nobody can read them. They describe human meanings but are meaningless.
Which is why algorithms, exactly like fascism, work perfectly, with a sense of seemingly unstoppable inevitability, right up until the point they don’t. During the Flash Crash of May 6, 2010, the Dow Jones lost nine percent of its value in five minutes. More recently, Knight Capital lost 440 million dollars at a rate of about 10 million dollars a minute due to what it called "a rogue algorithm." Algorithms cannot, of course, be rogue. But rogue is the term we have invented for algorithms that don’t do what they’re supposed to, which is as much as to say that their creators don’t comprehend what they’re doing.
We'll go digital, he acknowledges, but "insight remains handmade."