Word Clouds Are Evil

Oct 14 2011 @ 7:55pm

[Image: word cloud]

Jacob Harris pens a manifesto:

Word clouds support only the crudest sorts of textual analysis, much like figuring out a protein by getting a count only of its amino acids. This can be wildly misleading; I created a word cloud of Tea Party feelings about Obama, and the two largest words were implausibly “like” and “policy,” mainly because the important word “don’t” was automatically excluded. (Fair enough: Such stopwords would otherwise dominate the word clouds.) A phrase or thematic analysis would reach more accurate conclusions.
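To make the stopword problem concrete, here is a minimal Python sketch of the effect Harris describes. The sentences and the stopword list are invented for illustration and are not his data. Dropping "don't" as a stopword leaves "like" and "policy" on top, while a simple two-word phrase count brings the negation back:

```python
from collections import Counter
import re

# Invented sentences for illustration only; not Harris's data.
COMMENTS = [
    "I don't like his policy",
    "We don't like this policy",
    "They don't like the policy at all",
]

# A small hypothetical stopword list; real lists are much longer.
STOPWORDS = {"i", "we", "they", "his", "this", "the", "at", "all", "don't"}


def tokenize(text):
    # Lowercase and keep apostrophes so "don't" stays a single token.
    return re.findall(r"[a-z']+", text.lower())


def word_counts(comments, stopwords):
    # What a word cloud sizes its words by: single-word frequencies
    # after stopwords have been filtered out.
    counts = Counter()
    for comment in comments:
        counts.update(t for t in tokenize(comment) if t not in stopwords)
    return counts


def bigram_counts(comments):
    # A crude phrase analysis: count adjacent word pairs, stopwords included.
    counts = Counter()
    for comment in comments:
        tokens = tokenize(comment)
        counts.update(zip(tokens, tokens[1:]))
    return counts


words = word_counts(COMMENTS, STOPWORDS)
bigrams = bigram_counts(COMMENTS)

print(words.most_common(2))
print(bigrams.most_common(1))
```

On this toy input the single-word counts suggest people "like" the "policy," while the most frequent pair is "don't like." Counting pairs is only one rough stand-in for the phrase or thematic analysis Harris has in mind.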

(Word cloud of Tiger Woods's sexts by Andrew Sedotal)