Guest Post: Word Cloud Dog Vomit, An Illustrated Rant

My colleague Humphrey Costello delivers the funniest, snarkiest Ignite sessions at the American Evaluation Association’s annual conference. I’m so happy to give him this platform to articulate the answer to one of the questions I am asked the most, about the phenomenon known as word clouds.

1st ingredient: Long ago, I promised Stephanie a blog entry on word clouds.

2nd ingredient: AEA 2013 just happened.

3rd ingredient: It was a thrill to find the latest copy of New Directions on data viz in my mailbox. Congratulations to Tarek, Stephanie, and everyone else involved!

Blend: Here is a word cloud of all AEA 2013 conference program session titles. I made it using Wordle. (The idea of playing around with conference session titles is not original. My partner, a sociologist, showed me a blog post that examined what sociologists were up to at ASA 2012—lots of examining implications and transgressing boundaries, IIRC.)

Through no fault of Wordle (which is easy and fun), this word cloud is crap. If you’re wondering what will go on at AEA 2013, this word cloud shows that there will be sessions on evaluation. Not very helpful.

Deleting the terms ‘evaluation,’ ‘rotation,’ and ‘roundtable’ from the text yields this:

In their article in New Directions, Henderson and Segal say “when evaluators hear of the idea of visualizing qualitative data, a word cloud may be the first things that comes to mind.” Henderson and Segal are not advocating the use of word clouds; that sentence is immediately followed by “However,” and a litany of reservations.

Frankly, I think Henderson and Segal work too hard to find redeeming qualities and possibilities for word clouds. (They generously suggest that word clouds might be made useful if one could click on the words and be taken to relevant text.) No qualitative evaluator I know—not even those who maintain that quantitative methods are part and parcel of the oppressive positivist-corporate/capitalist patriarchy—would assert that counting words constitutes decent analysis.

Drawing on their erudition, here are a couple of cheap shots on word clouds:

Word clouds don’t show words’ relative importance accurately. Frequency = font size, regardless of word length. To my eye ‘Approaches’ looks bigger, weightier, than ‘Social’ even though both words are in the same font size and appear with the same frequency in the text. ‘Approaches’ takes up more real estate, and grabs more attention, just because it is a longer word.
Word clouds ignore words’ contexts. Without context, it is hard to know what is meant by words in the cloud. Is ‘Capacity’ evaluators’ capacity, their clients’ capacity, or the maximum amount their tummies can contain? (Gratuitous distraction: What are words for? https://www.youtube.com/watch?v=IasCZL072fQ)
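
To put a rough number on the first cheap shot, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the three session titles are invented stand-ins (not the real AEA program), the names word_counts and footprint are hypothetical, and the "each letter is about 0.6 of the font size wide" rule is a crude guess, not how Wordle actually lays out text. It counts words after dropping the same three terms, then estimates how much real estate two equally frequent words would occupy.

```python
import re
from collections import Counter

# Hypothetical stand-in titles; the real AEA 2013 program text is not reproduced here.
titles = [
    "Social Approaches to Capacity Building",
    "Approaches for Social Change Evaluation",
    "Evaluation Capacity Roundtable",
]

# Terms removed before counting, mirroring the second cloud in the post.
DROPPED = {"evaluation", "rotation", "roundtable"}


def word_counts(texts, dropped=DROPPED):
    """Count lowercase words across all texts, skipping the dropped terms."""
    words = (w for t in texts for w in re.findall(r"[a-z]+", t.lower()))
    return Counter(w for w in words if w not in dropped)


def footprint(word, count, points_per_count=10):
    """Rough on-screen area of a word whose font size scales with its count.

    Assumption: each letter is about 0.6 of the font size wide, so
    area is roughly (letters * 0.6 * size) wide by (size) tall.
    """
    size = count * points_per_count
    return len(word) * 0.6 * size * size


counts = word_counts(titles)
for word in ("social", "approaches"):
    print(word, counts[word], round(footprint(word, counts[word])))
```

In this toy data 'social' and 'approaches' get the same count and therefore the same font size, yet the longer word claims noticeably more area under the assumed width rule. Same frequency, bigger shout.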

The bigger problem is that this word cloud is still near useless as an illustration of what took place at AEA. I challenge you to use this word cloud to come up with a single clear sentence about important themes of the conference.

Have you seen a useful word cloud? Link to it in the comments!

In the meantime, check out my new book, which doesn’t have a single word cloud in it. Or indulge me with a rant of your own at one of my upcoming events.

Rants are so fun, but we also need to draw inspiration from great examples – so check out the good stuff at thumbsupviz!
