Even a new kid on the block (my favorite was Donny) knows that statistics can lie. And charts can aid and abet. To do our audience justice, we have to visualize data responsibly.
I sound like I’m on some moral high ground, but even after 20 years in the data world I still fall into some traps and only recognize them when I start feeling the ick.
Here’s a roundup:
Creating aspirational dashboards.
Friends, I found myself on yet another grant where the lead staff wanted me to create dashboards out of nothing. I mean, other consultants were charged with helping the grantees write vision statements, identify metrics to measure that vision, and set up data collection for those metrics.
The actual data wasn’t coming for years.
Hell, the metrics weren’t even going to be finalized for each grantee until 6 months into the project, at best.
But I was supposed to be creating dashboards… with what on them?
I’ve been in this situation before, where a client wants me to make aspirational dashboards.
As in, if we build it, they will come. Or at least, the grantees will get so dazzled by the dashboard they’ll be more motivated to get through all the earlier stages of visioning, metric selection, and data collection. The dashboards were dangling carrots.
But you know the truth – this isn’t a process you can rush. And I knew that whatever mythical metrics I picked were destined to change. Several times over. Which means remaking each dashboard, on repeat.
It isn’t wise, sustainable, or responsible to make aspirational dashboards.
Conveniently leaving out important data.
There’s an infographic of the daily routines of a bunch of historical “creatives.” Here I’m showing you the Benjamin Franklin example. The rest are all white men (except Maya Angelou – so curious why she’s always the nod to diversity in things like this).
Presumably we’re here to study how to craft our own days to be just as productive and prolific.
Except these make no mention of Benjamin Franklin’s slaves. Or his wife. Who made it possible for him to spend the bulk of his time thinking and dining and taking the resolution of the day.
It’s irresponsible to visualize a subset of the data that doesn’t tell the whole story.
Converting small numbers to percentages.
Back in grad school stats classes, I was taught it was irresponsible to run inferential statistics on datasets smaller than 60. That you needed at least 30 in each group to run a simple t-test. Anything less than that and you just report the raw numbers.
Similarly, it’s misleading to convert small group sizes to percentages when you report your data.
If you interviewed a dozen people and two of them said “we should be able to wear crocs to the office” you gotta say it plainly.
Two people have bad taste in footwear.
If you convert that to a percentage, it’s 16.7%! Which is large enough to make you think hey we should be paying attention to this data.
Tread carefully. And not in crocs.
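If you want a guardrail for this in your own reporting, here’s a minimal sketch. The function name `describe_share` and the cutoff of 30 are my assumptions for illustration (the 30 is borrowed loosely from that grad school rule of thumb), so pick whatever threshold fits your context.

```python
def describe_share(count: int, total: int, min_n: int = 30) -> str:
    """Report a raw count for small groups, a percentage plus count otherwise.

    min_n is an assumed cutoff, not an established standard.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= count <= total:
        raise ValueError("count must be between 0 and total")
    if total < min_n:
        # Small group: say it plainly.
        return f"{count} of {total}"
    # Larger group: a percentage is fair, but keep the raw numbers visible.
    return f"{count / total:.1%} ({count} of {total})"


print(describe_share(2, 12))    # 2 of 12
print(describe_share(50, 300))  # 16.7% (50 of 300)
```

Same 16.7% in both cases, but only one of them has earned the percent sign.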
Your data is pointless.
If you can’t come up with a solid takeaway idea about your data, don’t put it in a graph.
For example this graph:
Is just a graph of American cities by their first letter. There really isn’t a point.
Here’s an even more obvious example:
Don’t waste people’s willingness to extend you a tiny bit of their attention. Because people are going to look at your graph. It’s our nature.
So if you show them a graph that’s pointless, they quickly conclude you don’t have anything useful for them and they jet.
Visualizing pointless data isn’t visualizing responsibly. Make a point. Or just move on with your life.
This isn’t the same thing as your data being a flat line, with no change. That can absolutely be a point!
Your methods were so weak you risk causing bad decisions.
Look, I get it. The data source you thought was solid gold appears to have some duplicated values, large chunks of missing data, and no one is answering the help line.
Visualizing and publishing that data without being honest about its murkiness isn’t responsible.
I’ve totally been there! The client still wanted to see what I had. So I developed an icon system akin to your phone’s signal strength:
And I placed this next to the findings so my client could take this into consideration, knowing we did the best we could with the data available.
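If you’d like to try something similar, here’s one rough way to turn data quality into a number of bars. This is a sketch, not the scoring I used for that client: the names `signal_bars` and `render_bars`, the four-bar scale, and the rule that a row counts as clean only if it has no missing values and isn’t a repeat are all assumptions for illustration.

```python
def signal_bars(rows, max_bars=4):
    """Score a dataset from 0 to max_bars based on its share of clean rows.

    A row is treated as clean if it has no None values and is not an
    exact repeat of an earlier row (an assumed definition).
    """
    if not rows:
        return 0
    seen = set()
    clean = 0
    for row in rows:
        key = tuple(row)
        has_missing = any(value is None for value in key)
        is_duplicate = key in seen
        seen.add(key)
        if not has_missing and not is_duplicate:
            clean += 1
    # Round down so the data never looks better than it is.
    return max_bars * clean // len(rows)


def render_bars(bars, max_bars=4):
    """Draw filled and empty bars as text."""
    return "▮" * bars + "▯" * (max_bars - bars)


rows = [("a", 1), ("a", 1), ("b", None), ("c", 3)]
print(render_bars(signal_bars(rows)))  # ▮▮▯▯
```

Two of the four rows are clean here, so the finding gets two bars out of four.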
When we visualize data, we’re responsible for showing it in the most honest light possible.
It doesn’t matter how gorgeous you can make the garbage look, it’s still garbage.
Did you think of other ways data is visualized irresponsibly? (Amanda Makulec wrote up some ideas related to visualizing COVID data.)
If you’re suddenly feeling the ick about some of your own work, don’t beat yourself up or turn in your data visualizer ID card. Let’s just keep creating ways we could do it more responsibly.