WHOA THERE: This checklist is now a website where you can upload your image and we will walk you through each checkpoint, helping you rate yourself.
Tweaked and clarified, here is the updated Data Visualization Checklist.
Ann and I adjusted 5 of the checkpoints.
Text size is hierarchical and readable
Titles are in a larger size than subtitles or annotations, which are larger than labels, which are larger than axis labels, which are larger than source information. The smallest text – axis labels – is at least 9 point font size on paper, at least 20 on screen. We updated this description to talk about what size goes where.
Here’s an example:
Both text size and use of shades of gray indicate the hierarchy of information, with the title being in the largest and darkest and the source information the lightest and smallest.
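If you set font sizes in code, a quick check can catch a broken hierarchy before the graph goes out. This is only a rough sketch: the role names, the `check_text_sizes` function, and the sample sizes are made up for illustration, and it assumes the 9 point / 20 point minimum applies to whatever text is smallest.

```python
# Order of text roles from largest to smallest, following the checkpoint.
HIERARCHY = ["title", "subtitle", "label", "axis_label", "source"]

# Minimum size for the smallest text, in points (from the checkpoint).
MIN_SIZE = {"paper": 9, "screen": 20}


def check_text_sizes(sizes, medium="paper"):
    """Return a list of problems; an empty list means the sizes pass."""
    problems = []
    # Each role must be strictly larger than the role that follows it.
    for bigger, smaller in zip(HIERARCHY, HIERARCHY[1:]):
        if sizes[bigger] <= sizes[smaller]:
            problems.append(f"{bigger} should be larger than {smaller}")
    # Assumption: the minimum applies to whatever text is smallest.
    if min(sizes.values()) < MIN_SIZE[medium]:
        problems.append(f"smallest text is under {MIN_SIZE[medium]} pt for {medium}")
    return problems


# Hypothetical sizes for a printed handout.
paper_sizes = {"title": 20, "subtitle": 16, "label": 12, "axis_label": 10, "source": 9}
print(check_text_sizes(paper_sizes, "paper"))   # passes on paper
print(check_text_sizes(paper_sizes, "screen"))  # flagged as too small for a screen
```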
Labels are used sparingly
Focus attention by removing the redundancy. For example, in line charts, label every other year on an axis. Do not add numeric labels *and* use a y-axis scale, since this is redundant. We updated this description to discuss how one should choose either gridlines or labels. A terrifically bad example:
So much redundancy happening in this graph, but Mrs. Glosser definitely doesn’t need a y-axis with all those gridlines and the exact number labels on each marker. If the exact values are important, directly label. If the overall pattern is sufficient, use the y-axis.
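For anyone building charts in code, here is one way the "label every other year" idea might look. Treat it as a sketch: `every_other_label` and `labeling_choice` are hypothetical helper names, and the years are invented.

```python
def every_other_label(values, step=2):
    """Keep every `step`-th label and blank out the ones in between."""
    return [str(v) if i % step == 0 else "" for i, v in enumerate(values)]


def labeling_choice(exact_values_matter):
    """Pick one approach, never both: direct labels or a y-axis scale."""
    return "direct labels" if exact_values_matter else "y-axis scale"


years = list(range(2010, 2017))
print(every_other_label(years))
print(labeling_choice(exact_values_matter=True))
```

The blank strings keep the positions on the axis intact while dropping half of the text, so the spacing of the chart does not change.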
Proportions are accurate
A viewer should be able to measure the length or area of the graph with a ruler and find that it matches the relationship in the underlying data. Y-axis scales should be appropriate. Bar charts start axes at 0. Other graphs can have a minimum and maximum scale that reflects what should be an accurate interpretation of the data (e.g., the stock market ticker should not start at 0 or we won’t see a meaningful pattern). We updated this description to address the y-axis debate.
This example should have an axis that starts at zero. (Chris Lysy calls this the Cable News Axis.)
But in this example, a y-axis that starts at zero would wash away all ability to interpret what’s important:
In cases like the stock market, zero would never be within the set of possible values (god forbid) so it doesn’t make sense to include it in the axis. For your own data, you might consider a minimum based on historic lows and a maximum based on your goal.
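The axis rule can be written down as a small helper if you set limits in code. This is a sketch under assumptions, not a definitive rule: `y_axis_limits`, its parameters, and the sample numbers are all hypothetical.

```python
def y_axis_limits(chart_type, values, historic_low=None, goal=None):
    """Suggest (lower, upper) y-axis limits for a chart."""
    if chart_type == "bar":
        # Bar charts start at zero so bar lengths stay proportional.
        lower = 0
    else:
        # Other graphs: use the historic low if given, else the data minimum.
        lower = historic_low if historic_low is not None else min(values)
    # Use the goal as the maximum if given, else the data maximum.
    upper = goal if goal is not None else max(values)
    # Widen the limits if needed so no data point falls outside them.
    return min(lower, min(values)), max(upper, max(values))


print(y_axis_limits("bar", [40, 55, 62]))
print(y_axis_limits("line", [17200, 17450, 17380], historic_low=17000, goal=18000))
```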
Axes do not have unnecessary tick marks or axis lines
Tick marks can be useful in line graphs (to demarcate each point in time along the x-axis) but are unnecessary in most other graph types. Remove axis lines whenever possible. We updated this description to say axis lines should be removed when possible.
Check out how this example uses no axis lines at all, which produces a cleaner graph:
A vertical axis line would have put an unnecessary division between the category label and its data, which doesn’t make sense. A horizontal axis line would have been useless.
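If you happen to chart in Python, the same cleanup could be captured as a settings dictionary. This is a sketch: `axis_style` is a hypothetical helper, the keys follow matplotlib's rcParams naming, and the choice to keep ticks only on the time axis of line graphs is my reading of the checkpoint.

```python
def axis_style(chart_type):
    """Return axis settings: no axis lines, ticks only where they help."""
    # Ticks mark points in time on line graphs; other graph types skip them.
    show_time_ticks = chart_type == "line"
    return {
        # Remove all four axis lines (called spines in matplotlib).
        "axes.spines.left": False,
        "axes.spines.bottom": False,
        "axes.spines.top": False,
        "axes.spines.right": False,
        # Tick marks along the bottom (time) axis and the left axis.
        "xtick.bottom": show_time_ticks,
        "ytick.left": False,
    }


print(axis_style("bar"))
print(axis_style("line"))
```

With matplotlib installed, something like `plt.rcParams.update(axis_style("bar"))` before plotting should apply these settings to every new chart.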
Graph has appropriate level of precision
Use a level of precision that meets your audience’s needs. Few numeric labels need decimal places, unless you are speaking with academic peers. Charts intended for public consumption rarely need p values listed. We updated this description to talk about when to use decimals or show p values.
This example from Brookings was posted to their blog, meaning the audience was a public one:
But note the asterisks with the 3 different levels of p. Few people in a public audience even know what p means and those people don’t care about the 3 different levels. Now, Brookings’ economics peers will care a lot about that sort of thing (but not the stupid sun in the background). Level of precision is audience dependent. (And don’t get me started on standardized coefficient value).
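To make "audience dependent" concrete, here is a small sketch of how labels might be formatted by audience. The audience names, the decimal counts, and both helper functions are assumptions for illustration, not part of the Checklist.

```python
# Assumed defaults: whole numbers for the public, two decimals for peers.
DECIMALS_BY_AUDIENCE = {"public": 0, "academic": 2}


def format_value(value, audience="public"):
    """Format a numeric label with the precision the audience needs."""
    decimals = DECIMALS_BY_AUDIENCE[audience]
    return f"{value:.{decimals}f}"


def show_p_values(audience):
    """Only academic peers get p values and their asterisks."""
    return audience == "academic"


print(format_value(41.7, "public"))
print(format_value(3.14159, "academic"))
print(show_p_values("public"))
```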
We have some super fun developments of the Checklist coming in the future but for now, download the updated Data Visualization Checklist and let it guide your graph development.