The Lines section of the Data Visualization Checklist helps us enhance reader interpretability by handling a lot of the junk, or what Edward Tufte called the “noise” in the graph. I’m referring to all of the parts of the graph that don’t actually display data or assist reader cognition. Create more readability by deleting unnecessary lines.
The default chart, on the left, has black gridlines. These stand out quite a bit because of how well black contrasts against the white chart background. But the gridlines shouldn’t be standing out so much because they are not the most important part of the graph (the data is! Or the data are! Whichever way you stand on the is/are debate, I still love you).
The revised graph, on the right, is more appropriate. I changed the gridline color to light gray. The gridlines are still visible, to help with interpreting the values of the data, but the gray color relegates them to the background, playing a supporting role, where they belong.
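If you build your charts in code rather than in a spreadsheet, the same fix is a one-liner. Here is a minimal sketch in Python with matplotlib (the library choice is my assumption, and the categories and values are made up for illustration). It turns on horizontal gridlines only, colors them light gray, and pushes them behind the bars.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so no display is needed
import matplotlib.pyplot as plt

# Hypothetical data, for illustration only
categories = ["A", "B", "C", "D"]
values = [12, 19, 7, 15]

fig, ax = plt.subplots()
ax.bar(categories, values, color="steelblue")

# Horizontal gridlines only, in light gray instead of a high-contrast black
ax.yaxis.grid(True, color="lightgray", linewidth=0.8)

# Draw the gridlines behind the bars so the data stay in front
ax.set_axisbelow(True)
```

Call `fig.savefig("chart.png")` (any file name you like) to write the result to disk.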
You wouldn’t keep these gridlines at all if you were to add data labels to each data point in the graph. If you add data labels, you have to delete your y-axis and the gridlines. Otherwise, you end up with redundant encoding and clutter. Also, let me be clear on this point: since this graph has a y-axis, the gridlines are necessary. I see cases where people hear me say “delete unnecessary lines” and they take out the gridlines, but when you do that, readers have a hard time estimating the values in the graph. Gotta keep the gridlines if you have a corresponding axis.
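Here is what the data-label route might look like in code. This is a sketch, assuming Python with matplotlib 3.4 or later (for `bar_label`) and made-up values: once each bar carries its own label, the y-axis and the gridlines come out.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so no display is needed
import matplotlib.pyplot as plt

# Hypothetical data, for illustration only
categories = ["A", "B", "C", "D"]
values = [12, 19, 7, 15]

fig, ax = plt.subplots()
bars = ax.bar(categories, values, color="steelblue")

# Label each bar with its value (requires matplotlib >= 3.4)
labels = ax.bar_label(bars, padding=3)

# The labels now carry the values, so the y-axis and gridlines are redundant
ax.yaxis.set_visible(False)
ax.spines["left"].set_visible(False)
ax.grid(False)
```

Pick one or the other: labels on the data, or an axis with gridlines. Both at once is the redundant encoding described above.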
Other UNnecessary lines include the border, any tick marks, and any axis lines. Delete, delete, delete. It feels good.
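For the code-inclined, the delete, delete, delete step could look something like this. It is a sketch in Python with matplotlib (my assumption about tooling, with made-up data): hide the border and axis lines, which matplotlib calls spines, and shrink the tick marks to nothing while keeping the tick labels.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so no display is needed
import matplotlib.pyplot as plt

# Hypothetical data, for illustration only
categories = ["A", "B", "C", "D"]
values = [12, 19, 7, 15]

fig, ax = plt.subplots()
ax.bar(categories, values, color="steelblue")

# Remove the border and axis lines (all four spines)
for spine in ax.spines.values():
    spine.set_visible(False)

# Remove the tick marks but keep the tick labels
ax.tick_params(axis="both", length=0)

# Keep light gray gridlines, since the y-axis labels are still there
ax.yaxis.grid(True, color="lightgray")
ax.set_axisbelow(True)
```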
I know, I know: The most annoying thing about this graph is that it attempts to plot age and grade in the same space! The graph has two horizontal axes, and each of those has its own y-axis, on two different scales. What goes with what? So confusing. Yet so common! People usually end up here because they want to show the relationship between two variables. The graph authors think of it as an attempt at clarity, but it actually adds more confusion. A better option is often to show both variables, just side by side.
Breaking the data apart makes it easier to interpret each variable, puts them both in appropriate graph types, and still allows for some basic comparisons. Read this post for other alternatives to a dual y-axis chart.
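A side-by-side layout is easy to sketch in code. The example below assumes Python with matplotlib, and the age and grade counts are invented purely for illustration (they are not the data from the graph above). Each variable gets its own panel and its own y-axis, so nobody has to guess what goes with what.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so no display is needed
import matplotlib.pyplot as plt

# Hypothetical counts, for illustration only
ages = ["14", "15", "16", "17"]
age_counts = [30, 45, 40, 25]
grades = ["9th", "10th", "11th", "12th"]
grade_counts = [35, 42, 38, 25]

# One row, two panels: each variable gets its own chart and its own scale
fig, (ax_age, ax_grade) = plt.subplots(1, 2, figsize=(8, 3))

ax_age.bar(ages, age_counts, color="steelblue")
ax_age.set_title("Age")

ax_grade.bar(grades, grade_counts, color="steelblue")
ax_grade.set_title("Grade")

# Apply the same cleanup to both panels
for ax in (ax_age, ax_grade):
    ax.yaxis.grid(True, color="lightgray")
    ax.set_axisbelow(True)
    for spine in ax.spines.values():
        spine.set_visible(False)
    ax.tick_params(axis="both", length=0)

fig.tight_layout()
```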
Choosing the right chart type and eliminating all of the extra noise from the data display allow these graphics to clearly show the story in your dataset.
Test your graph on its clarity using the Data Visualization Checklist.
I talk about this topic and a whole lot more in Chapter 2 of Presenting Data Effectively. Check it out.