What makes for good data visualisation?


Data visualisation is a hot topic at the moment. And with good reason: a picture paints a thousand words … and the better ones can convey clearer meaning than a similar volume of numbers.

There is also an ever-growing list of charts and infographics, both in the public domain and in research deliverables. They are not totally new, of course, as the Guardian newspaper demonstrated a while back with some historical examples. However, not all of today’s visualisations achieve their analytical or informational objectives. For data visualisation to be effective, it is important that we keep sight of some long-held principles about the visual display of data. I’d like to reprise them through three thought leaders in this area who each have different backgrounds but who share (and propagate) very similar core principles.

Edward Tufte combines the creative and the analytical as a ‘data theorist and sculptor’. He is the author of The Visual Display of Quantitative Information first published in 1983 (he also wrote a famous article for Wired magazine in 2003 entitled PowerPoint is evil). His list of the key attributes for graphical excellence includes:

  • Make large data sets coherent
  • Encourage the eye to compare different pieces of data
  • Be closely integrated with the statistical and verbal description of a dataset

Leland Wilkinson is another thought leader in this area. Lee spans academia and commercial software as well as computer science and statistics. He is the founder and original developer of the SYSTAT statistical package, where his philosophy was always to be graphical before being tabular.

Lee has collaboratively developed a theory known as the Grammar of Graphics. A key tenet of Lee’s approach is that we shouldn’t constrain ourselves to think of statistical visualisation in terms of the familiar typology (pie, bar, scatter, etc.). Rather, we can look to build graphs from lower-level elements and shapes. Lee’s approach is implemented as GPL in SPSS and as ggplot2 in R. To get the most out of it you need to do some programming, but there are some more accessible user interfaces, including the SPSS Visualization Designer.
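To make the idea of building graphs from lower-level elements a little more concrete, here is a minimal sketch in Python. It is purely illustrative: the `Spec` class, the `describe` function and the sales figures are all invented for this example, and none of it is the GPL or ggplot2 API. It simply describes a chart as a combination of data, a mapping of variables to visual properties, a geometric element and a coordinate system, and it draws nothing.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Spec:
    """A chart described by its parts rather than by a chart-type name.

    Hypothetical structure for illustration only; not a real library's API.
    """
    data: tuple                  # rows of data, each row a dict
    mapping: dict                # visual property -> column name
    geom: str                    # geometric element, e.g. "bar" or "point"
    position: str = "identity"   # how elements are arranged, e.g. "stack"
    coord: str = "cartesian"     # coordinate system, e.g. "polar"


def describe(spec):
    """Return a one-line, human-readable summary of a chart specification."""
    aes = ", ".join(f"{k}={v}" for k, v in sorted(spec.mapping.items()))
    return f"{spec.geom}({aes}) position={spec.position} coord={spec.coord}"


# Made-up figures, used only to have something to map.
rows = (
    {"region": "North", "sales": 40},
    {"region": "South", "sales": 60},
)

# A single stacked bar: bar elements, filled by region, stacked on each other.
stacked_bar = Spec(
    data=rows,
    mapping={"y": "sales", "fill": "region"},
    geom="bar",
    position="stack",
)

# In grammar terms a pie chart is the same stacked bar in polar coordinates,
# so only the coordinate system changes.
pie = replace(stacked_bar, coord="polar")

print(describe(stacked_bar))
print(describe(pie))
```

The point of the sketch is that "pie" never appears as a chart type. Swapping one component of the description produces a different graphic from the same data and the same mapping, which is the kind of flexibility the familiar typology hides.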

Professor Hans Rosling has become a celebrity statistician (thanks in no small part to his BBC TV documentary “The Joy of Stats”). On his Gapminder website you can see his dynamic bubble plot, which charts life expectancy against income per person (GDP per capita, adjusted for purchasing power parity) by country since 1800. This is a visualisation that manages to strike a difficult balance between cleverness and clarity.

The consistent thread among these three luminaries is that graphics should be clear and appropriate. A good graphic will allow the consumer to see the point and to draw their own analytical conclusions without any additional help. Many examples try to be too colourful and “clever” for their own good. Good graphics are all about clarity, whereas bad graphics just obfuscate.