Sunday, 10 July 2016

I usually like the BBC's data visualisations - but then this happened...

The BBC website generally provides excellent examples of how to present information. I have learnt a lot from studying material produced by the BBC. So imagine my disappointment when I encountered the following today ( http://www.bbc.co.uk/sport/formula1/36671922):

The first issue is why the BBC felt that it was necessary to have a chart at all. There are three drivers having won 5, 4 and 3 times respectively. A simple table, such as the following, would have conveyed this information more clearly

Aside form the poor use of font weight, the BBC chart has managed to get the drivers' names to align vertically, making them hard to read. Choosing a horizontal bar chart, rather than a vertical column chart, would have avoided this. For example:

I think the graph above is entirely clear. But if the BBC needed to add some more visual interest, they could have added further embellishments, such as using a pictograph:

This is quite easy to do with Excel. The example above took about ten minutes, most of which was creating a suitable image file. With a little more time and care it could be made much more aesthetically pleasing, although for some reason these types of graphs always make me think of the 1950s.

Perhaps later today the BBC may update their graphic - maybe on aesthetic grounds - but preferably also because the data will have changed!

Sunday, 3 July 2016

Creating non-standard charts in Excel: shadow bars

There are types of chart which are not offered as part of Excel's standard library but which can be produced with a little imagination and perseverance, This short article explains how to produce a "shadow bar" chart such as in the following illustration:

In dashboards, this type of chart might be useful for summarising a "balanced scorecard" where individual indicators are grouped under thematic headings. Current performance is shown in the RAG total bars. Muted colours are then used to show the previous values so that relative change can be gauged quickly

Although not offered as part of the standard set of templates, this type of graph can be produced in Excel. One way to is to produce two separate charts, make one transparent and position it carefully over the other.

Step-by-step

Set out the data (noting the layout of rows and columns):

Highlight the first block ("Latest") and insert 100% Stacked Bar graph . Depending on Excel version and settings, this will produce something like the following (Excel 2016):

Make the following changes (exactly how will vary between Excel versions):

Get rid of the Chart Title
Get rid of the Legend
Get rid of Horizontal Axis Labels
Get rid of the Horizontal Axis Gridlines
Get rid of Chart Border line
Increase the font size of the series labels
Format Data Series Colours to Red,Orange and Green to better reflect RAG

You will have produced something similar to the following:

Now create a copy of this entire graph. This will be used for the "Previous" shadow values. Working on the copy, make the following changes:

Change the Data Source to the alternate data set
Format Data Series Colours to muted shadow colours (greys have been used here)
Format the Colour of the Series labels to White (or whatever other background colour)

You will have produced something similar to the following:

Next, change the Chart Area of the original ("Current") graph to No Fill. It can now be positioned over the top of the shadow graph and the positions fine tuned to given the desired effect.

Various refinements and embellishments can be added but beware of making the final product too "busy".

Monday, 30 May 2016

What do we mean by “dashboards”?

The word “dashboard” originally referred to the board at the front of a horse drawn carriage which shielded the occupants from mud spattered (“dashed”) up by the horses’ hooves.

The dashboard was retained in horseless carriages. It became a convenient place to fix instruments so developed a new meaning as an alternative to “instrument panel”

The word “dashboard” was adopted by business information specialists somewhere around the early 1990s through the analogy between a car dashboard and a collection of “controls” for monitoring information.

The information use of the term “dashboard” became increasingly widespread from around the late 1990s. Many system suppliers took the car analogy a little too literally. Developers tried to recreate car dashboard visual characteristics in information dashboard products. Some of these products were visually stunning whilst at the same time being relatively ineffective for their supposed primary purpose.

Such excesses have been reeled back in slightly by the efforts of visualization gurus such as Edward Tufte and Stephen Few

We are now at a point where widespread over-use of the word “dashboard” has almost stripped it of any meaning

Use of terminology varies but the following concepts can be distinguished:

A single screen (or page of paper) which summarises at a glance the current position of one or more key measures for an organization or enterprise: I call this a dashboard

A more extensive collation, perhaps showing trends or a detailed breakdown of one or more key measures for an organisation or enterprise, and quite likely including commentary: I call this a report

A collection of the key indicators for an organisation, sub-organisation or enterprise: I call this an indictor set

One or more overall scores can be collated to provide a summary of an indicator set: I call this a scorecard (the term “balanced scorecard” is widely used but for me to agree to use it I would need to have it explained how the balancing has been carried out)

A dashboard or report in which the user can obtain different versions of the information by entering choices: I call this an interactive dashboard or an interactive report

Monday, 28 March 2016

A four-layer design for an Excel-based dashboard

When producing dashboards and reports in Excel it is tempting to build them as a single entity. There are good reasons to take a more modular approach. The modules can be visualised as a series of connected layers;

In their book Excel Dashboards and Reports, Michael.Alexander and John Walkenbach recommend a data model with three layers: data, analysis and presentation.The "analysis" layer is seen as a staging area.

The four level model set out here is similar, but a separation is made been calculation and collation.

Where possible it is best to avoid having any data calculations in the Presentation Layer, This allows alternative versions of the Presentation to use the same data with limited risk of introducing inconsistencies.

Calculations are made as close as possible to the source Data

Although the data model has four layers, this does not necessarily mean that it is constructed in four different files. The physical design will depend upon the size and complexity of the dashboard. It would be possible to have all four layers within a single Excel file.

Bibliography

Dashboards for Excel. by Jordan Goldmeier and Purnachandra Duggirala. Kindle Edition. APress 2015

Excel Dashboards and Reports, Michael.Alexander and John Walkenbach .

Tuesday, 1 March 2016

The Pie Chart and levels of consciousness

I have been through a succession of stages of understanding and enlightenment in respect of the pie chart. These can be seen as a progression through levels of consciousness (how pretentious does that sound?). Anybody for whom a discussion of pie charts would quickly lead to a state of unconsciousness, stop reading now - save this for your next bought of insomnia

First Level
I realised that the name referred to pie the food not to pi the strange number with lots of decimal places. Maybe I was unique in mistakenly assuming that the name had something to do with the properties of a circle.

Second Level

I realised that pie charts were a pain to draw. I was brought up in the Lower Palaeolithic, before the advent of spreadsheets. At school, drawing pie charts involved compasses (to draw circles) and protractors (to measure out angles). When I worked as a professional illustrator, I had to do all this but also very neatly and using ink. Differentiating the different slices involved very carefully ruled hatching, precise stipple, or sticking on pieces of expensive patterned film ("Letratone") cut precisely. In those days a spreadsheet was a major undertaking. You had to be really sure you wanted one.

Third Level
Spreadsheets. Along came Loutus 1-2-3 and then Excel. I was delighted to find that Excel can produce very quickly. Not only that, it offered aesthetically pleasing variations such as 3D and "exploding pie". I felt very grateful for the existence of computers. I recreated Manhatten using 3 D column charts.

Fourth Level
Having read the works of the data visualisation gurus - I realised, with a bit of an embarrassed sinking feeling, that I was supposed to deprecate pie charts. I realised that the exploding 3D charts made me look like a novice rather than an expert. I stopped using pie charts immediately. I scoffed at other people who still used pie charts and I felt superior.

Fifth Level
I realised that the pie chart can be good after all - if used in the right way. I felt stupid for not holding onto this when I jumped smugly to Level Four.

Sixth Level
It's a pie chart...but not as we know it. I then realised that there are other kinds of charts which look like pie chart but are really something else. I could not see any missiles directed at these from the data visulaistaion gurus, so concluded that they may be safe to use

Seventh Level
Is there a seventh level or beyond? I have not got there yet. I will let you know

Some Books

I have stumbled about in the information arena for many years. Every now and then a particularly noteworthy book comes to my attention. Here are a few:

How to Lie with Statistics by Darrell Huff. Originally published in 1954, with illustrations by Irving Geis. A revised edition was published by Penguin Books (1973) with illustrations by Mel Calman. This was one of the first books I ever bought and is still excellent today
http://www.amazon.co.uk/How-Lie-Statistics-Penguin-Business/dp/0140136290

Meaningful Graphs: Converting Data into Informative Excel Charts by James M. Smith (2014). I have seen a lot of books on Excel- and owned a quite few over the years. Many are bloated and not very interesting. This one is very good - and one of the few I've felt worth buying since I discovered good online resources, such as Chandoo.org.

http://www.amazon.co.uk/Meaningful-Graphs-Converting-Informative-Charts/dp/0986054909

Dashboards for Excel by Jordan Goldmeier and Purnachandra Duggirala (2015). Apress. A really useful book - it has tips and tricks that would take you years to discover for yourself
https://www.amazon.co.uk/Dashboards-Excel-Jordan-Goldmeier/dp/1430249447

Storytelling with Data: A Data Visualization Guide for Business Professionals by Cole Nussbaumer Knafic. Kindle Books
A really good book. It looks at systematically at the different aspects of communicating using graphs. Because I like the book, I can forgive the author for her overuse of the word "leveraging"

Some Books by Edward Tufte
http://www.edwardtufte.com/tufte/ebooks
Edward Tufte introduced the term “chartjunk” and the concept of data-ink ratio in his first book, The Visual Display of Quantitative Information (1983). The second edition was published in 2001 (ISBN 0-9613921-4-2).

His second book, Envisioning Information was published in 1990 Graphics Press, ISBN 0-9613921-1-8.(2001)

Visual and Statistical Thinking: Displays of Evidence for Making Decisions (1997) provides two fascinating case studies: the 1854 cholera epidemic and the 1986 Challenger space shuttle disaster

The Cognitive Style of PowerPoint: Pitching Out Corrupts Within (2006).

So that's two of us who don't like Powerpoint! Hooray!

Visual Explanations: Images and Quantities, Evidence and Narrative
Graphics Press (1997) ISBN 0-9613921-2-6.

Beautiful Evidence. Graphics Press, ISBN 0-9613921-7-7. 2006

Some Books by Stephen Few
If you are interested in data visualisation then it is just a matter of time before you encounter Stephen Few. As well as the books listed below, he also publishes a lot of very good short articles, available free, on his website at www.perceptualedge.com

Now You See It: Simple Visualization Techniques for Quantitative Analysis, Stephen Few, Analytics Press, 2009

Show Me the Numbers: Designing Tables and Graphs to Enlighten, Second Edition, Stephen Few, Analytics Press, 2012

Information Dashboard Design: Displaying data for at-a-glance monitoring, Second Edition, Stephen Few, Analytics Press, 2013. Although this book is now in its second edition, I only have the first edition at present.

Signal: Understanding What Matters in a World of Noise, Stephen Few, Analytics Press, 2015

Some Books to avoid?
There are some books in which I have been particularly disappointed. At the risk of upsetting their authors, the following are books which I would definitely not recommend:

Using Charts to Create Effective Dashboards by T.Doyle
Having got this electronically to read on a Kindle, I was dispapointed to find that this book is very short and is barely more than a list of types of chart. There is little on dashboard design at all. Most of the samples it provides are good examples of how not to use charts. This book seems even to have vanished from the Amazon Kindle store now, making my copy an endangered species - soon to be deleted.

MS Excel as a Graphics Design Tool by Clarence Bjorkmann
This was an interesting concept but in the end I was left wondering why on earth anybody would want to use Excel in this way when even Windows Paint could produce better results and there are freeware graphics tools available. Furthermore, the graphic design in the book itself is not impressive. £3.47 I wish I'd spent differently.

Tuesday, 9 February 2016

Losing your RAG

Variations of the Red-Amber-Green "RAG" performance rating system are used extensively.

A common version is really a two rating (Red-Green) system i.e. where all performance is marked as either pass or fail. When over-used on a table of numbers (not least with the common practice of colouring whole cells), this can produce a confusing and garish grid.

A proper three-grade RAG system, including the central Amber, allows more sophisticated ratings to be made. But the Amber rating can be used in two very different ways:

Fail but only just
Pass but close to fail

Either way, including Amber in the pattern of fully coloured cells just adds further to the over-whelming blast of colour that assaults the reader's brain.

So why, so often, do we end up with these types of reports and dashboards?

The whole cell colour aspect is probably mainly down to ease of production. It is much easier to produce this in say, Excel, than to use less over-whelming devices.

The extensive use of Green is more curious. It probably indicates a service area where performance has regularly been below target and the people concerned are feeling a bit battered. Lots of Red is demoralising so every opportunity is taken to proclaim the Green. The same motivation will probably lead to adopting the Amber "fail but only just" option. It removes more Red and it gives a feeling of "nearly there".

But before jumping into producing another one of these grids, the analyst should consider what the report or dashboard is actually trying to communicate. In the end, the main purpose has to be to highlight to managers when they need to make a decision or initiate action.

This being so, what is the purpose of the Green? There are lots of months where the target was achieved. So what? What action will we take? Apart, possibly, from a momentary congratulation - none. The Green is where we want to be. Colouring it in may give a sense of assurance, but it does not trigger any management action.

The Red rating is more obvious: we have not achieved the target and we need to do something.

What about Amber? A "fail but only just" statement is still a fail. We still need to do something. What we really need to know is where we met the target (Green) but only just. This is a situation where things may deteriorate and become Red. We may need to take action soon, or be pro-active and take action now, so we need to highlight this.

These considerations suggest that what would be best is a "RA" rating i.e. Red (fail) and Amber (pass but only just) system. Leave everything else un-highlighted on the grounds that there is no need to draw attention to it.

An example could look something like

where Red indicates that we missed the target and Amber indicates that we achieved the target but it was so close to the borderline that we may need to act, or at least keep a close eye on it.

This type of Red-Amber rating system can be developed further to make the Amber look more like a faint Red i.e to use a monochrome (Red-red) system in which tone conveys the meaning. Using this approach, together with some further adjustments to contrast, gives the following alternative:

This illustration uses exactly the same data as the grid above but is set out in a very different way. The idea is to adjust the contrast so that important information stands out and less important information melts into the background.

Attention is drawn to the latest position rather than to the history of previous results. Indicator 4, which missed the target is clearly highlighted by the Red marker. Two other indicators (5 and 7), which only just hit the target, are also highlighted, but in a lighter tone.

The compact visual summary of the last six results then provides additional detail. Indicator 5 has generally been ok so there is probably no need to over-react. Indicator 7 has a history of missing the target. We should continue to monitor it closely.

In conclusion, for dashboards and reports, it may often make sense to abandon the customary RAG systems in favour of a Red-Amber (RA) or Red-red (Rr) as described here.

Finally, another practical reason to abandon the RAG is to help the 10% of males, and 1% of females, who are colour blind.

References

3 Problems With Traditional KPI Traffic Lights by Stacey Barr. 20/05/2014

(link)

Dashboard Design for Real-Time Situation Awareness by Stephen Few. 2007 (link)

Excel: creating a 'Last Ten Results' indicator panel without using VBA by Edgar Bolton. B I Dashboard Design. 31/01/2016 (link)

Monday, 1 February 2016

Why month-based views are unsuitable for data which exhibits weekly patterns

One of the most commonly encountered formats for presenting data is a table of monthly totals, often accompanied by a graph.

One reason for the extensive use of this approach is that it is very familiar and is practically self-perpetuating. It almost suggests itself.

Another reason is that it is a relatively easy calculation to produce month based counts using commonly available tools such as Excel, Access, SQL. Producing weekly aggregate numbers from raw data is a slightly higher level of difficulty. Similarly, producing a graph in Excel from a table of monthly totals is almost effortless. Producing an SPC (Statistical Process Control) type run chart is a little less easy.

A monthly review makes perfectly good sense. It is a reasonable time interval against which to assess a progress. Much less than a month and we risk over-reacting to short term fluctuations; much more than a month and we may not give ourselves enough time to act.

Where things can go wrong is where a good time interval for review gets confused with a good time interval for analysis.

Unfortunately there are many situations where a month-by-month view masks, rather than highlights, the real patterns in the data. People presented with a month by month graph in which the numbers go up and down will feel that they are seeing the real changes in activity. They may be reassured if the numbers go the 'right' way and panic if the numbers seem to go in the opposite direction.

The following illustration shows a series of monthly counts of events:

The monthly totals time series shows numbers going up and down suggesting a pattern of changing activity.

Most readers will quickly spot that the pattern in the graph is nothing more than a representation of the number of days in each month. It has been built up from exactly 10 events each day throughout the entire period being studied. Nothing is changing at all.

The appearance of changing activity has been exaggerated by the default behaviour of Excel in truncating the vertical axis when creating graphs.

A way of compensating for this is to divide the total number of events in the month by the number of days in the month i.e. to calculate a daily rate. This is added on the illustration above. It shows that the monthly pattern of activity is actually static.

For analysis, the rule should be that where the period being studied is divided up into unequal time periods (such as months), it is better to use a rate than to use a simple period total. It is surprising how often this simple rule is not followed.

In many situations, this compensating technique (translation into rates) may not be sufficient. The next illustration shows a series of monthly counts of events. The daily rate has been calculated. The rate goes up and down, suggesting change:

However, here again, the impression of a changing levels of activity is an illusion.

The underlying pattern of activity is really static. It is a constant 28 events a week throughout the study period. For the purposes of illustration, the weekly pattern synthesised for the illustration has been strongly polarised (All Mondays have 7 events, Tuesdays 6 events, etc down to Sundays with 1 event).

The spurious pattern in the illustration results from the interplay between the consistent but uneven distribution within each week in combination with the disposition of weeks across month end boundaries.

Another regularly used 'compensatory' technique is to show a monthly total alongside the equivalent month from the previous year:

Surely this approach is valid as, with the exception of some Februaries, we are comparing the activity in exactly equivalent time periods, side by side?

Unfortunately, again, the appearance of changing activity is an illusion. The graph is derived from exactly the same static pattern of 28 events per week as was used in the previous illustration.

If this is what static data looks like through this framework, imagine how hard it would be to make sense of any real patterns of change.

In areas where there are strong weekly patterns, such as healthcare, viewing data through the distortion of a monthly framework makes it very difficult to see what is really going on.

So while monthly views work well for appropriate uses, such as showing cumulative income year-to-date, they will often be a poor choice for displaying and understanding patterns of event activity.

Unlike the month, the week is a more 'natural' time period which aligns well with patterns of behaviour. It is worth the extra effort of aggregating event data by week and looking at it in weekly run charts. The SPC method can be used to assess whether the variation seen reflects any real changes.

At the very least, a preliminary analysis will be needed to assess the strength of weekly patterns in the data before a decision can be made on how much 'noise' this is likely to create in a monthly time series .

Another technique used in reports and dashboards is to divide monthly totals of events by the number of 'working days' in a month. This can go some way towards making comparisons better. But it still does not compensate fully for a strong weekly pattern.

Also, it would be necessary to review each measure being reported on and assign an individual number of working days to it. This can vary from measure to measure. Some events would be expected to occur on weekends and bank holidays. Some services may operate for less than five days a week. It cannot be assumed that using the number of 'working days' worked by the analyst in the month is the correct denominator in every, or even any, case.

Sunday, 31 January 2016

Excel: adding interactive drill down to a 'Last Ten Results' indicator panel

A previous article outlined a simple method for creating a 'Last Ten Results' indicator panel in Excel (see link). This supplement shows how to add interactive drill down functionality (Excel skill level: advanced)

Another previous article had shown how to use Excel comments boxes dynamically to provide drill down details (see link). This method can be adapted to work with the last ten results indicator panel. There are a few technical challenges.

Reversing a linking formula
The cells coloured green, red and grey in the indicator panel read values from a separate data area. The way the data is structured (in this example) the drill-down needs to read from three adjacent rows in the data area, e.g. cells AF10, AF11, AF12. This is trickier than it first seems.

Some VBA code which can do this set out below. It is triggered by the Worksheet Selection Change event i.e.the code is run each time a different area on the worksheet is selected. So the code is run when any of the coloured bars are clicked on, displaying the appropriate details for the cell It is run again when any non bar cell is clicked on, removing the previously displayed details.

To make the code a little easier to follow it has been split into three blocks. The first block defines the memory variables that will be needed, clears any pre-existing comments from the worksheet, tests to see if the selected areas is withing the defined target area (i.e the whole indicator panel), and tests to see if it is a coloured results cell or simply the margins around the coloured results cell (requiring no drill down)

The next block of code records the coordinates of the cell where we want the drill down to be displayed and works out the coordinates of the topmost data cell relating to it

The final block collates the details from the data area into a cell comment and makes it visible. The details are produced by concatenating all the individual pieces, and using Chr(10) to force new lines within the comments box.

Inevitably, this VBA code could be written in a more compact and slick manner. But it works as is and runs quickly enough. Also note that the method of 'reversing' the link to identify the data cell is designed for cases where both sections are on the same worksheet. It would need further refinement if the data was located remotely

References

Excel: creating a 'Last Ten Results' indicator panel without using VBA by Edgar Bolton.
B I Dashboard Design. 31/01/2016 (link)

Excel: Using cell comments to provide "drill-down" details by Edgar Bolton
B I Dashboard Design. 05/08/2015 (link)

Excel: Using arrows to indicate direction of change in performance by Edgar Bolton
B I Dashboard Design. 26/07/2015 (link)

How to convert Excel column numbers into alphabetical characters. Microsoft. Article ID: 833402 - Last Review: 09/19/2011 00:01:00 - Revision: 5.0 (link)

Thursday, 28 January 2016

Excel: creating a 'Last Ten Results' indicator panel without using VBA

The BBC website has many good examples of how to display information. The following device is used to summarise a series of results.

The BBC use it to display football results in a League Table (see link). The concept is used more widely in Business intelligence dashboards. There are various approaches to creating this effect. This short article shows a simple way of constructing it using Excel without the need for any VBA coding (Excel skill level: Intermediate). The main idea behind the approach is to use an entire spreadsheet cell for each coloured bar. The colours are controlled by Conditional Formatting.

Step One
Set up a data area. Here, a block of ten successive results by the football team Leicester City F.C. is used. In the illustration, the data is set up on the same worksheet as the dashboard panel. In a real application, the data would probably be held separately.

Step Two
Each set of bars requires three spreadsheet rows. The middle row is for the coloured bar; the rows either side are clear margins. Similarly, additional columns are needed between each of the bar cells. Mark out the grid area by filling each bar cell with a temporary grey background. Then link each bar cell to the corresponding results in the data area.

Step Three
Add Conditional Formatting to the bars. This can be done to the full range as a single block. Three rules will be needed, one for each colour: In the example, "Win" is to display as Green; "Draw" as Grey and "Loss" as Red. Set the formats so that the font and the background colours are identical, within each rule.

Step Four
If desired, add a light background tint to the whole panel (and set the font colour exactly the same)

Step Five
Shrink the grid by setting column width and row height properties. In the example the following values are used: Bar row 3, margin rows 6, bar columns 1, interval columns 0.17. It may help to zoom to 200% while doing this.

Step Six
To add normal rows of text alongside the bars, merge the three rows in the area adjacent to the bars. This re-creates a more normal row height into which text can be added. When all the row and column adjustments have been made, switch off the grid-lines on the worksheet

It might seem fiddly to set up, but once one line has been created it can be cloned easily. Simply copy and then paste blocks of three rows at a time.

Enhancements
An additional feature on the BBC website displays the details of each result when the cursor hovers over each coloured bar. A way of adding this type of functionality in Excel (using VBA) is set out in a supplementary article (see link)

References

Excel: adding interactive drill down to a 'Last Ten Results' indicator panel by Edgar Bolton. B I Dashboard Design. 31/01/2016 (link)

Excel: Using cell comments to provide "drill-down" details by Edgar Bolton
B I Dashboard Design. 05/08/2015 (link)

Excel: Using arrows to indicate direction of change in performance by Edgar Bolton
B I Dashboard Design. 26/07/2015 (link)