Tuesday, 7 November 2017

Doubled bar chart labels in Excel

The following bar chart was published on the BBC website

























Not withstanding the somewhat gruesome subject matter, my interest was caught by the labelling of the categories. The labels had both the location and the year, with the year in bold and on a second line.

I wondered whether it was possible to achieve this effect using Excel

Forcing the year onto a fresh line can be achieved by inserting a "hard" carriage return. (ALT and ENTER keys at the same time)

Jon Peltier provides a method of adding and then formatting the category labels using a second (dummy) data series. https://peltiertech.com/individual-format-category-axis-labels/. The alignment of the labels can be set to right




I have not yet found a way to selectively apply Bold formatting to the text in the label



Sunday, 5 November 2017

Creating a Fan Chart in Excel



Mary Eleanor Spears provides an example of a ‘Fan Chart’ in her book 'Charting Statistics’ (McGraw Hill Book Company. 1952):
























The Fan Chart is in effect a variation of the Slope Chart. It shows relative change rather than absolute change. So all points on the Left Hand side start on zero.

A version can be produced relatively easily in Excel:






















This version was made from the following starting data:




















The main steps were:

Create a basic line chart from the data
Select Data Source: Switch Row/Column
Delete legend
Delete header
X Axis: set  label position high
Format each data series line colour and thickness
Format Y axis tick marks and line
Format Y axis so that X axis crosses at zero
Format Y axis maximum value to 10
Delete all gridlines
Set X axis position on tick marks
Add data labels (to right)
Delete all LH data labels (carefully so as not to delete RH at same time)
Format each RH label with

  • Series name not value
  • No leader lines
  • Position to Right side

One way to add the shading outside of the data range is to create a second chart and overlay the fan chart. Create a clone of the chart and turn that into an area chart, being careful not to change the size at all. Turn the plot area of the clone grey and turn the fill of the wedges within the data area white

Then copy the line version of the fan chart and position it over the area version (making the top chart see-through by use of 'No Fill') settings). Here is a screenshot of the effect:






Thursday, 2 November 2017

Slope Graphs

The slope graph - or slope chart- is a special type of line chart. It is particularly useful for showing multiple  ‘before and after’ comparisons. It may be either based on rankings or on actual scores.

The slope graph is often said to have been devised by Edward Tufte (he published an example in his ground breaking 1983 book 'The Visual Display of Quantitative Information'). Tufte developed and promoted the idea but did not 'invent' it as such.

A slope graph was published in another book on the visual display of quantitative Information,  69 years before Tufte:




(part of Figure 63, p.65, from 'Graphic Methods for Presenting Facts' 1914 by Willard C Brinton - revised edition 1919)

Another example, more modern and also pre-Tufte, was published by Mary Eleanor Spears in her book 'Practical Charting Techniques'.(McGraw Hill. 1969). She describes it as a Ranking Chart:







Unlike the busy 1914 example, focusing on the change between just two points allows us to read easily. The gradient of the connecting line quickly gives both direction and magnitude. This is a considerable advantage over illustrating with a simple table


Edward Tufte has posted more examples on his website: https://www.edwardtufte.com/bboard/q-and-a-fetch-msg?msg_id=0003nk&topic_id=1 including possibly the earliest ever slope graph (from  Scribner's Statistical Atlas of the United States,1883)

Both Jon Peltier (https://peltiertech.com/slope-graphs-in-excel/) and Stephanie Evergreen (http://stephanieevergreen.com/slopegraph/) have provided step by step details of how to create a slope chart in Excel. The latter is clearer, and is also covered her book 'Effective Data Visualization: The Right Chart for the Right Data'.

Using these instructions, together with a tip from Chandoo, enables us to create an Excel version of Mary Eleanor Spears 1960s original:




Producing this chart in Excel was slow, but not difficult. The labels on the left-hand side were formatted using Stephanie Evergreen's method (making Series and Value visible then omitting the comma between the two).


The following data layout was used to create the Slope Chart in Excel:

























Column D is used solely for the Right-Hand labels. It is not included in the ‘Select Data’ range

I could not find a way to reverse the order of Value and Series using Stephanie Evergreen's method. Chandoo pointed out that it is possible to set the value of individual labels using a formula. So, the solution for the right-hand labels is to create the label in another spreadsheet cell and then point each label to the specific cell

This method is quicker than the Evergreen method. Unfortunately, it does not work for the Left-Hand labels – if you want to reuse the chart with refreshed data - as the order is volatile.

A final step might be to format one feature a different colour or intensity to highlight it.

The Slope Chart allows a lot more series to be compared in a single chart than would be sensible with a regular line chart. But what is the upper limit for a Slope Chart? That will depend on the way that you want to use it. Take the following example which compares 44 series in a single chart:  



 From a conventional dashboard perspective, this is now too much to fit with ease on a single screen. So, it will not work easily in an Excel dashboard or PowerPoint slide. But it still works if printed on paper. And it could work in a single screen if set up with a window to roll over the full chart, displaying a portion at a time.

Another approach would be to simplify by removing the labelling








Sunday, 1 October 2017

Making an 'opposing half circles' infographic in Excel

The BBC website has published an article about F1 driver Lewis Hamilton choosing a vegan diet http://www.bbc.co.uk/sport/formula1/41416402. It contains the following 'infographc' illustration:

















How easy is it to produce illustrations like this using Excel?

One way is to produce two separate pie charts and then overlap them. To do this you need to make each pie chart from two equal values - to get the exact halves. Then set the format of the 'unwanted' half (Data Point) to Fill: No fill and Border: No Line. Apply the same settings to the overall Chart Area too.


The larger side will need to be made appropriately bigger. Without measuring the original, I cannot tell if the areas of each half are in exact proportion to the two numbers (if not we will be in big trouble with the data visualisation gurus). To prove the idea (and since it is a Sunday morning and I have only had one cup of coffee) I have simply enlarged it by eye by dragging the corner of the Chart Area. It would be possible to calculate the exact chart size




To create the text blocks, simply put each element of text on a separate row, select appropriate font settings, then left or right align accordingly:










To assemble the chart, work from the left had side and position the two transparent half pies. Adjust a column width to align the right hand text block.

















Then get rid of the grid lines
















The only things to add are the white person and F1 car icons in the two halves. This is a bit of a challenge. You would need to find or create white shapes with a clear background

In Excel 2016 you can draw these on using a Pen from the Draw menu (set ink to white)











It is probably easiest to draw with the View set to a high Zoom

















If you set Convert Ink to Shape then you can create movable and resizable objects.

OK, I think my man would probably benefit from Lewis Hamilton's vegan diet....but with a little more time we could produce anything in this way - should you like this kind of chart




Thursday, 23 March 2017

Familiarity breeds contempt: the pie chart

William Playfair is generally credited with inventing the pie chart, publishing a variety in 1801 [See Note 1]. The following is a much used illustration, taken from a larger more complex chart:




















The name “pie chart”, which may date back to the 1920s, derives from the resemblance of the chart to a pie divided into portions. In France it became known as “le camembert” through its resemblance to the shape of the cheese.

The pie chart requires segments (“slices”) to be constructed exactly. The angle of each segment must be calculated to reflect its proportion of the whole. Before the advent of computers,  drawing a pie chart in ink could be  tricky involving the use of a protractor and a beam compass. Fortunately, these days, most analysts will never have to attempt this.

Modern computing allows pie charts to be constructed in a matter of seconds. Excel 2016 provides the following types as standard:
  • Pie
  • Pie of pie – one segment is subdivided in a second pie chart
  • Bar of Pie – one segment is subdivided in a second chart, bar not pie
  • 3D pie – a “pleasing” 3 dimensional view
  • Doughnut (or “Donut”) – like a pie but with the middle removed
These can be used with various effects, including exploded slices. 

If its origins can be traced back over 200 years, criticism of the pie chart can be traced back over one hundred years:

“the circle with sectors is not a desirable form of presentation" [See Note 2]


Many people today, including some prominent opinion formers, deprecate the use of pie charts. This criticism is echoed widely [See Note 3]. A person newly embarking on a career as an analyst would be forgiven for developing a view that using a pie chart would be a declaration of incompetence.

Certainly there are abundant examples of pie charts being used badly.  But does it follow that every possible use of a pie chart could be replaced favourably by alternatives?


The two-slice pie chart


The readability of a pie chart generally declines as the number of slices is increased. The pie chart probably works best when there are only two slices. 

As an illustration, the following simple chart shows the result of the recent UK referendum on EU membership (the "Brexit" vote):




















This example chart was produced quickly in Excel, with a few adjustments from the default settings. While this is far from spectacular, it works as an illustration of the result and its "narrowness". The colour scheme is based on that used by the two sides in the referendum campaign, rather than being based purely on aesthetics.

It could be argued that a two-number result is a relatively simple situation so may not need an illustration at all.  But the underlying concept of a dashboard - to provide information "at a glance" - suggests that the overall result of a referendum is likely to be the most important piece of information and should therefore be given prominence. 

So how does the two-slice pie chart compare with some simple alternatives?

Alternative: Simple Free text statement

51.9% voted Leave and 48.1% voted Remain

Is this better than use of a two-slice pie chart? It conveys the information in a compact manner, but it lacks impact and emphasis. The reader has to perform a mental comparison having assimilated the numbers. 

Alternative: Free text statement with additional emphasis


51.9% voted Leave and 48.1% voted Remain
Is this better than use of a two-slice pie chart? It has more impact and emphasis but it still lacks something. It is idiosyncratic and one step towards becoming something like a  "word cloud" or "call out". The reader still has to perform a mental comparison.


Alternative: Table


Outcome Votes Percentage
Leave 17410742 51.9%
Remain 16141241 48.1%

Is this better than use of a two-slice pie chart? It is compact and provides all the information. But there is no impact or emphasis and the reader still has to perform a mental comparison.



Alternative: Simple column graph















Is this better than use of a two-slice pie chart? It provides much of the same impact and emphasis. But (as set out here at least) it lacks overall coherence. It could also be argued that the pie chart perhaps gives a better impression of the interdependence (and reciprocal "swing")  between the two parts. It it hard to argue that the column chart above is vastly superior to the two-slice pie.


Alternative: Simple bar graph











Is this better than use of a two-slice pie chart? It is really just a "tipped over" version of the simple column chart, so many of the same arguments apply. The bar chart has some definite advantages over the column chart in that the alignments make it much more readable. But, again, it is hard to argue that it significantly out-performs the two-slice pie as a device in this situation.

Alternative: Stacked "Percentage" bar graph









Is this better than use of a two-slice pie chart? This is in effect a rectangular pie chart. The two-slice pie has at least one advantage in that it automatically contains a line (the top "cut") which is aligned vertically and against which the position of the split (the second "cut") can be judged visually The bar graph only has a single line dividing the two portions so we would need to add a second line at the 50% mark to enable the same kind of visual comparison.








In my opinion, this last simple example - the stacked percentage bar graph with added line  - more or less matches the two-slice pie chart in overall effectiveness.  The pie chart is arguably a slightly better choice for instant communication, being the more familiar. The bar chart is maybe a better choice in terms of efficient use of space within a report or dashboard.

In conclusion, the two-slice pie definitely holds its own, and is arguably better than any of these a simple alternatives. Not everybody will agree. It may become a matter of stylistic choice, or simple personal preference

It may look, from my comments above,  as if I am taking a general stand in favour of the pie chart against the anti-pie masses. This is not the case. The  "beauty contest" above was looking only at the very simplest of situations: a comparison between two parts of a single whole. As the complexity of the situation increases, I believe that the pie chart quickly begins to struggle to compete.


Side-by-side comparison: two-slice pie charts


For more complex situations, first let's look at how well the two-slice pie chart fares when side by side comparisons are required:





If we want to look at the UK referendum results within the different UK countries, we can compare two side by side as in the above. I think this works as an illustration.

What about comparing three countries?





Personally, I am having to start a bit of visual gymnastics at this point. I can still read the information and make the comparisons, but it is becoming harder with my eyes having to dart backwards and forwards between different parts of the illustration

What about comparing four countries?





Well this arrangement is definitely hard work for me. It would probably be better if the four pies were arranged in a square, some thing more like:




Even adjusting the arrangement, I find this illustration far from ideal.

How does it compare with an alternative such as the stacked percentage bar chart?



In my opinion the stacked bar chart is much more effective than four side by side two-slice pie charts. It is also much quicker to set up in Excel as it is all contained within a single chart - the only complicating aspect being my choice to add a line at the 50% mark. The pie chart approach requires setting up four individual graphs and then aligning them - or their snapshot images. Subtle issues with sizing and positioning would need to be dealt with if I was using this approach "for real".




So, in conclusion, comparing two side-by-side two-slice pie charts works ok; comparing three is less good and may be done better using other approaches; comparing four is definitely less effective than alternatives such as the stacked percentage bar chart.



The three-slice pie chart



The 1801 example from William Playfair is a three-slice pie chart. Here it is again with a quick re-rendering in Excel 2016:




























The graph (which is actually part of a larger more complex illustration) shows the proportional split of the then Turkish Empire into three parts (African, European an Asian). I think that this particular three-slice pie chart works. But many three-slice pie charts will not work so well. It is a combination of features that work in favour of the Playfair chart: The "European" slice is pretty much an exact quarter and the sizes of the three slices are not similar to each other. Playfair was blissfully unaware of the conventions that Excel and generations of statisticians would impose upon his idea: by default these would tend to arrange the slices clockwise in order of decreasing size.








In my opinion the clockwise "default" version is slightly less effective than Playfair's original, due largely to the fact that it loses the near horizontal "3 o'clock" line. The chart still works because the three segments are quite different in size.

When the segments are of similar size, things get harder:
















How easy is it to tell the relative sizes of the three slices in the adjusted graph above? I think it is not easy. We would have to add the actual numbers (121%,120%,119%). How does this compare to alternatives, such as a column chart?

























Well the column chart is not particularly easy either in this situation (the insipid colours do not help), but it is possible to make out the relative sizes of the three constituents. So this type of chart does seem more effective at dealing with the same fine margins (1%).




Unlike the three-slice pie chart, which fails, the two-slice pie chart can also handle differences of around 1%:






















In conclusion, there are some situations in which a three-slice pie will work and some in which it will not work. These depend largely upon the data. This means that if a three-slice pie is used in a dashboard, it becomes unpredictable whether it will be effective or not as the data is refreshed. This would tend to make it an unreliable choice.


The verdict on simple pie charts?

I've tried to look at the merits of the pie chart afresh, ignoring the various views expressed by others. My conclusion is that there are some cases - simple two-slice pie charts and side-by-side comparisons of two two-slice pie charts - where it is an effective option. There are situations - such as three-slice pie charts when it can be effective, but it can also be ineffective depending upon the actual data used. So this makes a three- (or more) slice pie chart a risky choice in a dynamic dashboard. In one-off reports with static data, three- or more slice pie charts may be an effective option but a judgement would need to be made on a case-by-case basis. As a rule, I think I would avoid anything over three slices and I would avoid side-by-side comparisons of more than two pie charts.


Finally, the whistles and bells: pie chart variants

Apart from the "standard" pie chart, Excel offers a numbers of additional types and features. How effective are these and when should they be used (or avoided)?


(a) Pie of pie


One segment is subdivided into a second pie chart. This seems to be a messy compromise acknowledging that a pie chart struggles to be effective when the number of slices increases. But attempting to "solve" this by spilling over into a second pie chart produces a confusing overall pattern

(b) Bar of Pie


One segment is subdivided in a second chart, but as a bar not as a second pie chart. This suffers from many of the same deficiencies as the pie of pie



(c) 3D pie


A “pleasing” 3 dimensional view. These 3D charts can be made aesthetically very pleasing but distort the relative sizes of the slices, making the data even harder to read than from a 2D pie chart. These charts may therefore appeal to those making sales presentations but tend to be abhorred by analysts. In the end, the question is what the purpose of the chart is in its context


(d) Doughnut (or “Donut”)


This is like a pie chart but with the middle removed. They are very popular in commercial dashboard software. They have most of the same strengths and weaknesses as a regular pie chart but with the added difficulty of making the areas even harder to assess. So unless the objective is to save on printer ink, a straight swap of a doughnut chart for a pie chart offers little advantage

The doughnut chart can be set up in two different ways: like a pie chart with a hole in it or like an "onion"


[1] See I.Spence. No Humble Pie: The Origins and Usage of a Statistical Chart. Journal of Educational and Behavioral Statistics. Winter 2005, Vol. 30, No. 4, pp. 353–368


[2] W.C. Brinton in the book Graphic methods for presenting facts. 1914. New York.

[3] For a selection:

https://www.stevefenton.co.uk/2009/04/pie-charts-are-bad/
http://kosara.net/publications/Skau-EuroVis-2016.html
http://www.businessinsider.com/pie-charts-are-the-worst-2013-6?IR=T
http://www.storytellingwithdata.com/blog/2011/07/death-to-pie-charts
https://www.quora.com/How-and-why-are-pie-charts-considered-evil-by-data-visualization-experts
https://www.geckoboard.com/blog/pie-charts/#.WMr7svnyjtQ

https://twitter.com/EdwardTufte

 



Sunday, 12 March 2017

Using a pie chart to show abundance over 12 months

When I'm not sitting at a computer trying to figure out how to get Excel to do things, I like to take wildlife photographs. I've just bought another old bird book from a charity shop: Jim Flegg's Field Guide to the Birds of Britain and Europe (New Holland. 1990)  Amongst other good things, it has an interesting type of chart in it (see the right hand side of the scan sample below):
















A simple three scale system is used to indicate, for each month of the year, whether the bird is 

  • not likely to be seen  
  • fairly likely or 
  • highly likely

Although the chart looks superficially like a pie chart, it is more like a choropleth or heatmap. The slices are equal and never change size. They represent the twelve months of the year, running clockwise. The scale is a single-hue progression.


How easy is it to re-create this chart in Excel?

The challenge is to create a "dashboard" version of this chart such as



















in which a bird can be selected from a drop-down list with the corresponding data automatically populating the chart.


Concentrating first on how to create the chart itself: It will obviously have to be rendered as an Excel pie chart. The particular issues to resolve are:

  • How to display equal sized slices
  • How to get the month identifier letters displayed inside the slices, rather than a number
  • How to apply the differential slice tones 


The first two are relatively straight forward:

Giving each month an equal numerical value will make the slices the same size. For idiosyncratic personal reasons I have chosen to give each slice the value 30 (because 360 degrees cut into 12 slices would each be 30 degrees) but any equal number will work exactly the same in Excel.

To display the month letter (rather than the number 30) inside each slice, adjust the properties as follows

Use Format Data Labels:

  • Set Label Options to 'Category name' (not to Value, the default)
  • Set Label position to 'Inside end'


The 'tone' of the slices is set by the Fill properties. If you search the internet for ways to control this property you will generally end up programmatic solutions based on Visual Basic. This is an option.

But there is a way of achieving this without using Visual Basic. It requires setting up a Pie Chart with 36 slices.






















Each month requires three slices. These are then manually formatted to provide the fills for 'None', 'Medium' and 'High'

The data settings for each month will then determine which slices are displayed:




In the illustration, the values for each month are set in column C

The values graphed are those calculated in column G. These will be 30 if the slice matches the value selected in Column C or otherwise will be zero.

The zero value slices will in effect be omitted. The formula controlling this in Column G is

=IF(H3=VLOOKUP(E3,$B$3:$C$14,2,FALSE),30,0)

The labels are based on column F, which take the first letter of the month name, or if there is a zero value slice are blank. This is calculated through the following formula in column F

=IF(G3>0,LEFT(E3,1),"")

That is all there is to it. To use it in a "real" dashboard, we would put the graph in a separate area (or sheet)  to the calculations and then link the selected values to a data table of all birds