Category Archives: Data visualisation

How to estimate uncertain data

Data Estimator is a tool that helps answer questions about uncertain quantities, e.g. “What will our company’s sales be next year?”

It is designed to be used as part of an interview process, where expert judgements are drawn out and quantified.

It’s a reminder that in this world of big data, some things remain hard to measure, especially when it comes to the future.

You will be asked a handful of questions, using “probability wheels” like this to visualise the uncertainties:

probability wheel

When you’re done, you will be able to see and export the resulting probability distribution. For example, the result could be “there is a 60% chance this market will more than double in five years, a 20% chance it will more than treble, but a 10% chance it will shrink.”
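
To make the arithmetic behind a statement like this concrete, here is a minimal sketch that turns “chance of exceeding” judgements into points on a cumulative distribution. The helper name `toCdfPoints` and its input format are hypothetical and are not taken from the Data Estimator source; the numbers are those in the example above, with market size expressed as a multiple of today’s.

```javascript
// Hypothetical helper: convert expert judgements into cumulative probabilities P(X <= x).
// Each judgement is {x: multiple of today's market size, p: probability, kind: 'above' | 'below'}.
function toCdfPoints(judgements) {
    return judgements.map(function (j) {
        // "p chance of exceeding x" means P(X <= x) = 1 - p
        var cumulative = (j.kind === 'above') ? 1 - j.p : j.p;
        return [j.x, cumulative];
    }).sort(function (a, b) { return a[0] - b[0]; });
}

var cdfPoints = toCdfPoints([
    {x: 2, p: 0.6, kind: 'above'},  // 60% chance of more than doubling
    {x: 3, p: 0.2, kind: 'above'},  // 20% chance of more than trebling
    {x: 1, p: 0.1, kind: 'below'}   // 10% chance of shrinking
]);
// cdfPoints now holds [x, P(X <= x)] pairs sorted by x
```

Points like these are the kind of (x, y) data that a distribution curve can then be fitted to.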

probability density

It will also show some alternative ways to place the uncertainty in a decision tree, e.g.:

decision tree node

Try it out here:


Curve fitting with javascript & d3

If javascript is up to amazing animations and visualizations with d3, maybe it’s up to non-linear curve fitting too, right?

Something like this, perhaps:

Here’s how I did it:

  • The hard work is done using Cobyla (“Constrained Optimization BY Linear Approximation”), which Anders Gustafsson ported to Java and Reinhard Oldenburg ported to Javascript, as per this stackoverflow post.
    Cobyla minimises a non-linear objective function subject to constraints.
  • The demo and its components (including cobyla) use the module pattern, which has the advantage of keeping the global namespace uncluttered.
  • To adapt Cobyla to this curve fitting problem, I wrote a short wrapper which is added onto the cobyla module as cobyla.nlFit(data, fitFn, start, min, max, constraints, solverParams). This function minimises the sum of squared differences, (y1-y2)^2, between the data points, (x,y1), and the fitted points, (x,y2).
  • The Weibull cumulative distribution function (CDF), inverse CDF and mean are defined in the “distribution” module. Thus distribution.weibull([2,1,5]).inverseCdf(0.5) gives the median (50th percentile) of a Weibull distribution with shape parameter 2, scale parameter 1 and location parameter 5.
  • The chart is built with d3. I am building an open-source library of a few useful chart types, d3elements, which are added onto the d3 module as d3.elts. This one is called d3.elts.xyChart.
  • So the user interface doesn’t get jammed, I use a javascript web worker to calculate the curve fit. I was surprised how easy it was to set this up.
  • I apologise in advance that this sample code is quite complicated. If you see ways to make it simpler, please let me know.
  • Finally, this may be obvious, but I like the rigour that a tool like jshint imposes. Hence the odd comment at the top of fitdemo.js, /* global fitdemo: true, jQuery, d3, _, distribution, cobyla */
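
As a rough illustration of the two numerical pieces mentioned in the list above, here is a minimal sketch of a three-parameter Weibull distribution and a least-squares objective. The names `weibull` and `sumSquaredError` are illustrative only; this is not the code of the distribution or cobyla modules, and the parameter order [shape, scale, location] simply follows the description above. The sample data points are made up.

```javascript
// Illustrative three-parameter Weibull; params = [shape, scale, location].
function weibull(params) {
    var shape = params[0], scale = params[1], location = params[2];
    return {
        // F(x) = 1 - exp(-((x - location) / scale)^shape) for x > location, else 0
        cdf: function (x) {
            if (x <= location) { return 0; }
            return 1 - Math.exp(-Math.pow((x - location) / scale, shape));
        },
        // Inverse of the CDF, valid for 0 <= p < 1
        inverseCdf: function (p) {
            return location + scale * Math.pow(-Math.log(1 - p), 1 / shape);
        }
    };
}

// Least-squares objective: sum over the data of (y1 - y2)^2, where y2 = fitFn(x, params).
function sumSquaredError(data, fitFn, params) {
    return data.reduce(function (total, point) {
        var residual = point[1] - fitFn(point[0], params);
        return total + residual * residual;
    }, 0);
}

// Example with made-up (x, y1) points: how well does a Weibull CDF with these parameters match?
var observed = [[5.5, 0.2], [6.0, 0.6], [6.5, 0.9]];
var weibullCdf = function (x, params) { return weibull(params).cdf(x); };
var error = sumSquaredError(observed, weibullCdf, [2, 1, 5]);
var median = weibull([2, 1, 5]).inverseCdf(0.5);  // equals 5 + sqrt(ln 2)
```

An optimiser such as Cobyla would then search for the parameter values that make this error as small as possible, subject to any constraints.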

Check out the source code on bitbucket here. You can see it being used for uncertain data estimation here.

Please let me know what you think!


D3 Time Series Chart with Zoom & Notes

This is an example of a reusable chart built using d3. The range (zoom) slider and the notes panel are also built in d3, as separate widgets, so they can be customized further.

Check the sample source code on bitbucket for the full description of how to use it; here is the essence (without notes):

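    var rangeWidget = d3.elts.startEndSlider().minRange(365*24*3600*1000);  // 365 days in milliseconds
    var tsChart = d3.elts.timeSeriesChart().rangeWidget(rangeWidget);
    d3.csv('data.csv', function(data) {
        // Map each CSV row to an [x, y] pair; `d.date` is an assumed column name
        var tsData = data.map(function(d) { return [d.date, d.price]; });
        d3.select("#chart").datum(tsData).call(tsChart);
    });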

To add notes to this, use:

    var rangeWidget = d3.elts.startEndSlider().minRange(365*24*3600*1000);
    var clickPanel = d3.elts.makeClickPanel();
    var tsChart = d3.elts.timeSeriesChart().rangeWidget(rangeWidget);
    tsChart.notesMarkerClick(function(elt, note, closer) {
        clickPanel(elt, note && ("<h3>"+note.title+"</h3><p>"+note.desc+"</p>"), closer);
    });
    d3.csv('data.csv', function(data) {
        d3.csv('wheatNotes.csv', function(notes) {
            // Map each CSV row to an [x, y] pair; `d.date` is an assumed column name
            var tsData = data.map(function(d) { return [d.date, d.price]; });
            // Attach `notes` to the chart and render it here; see the sample source on bitbucket for the exact calls
        });
    });

As an alternative approach, check out this block from Mike Bostock for a way to pan and zoom using d3’s brush control.

Hope you find it useful!


Visualising Flows in a D3 Chord Diagram with Hover

This is an example of a reusable chart built using d3.

The idea is that you have a matrix of the flows between one category (here, optimist/neutral/pessimist) to another (introvert/extrovert). `d3.elts.flowChord()` then converts this matrix into a chord diagram, with the option of hover text.

Check the sample source code on bitbucket for the full description of how to use it; here is the essence:

  var colors = d3.scale.ordinal().range(["#AAA", "steelblue", "green", "orange", "brown"]);
  var hoverHtml = {'Introvert': '<h1>Introverts</h1>Like to be by themselves', 
      'Extrovert': '<h1>Extroverts</h1>Like the company of other people', 
      'Optimist': '<h1>Optimists</h1>Look on the bright side of life',
      'Neutral': '<h1>Neutrals</h1>Life could be good, it could be bad',
      'Pessimist': '<h1>Pessimists</h1>See the glass half empty'}
  var chordDiagram = d3.elts.flowChord().colors(colors).hoverHtml(hoverHtml).rimWidth(30);
  var data = [['Disposition','Optimist','Neutral','Pessimist'],
              ['Introvert', 0.8, 0.4, 0.67], 
              ['Extrovert', 0.2, 0.6, 0.33]];
  d3.select("#flow").datum(data).call(chordDiagram);
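
For readers wondering how a table like this relates to a chord diagram: d3’s chord layout works on a square matrix, so a rectangular flow table has to be embedded in one. The sketch below shows one plausible way to do that. The helper `flowsToSquareMatrix` is hypothetical and is not necessarily what `d3.elts.flowChord()` does internally.

```javascript
// Hypothetical helper: embed a rectangular flow table (with a header row) in a square matrix.
function flowsToSquareMatrix(table) {
    var colNames = table[0].slice(1);                                      // e.g. Optimist, Neutral, Pessimist
    var rowNames = table.slice(1).map(function (row) { return row[0]; });  // e.g. Introvert, Extrovert
    var names = rowNames.concat(colNames);
    var n = names.length;
    // Start from an n x n matrix of zeros
    var matrix = names.map(function () {
        var zeros = [];
        for (var k = 0; k < n; k++) { zeros.push(0); }
        return zeros;
    });
    table.slice(1).forEach(function (row, i) {
        row.slice(1).forEach(function (value, j) {
            matrix[i][rowNames.length + j] = value;  // row group to column group
            matrix[rowNames.length + j][i] = value;  // mirrored, so both ends of a chord get the same width
        });
    });
    return {names: names, matrix: matrix};
}

var flows = flowsToSquareMatrix([
    ['Disposition', 'Optimist', 'Neutral', 'Pessimist'],
    ['Introvert', 0.8, 0.4, 0.67],
    ['Extrovert', 0.2, 0.6, 0.33]
]);
// flows.names lists the five groups; flows.matrix is 5 x 5
```

Flows within a group (for example Introvert to Extrovert) stay at zero, so chords only connect the two kinds of category.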

D3 bar chart with zoom & hover

This is an example of a reusable chart built using d3. The range (zoom) slider is built in d3 too, as a separate widget, so it can be customized.

Check the sample source code on bitbucket for the full description of how to use it; here is the essence:

    var rangeWidget = d3.elts.startEndSlider().minRange(30);
    var myChart = d3.elts.barChart().rangeWidget(rangeWidget);
    myChart.mouseOver(function(el, d) { showHover(el, d) });
    myChart.mouseOut(function(el, d) { hideHover(el, d) });
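    d3.csv('data.csv', function(data) {
        // Map each CSV row to an [x, y] pair; `d.date` is a placeholder for the x column
        data = data.map(function(d) { return [d.date, d.price]; });
        d3.select("#chart").datum(data).call(myChart);
    });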

Make an animated & reusable barchart

Do you need a dynamic bar chart like this in your web page, with positive and negative values, and categories along the x-axis?

Dog breeds are on my mind at the moment as we just bought a new Sheltie puppy – this chart might show a person’s scores for each breed. Click the button above to see the next person’s (random) set of scores.

This is an example of a reusable chart built using d3. Using it is fairly simple, e.g.:

<script src=""></script>
<script src="barChart.js"></script>
    var points = [['Beagle',-10],
                  /* ...remaining [breed, score] pairs... */];
    var myChart = d3.elts.barChart().width(300);
    d3.select("body").datum(points).call(myChart);

Please check out the source code for barChart.js on bitbucket.

You may also find this helpful even if you don’t need a barchart, but want to understand how to build a reusable chart. I was inspired when I read Mike Bostock’s exposition, Towards reusable charts, but I found it took some work to get my head around how to do it for real – so I hope this example may help others too.

The key tricks were:

  • How to adjust Mike Bostock’s sample code to produce bars instead of a line and area
  • Where enter(), update and exit() fit in (answer: they are internal to the reusable function)
  • How to call it, including whether to use data vs datum (answer: see code snippet above)
  • How to refer to the x and y coordinates inside the function (answer: the function maps them to d[0] and d[1] regardless of how they started out)
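
The structure behind these answers can be sketched without d3 at all. The example below is a simplified, hypothetical `simpleChart` (not the code in barChart.js): it only computes bar geometry rather than drawing anything, but it shows the closure with chainable getter/setters and the step that maps the input to d[0] and d[1]. The scores are made up.

```javascript
// Hypothetical, d3-free sketch of the closure-with-getter/setters pattern.
function simpleChart() {
    var width = 200,
        height = 100,
        xValue = function (d) { return d[0]; },  // default accessors
        yValue = function (d) { return d[1]; };

    // In a real d3 chart this would take a selection and do the enter/update/exit joins here.
    function chart(rawData) {
        // Normalise to [x, y] pairs so the rest of the function can rely on d[0] and d[1]
        var data = rawData.map(function (d, i) {
            return [xValue.call(rawData, d, i), yValue.call(rawData, d, i)];
        });
        var maxAbs = Math.max.apply(null, data.map(function (d) { return Math.abs(d[1]); })) || 1;
        var barWidth = width / data.length;
        return data.map(function (d, i) {
            return {
                label: d[0],
                x: i * barWidth,
                width: barWidth,
                // The largest |value| spans half the height, leaving room for negative bars
                length: Math.abs(d[1]) / maxAbs * (height / 2),
                negative: d[1] < 0
            };
        });
    }

    // Each getter/setter returns the chart itself when setting, which allows chaining
    chart.width = function (value) {
        if (!arguments.length) { return width; }
        width = value;
        return chart;
    };
    chart.x = function (fn) {
        if (!arguments.length) { return xValue; }
        xValue = fn;
        return chart;
    };
    chart.y = function (fn) {
        if (!arguments.length) { return yValue; }
        yValue = fn;
        return chart;
    };
    return chart;
}

var scores = [{breed: 'Beagle', score: -10}, {breed: 'Sheltie', score: 20}];
var bars = simpleChart()
    .width(300)
    .x(function (d) { return d.breed; })
    .y(function (d) { return d.score; })(scores);
```

Because the accessors are applied once at the top, the input can be arrays or objects and the drawing code never needs to know which.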

You can find a much fancier version of this chart, with a range slider and hover text, in this post.

Good luck!


How is your tax money being spent?

Want to know how your tax money is being spent? The Australian Budget Explorer is an interactive way to explore spending by portfolio, agency, program or even in more detail; compare 2014 against previous years; and search for the terms that interest you.

Australian Budget Explorer

This was produced in collaboration with BudgetAus. This year for the first time, the team at provided the Budget expenditure data in a single spreadsheet, which Rosie Williams (InfoAus) manipulated to include further data on the Social Services portfolio. The collaboration is producing lots of good visualisations, collected at AusViz.

I won’t editorialise about the Budget here; instead here is my data and extensions wishlist:

Look-through to include State budgets

The biggest line item (component) in the Federal Budget is $54 billion for “Administered expenses: Special appropriation GST Revenue Entitlements – Federal Financial Relations Act 2009”, which I take it is revenue to the States. I would love to be able to “look through” this item into how the States spend it.

The BudgetAus team has provided some promising data leads here.

Unique identifiers to track spending over time

One of the most frequent requests I get is to track changes in spending over time.

Unfortunately this is hard, as there are no unique identifiers for a given portfolio, program, agency or component. That means if the name changes from one year to the next, it is hard to work out which old name corresponds to which new name. For example, in 2014 the Department of Education, Employment & Workplace Relations has been split into the Department of Employment and the Department of Education, while the Environment portfolio used to be “Sustainability, Environment, Water, Population and Communities”.

It would be great to give all spending an identifier, and have a record of how identifiers map from one year to the next.
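
As a sketch of what such a record could look like, here is one possible shape: a lookup from each identifier to its successors in the following year, plus a function that follows the links. Everything here is hypothetical, including the identifier format; none of these IDs exist in the Budget data.

```javascript
// Hypothetical identifiers and successor records, for illustration only.
var successors = {
    '2013:DEPT-A': ['2014:DEPT-B', '2014:DEPT-C'],  // one department split into two
    '2013:PORT-X': ['2014:PORT-Y']                   // a portfolio that was only renamed
};

// Follow successor links from one identifier to all of its descendants in later years.
function traceForward(id, links) {
    var next = links[id];
    if (!next) { return [id]; }  // no recorded change: the identifier carries over as-is
    return next.reduce(function (found, successorId) {
        return found.concat(traceForward(successorId, links));
    }, []);
}

var splitResult = traceForward('2013:DEPT-A', successors);
var renameResult = traceForward('2013:PORT-X', successors);
```

With a record like this, spending under an old name could be compared against the combined spending of whatever replaced it.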

What money is actually spent?

How does the budget relate to what is spent? There is some info here at BudgetAus, but the upshot is “This might be a good task for a future group of volunteers”…


There is revenue data available here – I haven’t looked at it carefully yet, but I hope to include it, if possible.

Cross-country comparison

It would be great to compare the percentages spent in key areas by governments across the world.
Maybe it’s already being done? To do this I’d need some standard hierarchy of categories (health, education, defence, and subdivisions of these, etc), and we’d need every country’s government (and every State government) to tag their spending by those categories. Sounds simple in concept but I bet it would be hard to make it happen.

In the meantime, my plan is to check quandl for data and see how far I can go with what’s there…


Finally, many thanks to the authors for the awesome d3 package!


If you have any comments or know how to solve any of the data issues raised above, please let me know.