D3 Time Series Chart with Zoom & Notes

This is an example of a reusable chart built using d3. The range (zoom) slider and the notes panel are also built in d3, as separate widgets, so they can be customized further.

Check the sample source code on bitbucket for the full description of how to use it; here is the essence (without notes):

    var rangeWidget = d3.elts.startEndSlider().minRange(365*24*3600*1000);
    var tsChart = d3.elts.timeSeriesChart().rangeWidget(rangeWidget);
    d3.csv('data.csv', function(data) {
        tsData = _.map(data, function(d) {return [d.date, d.price]});
        d3.select("#chart").datum(tsData).call(tsChart);
    }

To add notes to this, use:

    var rangeWidget = d3.elts.startEndSlider().minRange(365*24*3600*1000);
    var clickPanel = d3.elts.makeClickPanel();
    var tsChart = d3.elts.timeSeriesChart().rangeWidget(rangeWidget);
    tsChart.notesMarkerClick(function(elt, note, closer) {
        clickPanel(elt, note && ("<h3>"+note.title+"</h3><p>"+note.desc+"</p>")), closer);
    });
    d3.csv('data.csv', function(data) {
        d3.csv('wheatNotes.csv', function(notes) {
            tsChart.notes(notes);
            tsData = _.map(data, function(d) {return [d.date, d.price]});
            d3.select("#chart").datum(tsData).call(tsChart);
        }
    }

Hope you find it useful!

  

Visualising Flows in a D3 Chord Diagram with Hover

This is an example of a reusable chart built using d3.

The idea is that you have a matrix of the flows between one category (here, optimist/neutral/pessimist) to another (introvert/extrovert). `d3.elts.flowChord()` then converts this matrix into a chord diagram, with the option of hover text.

Check the sample source code on bitbucket for the full description of how to use it; here is the essence:

  var colors = d3.scale.ordinal().range(["#AAA", "steelblue", "green", "orange", "brown"]);
  var hoverHtml = {'Introvert': '<h1>Introverts</h1>Like to be by themselves', 
      'Extrovert': '<h1>Extroverts</h1>Like the company of other people', 
      'Optimist': '<h1>Optimists</h1>Look on the bright side of life',
      'Neutral': '<h1>Neutrals</h1>Life could be good, it could be bad',
      'Pessimist': '<h1>Pessimists</h1>See the glass half empty'}
  var chordDiagram = d3.elts.flowChord().colors(colors).hoverHtml(hoverHtml).rimWidth(30);
  var data = [['Disposition','Optimist','Neutral','Pessimist'],
              ['Introvert', 0.8, 0.4, 0.67], 
              ['Extrovert', 0.2, 0.6, 0.33]]
  d3.select("#flow").datum(data).call(chordDiagram);
  

D3 bar chart with zoom & hover

This is an example of a reusable chart built using d3. The range (zoom) slider is built in d3 too, as a separate widget, so it can be customized.

Check the sample source code on bitbucket for the full description of how to use it; here is the essence:

    var rangeWidget = d3.elts.startEndSlider().minRange(30);
    var myChart = d3.elts.barChart().rangeWidget(rangeWidget);
    myChart.mouseOver(function(el, d) { showHover(el, d) });
    myChart.mouseOut(function(el, d) { hideHover(el, d) });
    d3.csv('data.csv', function(data) {
        data = _.map(data, function(d) {return [d.id,d.price]});
        d3.select("#chart").datum(data).call(myChart);
    }
  

Experience with Meteor

Meteor Day is upon us!

Here are the slides I am going to present on my experience developing with Meteor. In a nutshell -

  • Love the principles – they made it easy to develop a great webapp quickly
  • I’ve done a few cool things, I think! (like encryption, undo, admin panel with datatables, custom art:accounts-ui)
  • But… SEO is killing me (only two keywords showing!?)
  • The Paypal IPN has been a pain – still unresolved
  • Can’t escape the usual browser issues – and Iron Router adds a few for IE9
  • Speed has been an intermittent problem
  • Still feeling my way towards a good programming style
  • Please check out the app!

(Just click on the slide above and then you can advance through the deck using the arrow keys. Alternatively, click here to see them in a new tab.)


You can see the slides from the talk online here.

  

Harness your Python style to write good Javascript

Python encourages you to write good code. Javascript does not. How can you harness your Python style to write good Javascript?

Classes

Let’s try to write this Python code in Javascript.

class Vehicle(object):
    def __init__(self, size):
        self.size = size
    def __str__(self):
        return "Vehicle of size {0}".format(self.size)

class Hovercraft(Vehicle):
    def __init__(self, *args, **kwargs):
        # initialize just like a vehicle
        super(Hovercraft, self).__init__(*args, **kwargs)
        # add extra custom init if needed
    def hover(self, height):
        print("Hovering at height {0}".format(height))

# eg. make a new size 8 hovercraft
h = Hovercraft(8)
# eg. hover at height 12
h.hover(12)
# eg. change size and print
h.size = 9
print(h)

Not obvious? The problem is that Javascript doesn’t have an out-of-the-box analogue to classes. Somehow we need to adapt objects and functions to the task.

Use prototypes

Your first thought might be to define new object types by defining object constructors, which are called with the new keyword, and adding methods to the object’s prototype chain. These objects look a lot like classes, don’t they? So you might write something like this:

// Not the best approach - see below
function Vehicle(size) {
    this.size = size;
    // or use this and arguments pseudo-arguments
}
Vehicle.prototype.toString = function () {
    return "Vehicle of size " + this.size;
};
function Hovercraft(size) {
    this.size = size;
}
Hovercraft.prototype = new Vehicle();
Hovercraft.prototype.hover = function (height) {
    alert("hovering at height "+height);
}

// eg. make a new size 8 hovercraft
h = new Hovercraft(8);
// eg. hover at height 12
h.hover(12);
// eg. change size and print
h.size = 9
h.toString()

This has a number of problems:

  • Vehicle and Hovercraft have the same constructor, but you can’t reuse it.
  • It feels strange to define the class’s methods outside the constructor, with the prototype lines.
  • There’s no room for private variables or methods.

Use closures

For a better solution, we need to think laterally – and use Javascript’s closures. I was hesitant to make use of these at first, thinking it would only lead to perverse and impenetrable code. But in fact they are a force for good, as we’ll see.

Put simply, a closure is a function which has variables bound to it. You write the closure as a function inside another function. The trick is that the inner function can refer to any of the outer function’s local variables. (Thanks for this succinct summary StackExchange!)

Here’s a nice example of a closure from a talk on functional programming given by Douglas Crockford, JavaScript architect at PayPal and formerly Yahoo (at 57mins). (In fact the previous example is from this talk too.) In this example, a closure is used to produce an object called singleton, which has a private variable, a private function, and two methods. You call the two methods as singleton.firstMethod(a,b) and singleton.secondMethod(c):

var singleton = (function () {
    var privateVariable;
    function privateFunction(x) {
        ...privateVariable...
    }

    return {
        firstMethod: function (a, b) {
            ...privateVariable...
        },
        secondMethod: function (c) {
            ...privateFunction()...
        }
    };
}() );
// note the function is called immediately,
// so the var singleton is its returned value
// the surrounding brackets are just to help the reader

So let’s adapt that to our problem:

function vehicle(size) {
    var that = {
        size: size,
        toString: function () { 
            return "Vehicle of size " + this.size; 
        },
    };
    return that;
}
function hovercraft(size) {
    var that = vehicle(size);  // inherit from vehicle
    that.hover = function(height) { 
        alert("hovering at height "+height);
        return that; // optional - allows chaining
    };
    return that;
}

// eg. make a new size 8 hovercraft
h = hovercraft(8);
// eg. hover at height 12
h.hover(12);
// eg. both at once using chaining
g = hovercraft(8).hover(12);
// eg. change size and print
h.size = 9
h.toString()

In the above, size is accessible to the world, just as it is in Python. But Javascript also lets us make it private:

function vehicle(size) {
    // can define private variables here
    // instead, here we use the fact that parameters are private
    return {
        toString: function () { 
            return "Vehicle of size " + size; 
        },
    };
}

In general, to make your own constructor function (eg. vehicle, hovercraft), you follow this recipe from the talk:

  1. Make an object
  2. Define (private) variables and functions
  3. Augment the object with methods (which have access to the privates above)
  4. Return the object

This pattern has a name: the module pattern. You’ll find a great writeup of it, and some ways to use it across multiple js files, in this post by Ben Cherry. He also points out the best way to handle dependencies, and to update existing variables.

Use chaining

In the example above, I have added the extra feature of “chaining”. In Python you have to put each effect on a separate line, eg.:

h.size = 9
h.hover(18)
print(h)

But in Javascript you can potentially chain it all together into one, like so:

h.size(9).hover(18).toString()

I first discovered the joy of chaining while using Mike Bostock’s super-powerful D3 library. He describes how to do it here – in fact, his description of how to write a reusable chart arrives at the same closure-based solution as we have, with the addition of getters and setters as below.

We made hover chain, but size do it doesn’t yet, as it’s a variable, not a function. Mike solves this by adding a getter/setter function for each public variable, which we could add like this:

function vehicle(startSize) {
    var size = startSize;
    var that = {
        toString: function () { 
            return "Vehicle of size " + size;
        },
    };
    // getter/setter functions
    that.size = function(_) {
        if (!arguments.length) return size;
        size = _;
        return that; // Q: would 'return this' work?
    };
    return that;
}
// eg. this works now
h.size(9).hover(18).toString()

Then h.size() returns (gets) the hovercraft h‘s size, and h.size(9) sets the size to 9.

Another benefit: the code never refers to this. That’s handy, because I find whenever I refactor code into smaller functions I get tripped up by the meaning of this changing.

You may also recognize such getters and setters from jQuery, where eg. $("body").text() returns the page body’s text, and $("body").text("eels") sets the body’s text to “eels”.

Still, as nice as chaining is, needing to add 5 lines of boilerplate code for every variable, with 3 references to the variable name that must be changed each time, is the sort of thing we became programmers to avoid.

To solve this, I am starting to put all the variables with getters and setters into a single object, eg. xt:

function vehicle(startSize) {
    var xt = {size: startSize};
    if (Object.keys) {
        var xtKeys = Object.keys(xt);
    }
    function toString() {
        return "Vehicle of size " + xt.size;
    }
    return {
        toString: toString,
        get: function(name) {
            if (!arguments.length) return xt;
            return xt[name];
        },
        set: function(name, val) {
            if (typeof xtKeys!=="undefined" && xtKeys.indexOf) {
                if (xtKeys.indexOf(name)>=0) {
                    xt[name] = val;
                } else {
                    throw Error("Variable "+name+" not found");
                }
            } else {
                // on browsers without Object.keys or indexOf,
                // don't check the name is valid
                xt[name] = val;
            }
            return this;
        }   
    }
}
// eg. these work
h.get("size");
h.set("size",9).hover(18).toString();

Don’t make it a global

Finally, you probably don’t want to have a new global called vehicle. It’s better to add it to another module, eg. RT (which may or may not already exist), as RT.vehicle. It might also depend on other modules, eg. the underscore (_) library. To do this, wrap the whole function in another closure!

if (typeof _ === 'undefined') { 
    throw new Error('Vehicle requires underscore') 
}
(function(RT, _) {
    function vehicle(startSize) { 
        ... // copy from above
    }

    // attach vehicle to RT
    if (typeof RT==="undefined") {
        RT = {};
    }
    RT.vehicle = vehicle;
    return RT;
}(RT, _));
// eg. make a new size 9 vehicle
v = RT.vehicle(9);

Too crazy?

In conclusion

Javascript’s closures give you access to some interesting programming patterns. Foremost among them, it lets you implement Python-style classes, with the added bonus of private variables and functions. And this is not just an academic gimmick that risks complicating your code in the real world: it is championed by the people who develop javascript, and it is used by jQuery and D3 among others. It helps you to write good, reusable code.

So – please let me know if you’ve used this pattern before, and whether my comparison to Python’s classes stacks up.

A final thought – perhaps it is wrong to compare Javascript to Python after all. Perhaps it is better compared to LISP!

  

Nicely format tables without using table-layout:fixed

I have recently helped to develop Signup Zone, a very handy app where people can easily create their own signup sheets and rosters. People have started using it for interesting purposes, including registering to help injured wildlife! And this has led to people entering in unexpected data, such as very long email addresses; and creating sheets with a lot of columns.

At its heart, the signup sheet is simply an HTML table such as this:

Date Name Contact email address
10/11/2014 Racing Tadpole racingtadpole@example.com

This table has the code:

<table>
    <thead>
        <tr>
            <th scope="col">Date</th>
            <th scope="col">Name</th>
            <th scope="col">Contact email address</th>
        </tr>
    <tbody>
        <tr>
            <td>10/11/2014</th>
            <td>Racing Tadpole</th>
            <td>racingtadpole@example.com</th>
        </tr>
    </tbody>
</table>

Since I don’t know in advance how to size the different columns, I actually really like the browser’s built-in ability to choose its own column widths.

However, the browser will not let any content spill outside a cell, and it only breaks words at white space. So if one user enters a long email address (or URL) into a cell, that column becomes very wide and the table either spills outside the page margins or looks unsightly – or both, like this:

Date Name Email
10/11/2014 Racing Tadpole racingtadpole@example.com
11/12/2014 Another Tadpole anotherracingtadpole@quitelong.verylong.superlong.megalong.example.com

Here are two solutions:

  • Use the new css command hyphens: auto. It won’t work in older browsers (eg. IE 9), and it can hyphenate too many words, but it looks ok, eg:
    Date Name Email
    10/11/2014 Racing Tadpole racingtadpole@example.com
    11/12/2014 Another Tadpole anotherracingtadpole@quitelong.verylong.superlong.megalong.example.com
  • Run some javascript to insert a “zero-width space” or an invisible soft hyphen in places where the offending text can break, eg. at the . or @ symbols. This gives you control over where the words break, eg:
    Date Name Email
    10/11/2014 Racing Tadpole racingtadpole@​example​.com
    11/12/2014 Another Tadpole anotherracingtadpole@​quitelong​.verylong​.superlong​.megalong​.example​.com

    However, if someone copies the email address it will also copy the invisible characters, which could cause trouble.

Here’s some javascript code to implement the second solution in Meteor:

<template name="example">
    ...
    <td>{{{email}}}</td>
    ...
</template>
function htmlEscape(str) {
    return String(str).replace(/&/g, '&amp;')
            .replace(/</g, '&lt;')
            .replace(/>/g, '&gt;')
            .replace(/"/g, '&quot;')
            .replace(/'/g, '&#39;');
}

function myMarkup(str) {
    return String(str).replace(/@/g, '&#8203;@')
            .replace(/\./g, '&#8203;.')
            .replace(/\//g, '&#8203;/');
}

Template.example.email = function(){
    // suppose s is the string we want to show
    return myMarkup(htmlEscape(s));
}

The htmlEscape function is inspired by this Stack Overflow post. An alternative is to use Spacebars.SafeString.

Hope that’s helpful!

  

9 Lessons from PyConAU 2014

A summary of what I learned at PyCon AU in Brisbane this year. (Videos of the talks are here.)

1. PyCon’s code of conduct

Basically, “Be nice to people. Please.”

I once had a boss who told me he saw his role as maintaining the culture of the group.  At first I thought that seemed a strange goal for someone so senior in the company, but I eventually decided it was enlightened: a place’s culture is key to making it desirable, and making the work sustainable. So I like that PyCon takes the trouble to try to set the tone like this, when it would be so easy for a bunch of programmers to stay focused on the technical.

2. Django was made open-source to give back to the community

Ever wondered why a company like Lawrence Journal-World would want to give away its valuable IP as open source? In a “fireside chat” between Simon Willison (Django co-creator) and Andrew Godwin (South author), it was revealed that the owners knew that much of their CMS framework had been built on open source software, and they wanted to give back to the community. It just goes to show, no matter how conservative the organisation you work for, if you believe some of your work should be made open source, make the case for it.

3. There are still lots more packages and tools to try out

That lesson’s copied from my post last year on PyCon AU. Strangely this list doesn’t seem to be any shorter than last year – but it is at least a different list.

Things to add to your web stack -

  • Varnish – “if your server’s not fast enough, just add another”.  Apparently a scary scripting language is involved, but it can take your server from handling 50 users to 50,000. Fastly is a commercial service that can set this up for you.
  • Solr and elasticsearch are ways to make searches faster; use them with django-haystack.
  • Statsd & graphite for performance monitoring.
  • Docker.io

Some other stuff -

  • mpld3 – convert matplotlib to d3. Wow! I even saw this in action in an ipython notebook.
  • you can use a directed graph (eg using networkx) to determine the order of processes in your code

Here are some wider tools for bioinformaticians (if that’s a word), largely from Clare Sloggett’s talk -

  • rosalind.info – an educational tool for teaching bioinformatics algorithms in python.
  • nectar research cloud – a national cloud for Australian researchers
  • biodalliance – a fast, interactive, genome visualization tool that’s easy to embed in web pages and applications (and ipython notebooks!)
  • ensembl API – an API for genomics – cool!

And some other sciency packages -

  • Natural Language Toolkit NLTK
  • Scikit Learn can count words in docs, and separate data into training and testing sets
  • febrl – to connect user records together when their data may be incorrectly entered

One standout talk for me was Ryan Kelly’s pypy.js, implementing a compliant and fast python in the browser entirely in javascript. The only downside is it’s 15 Mb to download, but he’s working on it!

And finally, check out this alternative to python: Julia, “a high-level, high-performance dynamic programming language for technical computing”, and Scirra’s Construct 2, a game-making program for kids (Windows only).

4. Everyone loves IPython Notebook

I hadn’t thought to embed javascript in notebooks, but you can. You can even use them collaboratively through Google docs using Jupyter‘s colaboratory. You can get a table-of-contents extension too.

5. Browser caching doesn’t have to be hard

Remember, your server is not just generating html – it is generating an http response, and that includes some headers like “last modified”, “etag”, and “cache control”. Use them. Django has decorators to make it easy. See Mark Nottingham’s tutorial. (This from a talk by Tom Eastman.)

6. Making your own packages is a bit hard

I had not heard of wheels before, but they replace eggs as a “distributable unit of python code” – really just a zip file with some meta-data, possibly including operating-system-dependent binaries. Tools that you’ll want to use include tox (to run tests in lots of different environments); sphinx (to auto-generate your documentation) and then ReadTheDocs to host your docs; check-manifest to make sure your manifest.in file has everything it needs; and bumpversion so you don’t have to change your version number in five different places every time you update the code.

If you want users to install your package with “pip install python-fire“, and then import it in Python with “import fire“, then you should name your enclosing folder python_fire, and inside that you should have another folder named fire. Also, you can install this package while you are testing it by cding to the python-fire directory and typing pip install -e . (note the final full-stop; the -e flag makes it editable).

Once you have added a LICENSE, README, docs, tests, MANIFEST.insetup.py and optionally a setup.cfg (to the python-fire directory in the above example) and you have pip installed setuptoolswheel and twine, you run both

python setup.py bdist_wheel [--universal]
python setup.py sdist

The bdist version produces a binary distribution that is operating-system-specific, if required the universal flag says it will run on all operating systems in both Python 2 and Python 3). The sdist version is a source distribution.

To upload the result to pypi, run

twine upload dist/*

(This from a talk by Russell Keith-Magee.)  Incidentally, piprot is a handy tool to check how out-of-date your packages are. Also see the Hitchhiker’s Guide to Packaging.

7. Security is never far from our thoughts

This lesson is also copied from last year’s post. If you offer a free service (like Heroku), some people will try to abuse it. Heroku has ways of detecting potentially fraudulent users very quickly, and hopes to open source them soon. And be careful of your APIs which accept data – XML and YAML in particular have scary features which can let people run bad things on your server.

8. Database considerations

Some tidbits from Andrew Godwin’s talk (of South fame)…

  • Virtual machines are slow at I/O, so don’t put your database on one – put your databases on SSDs. And try not to run other things next to the database.
  • Setting default values on a new column takes a long time on a big database. (Postgres can add a NULL field for free, but not MySQL.)
  • Schema-less (aka NoSQL) databases make a lot of sense for CMSes.
  • If only one field in a table is frequently updated, separate it out into its own table.
  • Try to separate read-heavy tables (and databases) from write-heavy ones.
  • The more separate you can keep your tables from the start, the easier it will be to refactor (eg. shard) later to improve your database speed.

9. Go to the lightning talks

I am constantly amazed at the quality of the 5-minute (strictly enforced) lightning talks. Russell Keith-Magee’s toga provides a way to program native iOS, Mac OS, Windows and linux apps in python (with Android coming). (Keith has also implemented the constraint-based layout engine Cassowary in python, with tests, along the way.) Produce displays of lightning on your screen using the von mises distribution and amazingly quick typing. Run python2 inside python3 with sux (a play on six).  And much much more…

Finally, the two keynotes were very interesting too. One was by Katie Cunningham on making your websites accessible to all, including people with sight or hearing problems, or dyslexia, or colour-blindness, or who have trouble with using the keyboard or the mouse, or may just need more time to make sense of your site. Oddly enough, doing so tends to improve your site for everyone anyway (as Katie said, has anyone ever asked for more flashing effects on the margins of your page?). Examples include captioning videos, being careful with red and green (use vischeck), using aria, reading the standards, and, ideally, having a text-based description of any graphs on the site, like you might describe to a friend over the phone. Thinking of an automated way to do that last one sounds like an interesting challenge…

The other keynote was by James Curran from the University of Sydney on the way in which programming – or better, “computational thinking” – will be taught in schools. Perhaps massaging our egos at a programming conference, he claimed that computational thinking is “the most challenging thing that people do”, as it requires managing a high level of complexity and abstraction. Nonetheless, requiring kindergarteners to learn programming seemed a bit extreme to me – until he explained at that age kids would not be in front of a computer, but rather learning “to be exact”. For example, describing how to make a slice of buttered bread is essentially an algorithm, and it’s easy to miss all the steps required (like opening the cupboard door to get the bread). If you’re interested, some learning resources include MIT’s scratch, alice (using 3D animations), grok learning and the National Computer Science School (NCSS).

All in all, another excellent conference – congratulations to the organisers, and I look forward to next year in Brisbane again.

  

Make an animated & reusable barchart

Do you need a dynamic bar chart like this in your web page, with positive and negative values, and categories along the x-axis?

Dog breeds are on my mind at the moment as we just bought a new Sheltie puppy – this chart might show a person’s scores for each breed. Click the button above to see the next person’s (random) set of scores.

This is an example of a reusable chart built using d3. Using it is fairly simple, eg.:

<script src="http://d3js.org/d3.v3.min.js"></script>
<script src="barChart.js"></script>
<script>
    var points = [['Beagle',-10],
                  ['Terrier',30],
                  ['Sheltie',55],
                  ['Greyhound',-24]];
    var myChart = d3.elts.barChart().width(300);
    d3.select("body").datum(points).call(myChart);
</script>

Please check out the source code for barChart.js on bitbucket.

You may also find this helpful even if you don’t need a barchart, but want to understand how to build a reusable chart. I was inspired when I read Mike Bostock’s exposition, Towards reusable charts, but I found it took some work to get my head around how to do it for real – so I hope this example may help others too.

The key tricks were:

  • How to adjust Mike Bostock’s sample code to produce bars instead of a line and area
  • Where the enter(), update and exit() fit in (answer: they are internal to the reusable function)
  • How to call it, including whether to use data vs datum (answer: see code snippet above)
  • How to refer to the x and y coordinates inside the function (answer: the function maps them to d[0] and d[1] regardless of how they started out)

You can find a much fancier version of this chart, with a range slider and hover text, in this post.

Good luck!

  

Monte Carlo Business Case Analysis in Python with pandas

I recently gave a talk at the Australian Python Convention arguing that for big decisions, it can be risky to rely on business case analysis prepared on spreadsheets, and that one alternative is to use Python with pandas.

The video is available here.

The slides for the talk are shown below – just click on the picture and then you can advance through the slides using the arrow keys. Alternatively, click here to see them in a new tab.


You can see the slides from the talk online here.

I also showed an ipython notebook, which you can find as a pdf here. The slides, the ipython notebook, and the beginnings of a library to make the analysis easier, are all at bitbucket.org/artstr/montepylib.

  

How is your tax money being spent?

Want to know how your tax money is being spent? The Australian Budget Explorer is an interactive way to explore spending by portfolio, agency, program or even in more detail; compare 2014 against previous years; and search for the terms that interest you.

Australian Budget Explorer

This was produced in collaboration with BudgetAus. This year for the first time, the team at data.gov.au provided the Budget expenditure data in a single spreadsheet, which Rosie Williams (InfoAus) manipulated to include further data on the Social Services portfolio. The collaboration is producing lots of good visualisations, collected at AusViz.

I won’t editorialise about the Budget here; instead here is my data and extensions wishlist:

Look-through to include State budgets

The biggest line item (component) in the Federal Budget is $54 billion for “Administered expenses: Special appropriation GST Revenue Entitlements – Federal Financial Relations Act 2009″, which I take it is revenue to the States. I would love to be able to “look through” this item into how the States spend it.

The BudgetAus team has provided some promising data leads here.

Unique identifiers to track spending over time

One of the most frequent requests I get is to track changes in spending over time.

Unfortunately this is hard, as there are no unique identifiers for a given portfolio, program, agency or component. That means if the name changes from one year to the next, it is hard to work out which old name corresponds to which new name. E.g. In 2014, the Department of Employment & Workplace Relations has been split into the Department of Employment and the Department of Education, while the Environment portfolio used to be “Sustainability, Environment, Water, Population and Communities”.

It would be great to give all spending an identifier, and have a record of how identifiers map from one year to the next.

What money is actually spent?

How does the budget relate to what is spent? There is some info here at BudgetAus, but the upshot is “This might be a good task for a future group of volunteers”…

Revenue

There is revenue data available here – I haven’t looked at it carefully yet, but I hope to include it, if possible.

Cross-country comparison

It would be great to compare the percentages spent in key areas by governments across the world.
Maybe it’s already being done? To do this I’d need some standard hierarchy of categories (health, education, defence, and subdivisions of these, etc), and we’d need every country’s government (and every State government) to tag their spending by those categories. Sounds simple in concept but I bet it would be hard to make it happen.

In the meantime, my plan is to check quandl for data and see how far I can go with what’s there…

D3

Finally, many thanks to the authors for the awesome d3 package!

Conclusion

If you have any comments or know how to solve any of the data issues raised above, please let me know.

  

Mathematical web apps & visualisation