Tag Archives: webapp

Curve fitting with javascript & d3

If javascript is up to amazing animations and visualizations with d3, maybe it’s up to non-linear curve fitting too, right?

Something like this, perhaps:



Here’s how I did it:

  • The hard work is done using Cobyla (“Constrained Optimization BY Linear Approximation”), which Anders Gustafsson ported to Java and Reinhard Oldenburg ported to Javascript, as per this stackoverflow post.
    Cobyla minimises a non-linear objective function subject to constraints.
  • The demo and its components (including cobyla) use the module pattern, which has the advantage of keeping the global namespace uncluttered.
  • To adapt Cobyla to this curve fitting problem, I wrote a short wrapper which is added onto the cobyla module as cobyla.nlFit(data, fitFn, start, min, max, constraints, solverParams). This function minimises the sum of squared differences (y1^2-y2^2) between the data points, (x,y1), and the fitted points, (x,y2).
  • The Weibull cumulative distribution function (CDF), inverse CDF and mean are defined in the “distribution” module. Thus distribution.weibull([2,1,5]) .inverseCdf(0.5) gives the median (50th percentile) of a Weibull distribution with shape parameter 2, scale parameter 1 and location parameter 5.
  • The chart is built with d3. I am building an open-source library of a few useful chart types, d3elements, which are added onto the d3 module as d3.elts. This one is called d3.elts.xyChart.
  • So the user interface doesn’t get jammed, I use a javascript web worker to calculate the curve fit. I was surprised how easy it was to set this up.
  • I apologise in advance that this sample code is quite complicated. If you see ways to make it simpler, please let me know.
  • Finally, this may be obvious, but I like the rigour that a tool like jshint imposes. Hence the odd comment at the top of fitdemo.js, /* global fitdemo: true, jQuery, d3, _, distribution, cobyla */

Check out the source code on bitbucket here.

Please let me know what you think!

  

Experience with Meteor

Meteor Day is upon us!

Here are the slides I am going to present on my experience developing with Meteor. In a nutshell -

  • Love the principles – they made it easy to develop a great webapp quickly
  • I’ve done a few cool things, I think! (like encryption, undo, admin panel with datatables, custom art:accounts-ui)
  • But… SEO is killing me (only two keywords showing!?)
  • The Paypal IPN has been a pain – still unresolved
  • Can’t escape the usual browser issues – and Iron Router adds a few for IE9
  • Speed has been an intermittent problem
  • Still feeling my way towards a good programming style
  • Please check out the app!
(Just click on the slide above and then you can advance through the deck using the arrow keys. Alternatively, click here to see them in a new tab.)


You can see the slides from the talk online here.

  

9 Lessons from PyConAU 2013

A summary of what I learned at PyCon AU in Hobart in 2013. (Click here for 2014.)

1. In 2005, Django helped make it possible for a team of ONE to make a commercial web app

Building web apps with Django is not just possible, it’s fun. I hadn’t realised the key role that Django played, along with Ruby on Rails, in making this happen.

2. But in 2013 the goal posts are higher – can it still be done?

Django was revolutionary when it was released, but it doesn’t take care of everything a modern (i.e. 2013) web app needs to be cutting-edge. On the back-end, once you get your head around Django itself, you need to get your head around South (for database migrations), virtualenv (so you don’t go crazy when new versions come out), the Python Image Library and django-filer or easy-thumbnails so you can upload images and files more nicely, Fabric to help you deploy your site, git (to version control your code, if you haven’t used it already), selenium (for functional testing), factory_boy (for any testing), django-reversion (so you can roll back data), staticfiles, a way to actually deploy static files on your system, e.g. a file system backend like Boto, tastypie or django-rest-framework (for an API), and perhaps a CMS like Django-CMS, Mezzanine or FeinCMS (which are the tips of other icebergs). That’s sort of where I’m up to at the moment. And there are lots more I will probably need soon - haystack (for faster searching), celery and a message broker (e.g. for non-web-page related tasks), memcache, maybe non-relational databases like MongoDB.

And that’s just the back-end. On the front-end you probably want to use javascript, ajax, jQuery, and probably another javascript library, e.g. I have been using kineticjs. But during the talks I learned I will need to consider meteor (heaps of cool stuff, but a starting point is that it drops a lot of the distinction between server and client, so that with very little code, a user can update the database and other users’ pages update to view it automatically), backbone.js (“models with key-value binding and custom events, collections with a rich API of enumerable functions,views with declarative event handling, and connects it all to your existing API over a RESTful JSON interface.”), angular.js (“lets you extend HTML vocabulary for your application”), D3.js (“data driven documents”), node.js, compass and SASS (to make css easier), ember.js (“a framework for creating ambitious web applications”), yeoman (“modern workflows for modern webapps” using Ruby and node.js)…

The keynote of DjangoCon AU by Alex Gaynor explained this in a historical context and sowed the idea in my mind that the time is ripe for a new framework (possibly an enhanced Django) that will make all these things easy as well (roughly speaking). Jacob Kaplan-Moss said to check out the Meteor screencast for what is possible.

3. Web security is never far from our thoughts

Jacob gave a great talk on web security.  As I mentioned above, Django takes care of the essential security features – CSRF tokens, SQL injections, password hashing and HTML cross-site scripting. Some immediately useful tips I picked up from Jacob are – always use https everywhere if you have user logins; django-secure makes this easy (“Helping you remember to do the stupid little things to improve your Django site’s security.”); use bcrypt for password hashing; use Django’s forms whenever there is user input, even if it’s not a form; turn off unused protocols (e.g. XML and yaml) in your API; and to emphasise how easy it is for others to intercept your unencrypted data, look up Firesheep.

4. Python packages for maths and science are making “big data” much more accessible to everyone

Lots of talks on this. Check out especially the scikit-learn documentation, which is incredibly thorough. But then there’s Pandas, scipy, and scikit-image, and for networks networkx.

For parallelization, the classic algorithm is mapreduce, and mrjob provides an python interface to this.  The easiest way to get started on parallelization is to use IPython.parallel. For an example, check out how to process a million songs in 20 minutes. For queuing jobs and running them in the background, redis-queue has a low barrier to entry. (One caveat – you may need to manually delete .pid files.)

An interesting quote – “Most of the world’s supercomputers are running Monte Carlo simulations.”

5. There are lots more packages and tools to try out

To improve my style, I want to check out django-model-utils (especially for “PassThroughManager”); and more generally, django-pipeline (for “CSS and JavaScript concatenation and compression, built-in JavaScript template support, and optional data-URI image and font embedding” – in preference to django-compressor), django-allauth (an “integrated set of Django applications addressing authentication, registration, account management as well as 3rd party (social) account authentication.”), django-taggit (to add tags to your project), Raven (the python client for Sentry, “notifies you when your users experience errors”), django-discover-runner (which will be part of Django 1.6 – it allows “you to specify which tests to run and organize your test code outside the reach of the Django test runner”), and django-sitetree (“introducing site tree, menu and breadcrumbs navigation”).

There’s more… Mock for testing (“allows you to replace parts of your system under test with mock objects and make assertions about how they have been used”), separate selenium tests into tests and page controllersGerrit (for online code reviews), Jenkins (“monitors executions of repeated jobs”), django-formrenderingtools (“customize layout of Django forms in templates, not in Python code.”). There’s a way to resize images in html5 before uploading them. And Fanstatic serves js and css files (e.g. specify you need jQuery through a python statement rather than in the template), though I’m not sure why I would need this yet.

If you need to kill off a process that’s taking too long you can use interrupting cow and django-timelimit.

There’s a way to compile clojure to javascript.  Since I don’t know clojure yet, this is a very speculative project for me, but I like the idea of avoiding javascript. :-)

And if you’re writing tests in iOS, there’s a way to run selenium on the iOS simulator using appium.

6. I still have a lot to learn about Python

I won’t embarrass myself by listing all the things I learnt about Python here, though we were encouraged not to be afraid of the CPython source code, and even less so of the PyPy source code (which has the advantage that it is in python!).

I was convinced I should be trying to use Python 3.3 whenever possible, if only to save time later with unicode errors – Python 2.x doesn’t handle these well. Django 1.5 is actually written in Python 3.3, using a package called six to make it work with Python 2.x too.  Incidentally, it also seems the consensus is to use PostgreSQL over MySQL. Though admittedly that doesn’t really fit under this heading.

7. The Python community is friendly, humble and welcoming

Good news! This keeps it fun to program in Python as much as anything.

8. PyCon was a great conference

Of all the scientific and industry conferences I have been to, this one had the best-presented talks I have seen – and not just the scheduled presenters, but also the lightning (5 minute) talks. They were very engaging and intelligible.  Speakers used their slideshows in inventive ways (e.g. using memegenerator, prezi.com and the odd xkcd cartoon).  And the conference itself was well organised by Chris Neugebauer.

9. Next time I’ll stay for the sprints!