JavaScript Testing Does Not Scale

(This is a follow-up on my portion of the More Secrets of JavaScript Libraries panel at SXSW.)

It’s become increasingly obvious to me that cross-browser JavaScript development and testing, as we know it, does not scale.

jQuery’s Test Suites

Take the case of the jQuery core testing environment. Our default test suite is an XHTML page (served with the HTML mimetype) with the correct doctype. It includes a number of tests that cover all aspects of the library (core functionality, DOM traversal, selectors, Ajax, etc.). We have a separate suite that tests offset positioning (integrating this into the main suite would be difficult, at best, since positioning is highly dependent upon the surrounding content). This means that we have, at minimum, two test suites straight out of the gate.

Next, we have a test suite that serves the regular XHTML test suite with the correct mimetype (application/xhtml+xml). We aren’t 100% passing this one yet, but we’d like to be sometime before jQuery 1.4 is ready. Additionally, we have another version that we’re working on that serves the regular test suite but with its doctype stripped (throwing it into quirks mode). This is another one that we would like to make sure we’re passing completely in time for 1.4.

Both of those tweaks (one with correct mimetype and one with no doctype) would also need to be done for the offset test suite. We’re now up to 6 test suites.

We have another version of the default jQuery test suite that runs with a copy of Prototype and Scriptaculous injected (to make sure that the external library doesn’t affect internal jQuery code). And another that does the same with Mootools. And another that does the same for old versions of jQuery. That’s three more test suites (up to 9).

Finally, we’re working on another version of the suite that manipulates the Object.prototype before running the test suite. This will help us to, eventually, be able to work in that hostile environment. This is another one that we’d like to have done in time for jQuery 1.4 – and brings our test suite total up to 10.
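To illustrate why a modified Object.prototype counts as a hostile environment, here is a minimal sketch. The property name `hostileMethod` and the `options` object are made up for illustration, and this is not code from the jQuery test suite: it only shows how an addition to Object.prototype leaks into every `for...in` loop unless the loop filters with `hasOwnProperty`.

```javascript
// Simulate a page that extends Object.prototype.
// `hostileMethod` is a hypothetical name used only for this example.
Object.prototype.hostileMethod = function () {};

const options = { url: "/test", async: true };

// A naive for...in loop also visits the inherited, enumerable property.
const naiveKeys = [];
for (const key in options) {
  naiveKeys.push(key);
}

// Filtering with hasOwnProperty keeps only the object's own keys.
const safeKeys = [];
for (const key in options) {
  if (Object.prototype.hasOwnProperty.call(options, key)) {
    safeKeys.push(key);
  }
}

console.log(naiveKeys); // [ 'url', 'async', 'hostileMethod' ]
console.log(safeKeys);  // [ 'url', 'async' ]

// Clean up so the rest of the page is not affected.
delete Object.prototype.hostileMethod;
```

A test suite that installs such a property before the regular tests run would flag any library code that iterates over objects without this kind of guard.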

We’re in the initial planning stages of developing a pure-XUL test environment (to make sure jQuery works well in Firefox extensions). Eventually we’d like to look at other environments as well (such as in Rhino + Env.js, Rhino + HTMLUnit, and Adobe AIR). I won’t count these non-browser/HTML environments, for now.

At minimum that’s 10 separate test suites that we need to run for jQuery. Ideally, we should be running every one of them just prior to committing a change, just after committing a change, for every patch that’s waiting to be committed, and before a release goes out…

in every browser that we support.

The Browser Problem

And this is where cross-browser JavaScript unit testing goes to crazy town. In the jQuery project we try to support the current version of all major browsers, the last released version, and the upcoming nightlies/betas (we balance this a little bit with how rapidly users upgrade browsers – Safari and Opera users upgrade very quickly).

At the time of this post that includes 12 browsers.

  • Internet Explorer 6, 7, 8. (Not including 8 in 7 mode.)
  • Firefox 2, 3, Nightly.
  • Safari 3.2, 4.
  • Opera 9.6, 10.
  • Chrome 1, 2.

Of course, that’s just on Windows and doesn’t include OS X or Linux. For the sake of sanity in the jQuery project we generally only test on one platform – but ideally we should be testing Firefox, Safari, and Opera (the only multi-platform browsers) on all platforms.

The end result is that we need to run 10 separate test suites in 12 separate browsers before and after every single commit to jQuery core. Cross-Browser JavaScript testing does not scale.
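To make the arithmetic concrete, here is a rough sketch that enumerates the matrix described above. The suite labels are shorthand invented for this example; only the counts come from the text.

```javascript
// Two base suites, each served three ways (HTML mimetype, XHTML mimetype, no doctype).
const baseSuites = ["core", "offset"];
const servingModes = ["text/html", "application/xhtml+xml", "quirks (no doctype)"];

// Four further variants of the core suite.
const extraSuites = [
  "core + Prototype/Scriptaculous",
  "core + Mootools",
  "core + old jQuery",
  "core + modified Object.prototype",
];

const suites = [];
for (const base of baseSuites) {
  for (const mode of servingModes) {
    suites.push(`${base} [${mode}]`);
  }
}
suites.push(...extraSuites);

// Browser versions as listed in the post.
const browsers = {
  "Internet Explorer": ["6", "7", "8"],
  Firefox: ["2", "3", "Nightly"],
  Safari: ["3.2", "4"],
  Opera: ["9.6", "10"],
  Chrome: ["1", "2"],
};
const browserList = Object.entries(browsers).flatMap(([name, versions]) =>
  versions.map((version) => `${name} ${version}`)
);

// Every suite has to run in every browser.
const runs = suites.flatMap((suite) =>
  browserList.map((browser) => ({ suite, browser }))
);

console.log(suites.length, browserList.length, runs.length); // 10 12 120
```

That is 120 suite runs for a single pass, and the pass is repeated before and after each commit.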

Of course, this is just desktop cross-browser JavaScript testing – we should be testing on some of the popular mobile devices, as well. (MobileSafari, Opera Mobile, and possibly NetFront and Blackberry.)

Manual Testing

All of the above test suites are purely automated. You open them up in a browser, wait for them to finish, and look at the results – they require no human intervention whatsoever (save for the initial loading of the URL). This works for a lot of JavaScript tests (and for all the tests in jQuery core) but it’s unable to cover interactive testing.

Some test suites (such as Yahoo UI, jQuery UI, and Selenium) have ways of automating pieces of user interaction (you can write tests like ‘Click this button then click this other thing’). For most cases this works pretty well. However, all of this is just an approximation of the actual interaction that a user may employ. Nothing beats having real people manually run through some easily-reproducible (and verifiable) tests by hand.

This is the biggest scaling problem of all. Take the previous problem of scaling automated test suites and multiply it by the number of tests that you want to run. 100 tests in 12 browsers run on every commit by a human is just insane. There has to be a better way since it’s obvious that Cross-Browser JavaScript testing does not scale.

What currently exists?

The only way to tackle the above problem of scale is to have a massive number of machines dedicated to testing and to somehow automate the process of sending those machines test suites and retrieving their results.

There currently exists an Open Source tool related to this problem space: Selenium Grid. It’s able to send out tests to a number of machines and automatically retrieve the results – but there are a few problems:

  • As far as I can tell, Selenium Grid requires that you use Selenium to run your tests. Currently no major JavaScript library uses Selenium (and it would be a major shift in order to do so).
  • It isn’t able to test against non-desktop machines. Each server must be running a daemon to handle the batches of jobs – this leaves mobile devices out of the picture.
  • It can’t test against unknown browsers. Each browser needs special code to hook in to triggering the loading of the browser by Selenium, thus an unknown browser (such as IE 8, Opera 10, Firefox Nightly, or Chrome) may not be able to run.
  • And most importantly: Selenium Grid requires that you actually own and control a number of machines on which you can run your tests. It’s not always feasible, especially in the world of distributed Open Source JavaScript development, to have the finances to have dedicated machines running non-stop. A more cost effective solution is required.

Naturally, this solution doesn’t tackle the problem of manual testing, either.

A solution: TestSwarm

All of this leads up to a new project that I’m working on: TestSwarm. It’s still a work in progress but I hope to open up an alpha test by the end of this month – feel free to sign up on the site if you’re interested in participating.

Its construction is very simple. It’s a dumb JavaScript client that continually pings a central server looking for more tests to run. The server collects test suites and sends them out to the respective clients.
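As a rough illustration of the "dumb client" idea, here is a minimal sketch of a single polling step. Everything in it is an assumption made for the example: `createClient`, `fetchJob`, `runSuite`, `reportResult`, and the suite URLs are hypothetical names, not TestSwarm’s actual API, and the server is replaced by an in-memory array so the sketch can run synchronously. A real client would make these calls asynchronously over HTTP.

```javascript
// One polling step of a hypothetical client. The three callbacks are injected
// stand-ins for network and test-runner code (assumptions, not a real API).
function createClient({ browser, fetchJob, runSuite, reportResult }) {
  return {
    // Ask the server for work once; returns true if a suite was run.
    poll() {
      const job = fetchJob(browser);
      if (!job) {
        return false; // nothing queued for this browser right now
      }
      const result = runSuite(job.suiteUrl);
      reportResult({ jobId: job.id, browser, ...result });
      return true;
    },
  };
}

// In-memory stand-in for the central server's list of waiting suites.
const pending = [
  { id: 1, suiteUrl: "/suites/core.html" },
  { id: 2, suiteUrl: "/suites/offset.html" },
];
const reports = [];

const client = createClient({
  browser: "Firefox 3",
  fetchJob: () => pending.shift() || null,
  // Pretend every suite passes; a real client would load the page and read its results.
  runSuite: (suiteUrl) => ({ suiteUrl, passed: 10, failed: 0 }),
  reportResult: (report) => reports.push(report),
});

// In a browser this would be driven by a timer; here we poll until the server is idle.
while (client.poll()) {}

console.log(reports.length); // 2
```

The client knows nothing about what a suite contains. It only fetches, runs, and reports, which is what would let it stay independent of any particular test framework.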

All the test suites are collected. For example, 1 “commit” can have 10 test suites associated with it (and be distributed to a selection of browsers).

The nice thing about this construction is that it’s able to work in a fault-tolerant manner. Clients can come and go. At any given time there might be no Firefox 2s connected, at another time there could be thirty. The jobs are queued and divvied out as the load requires it. Additionally, the client is simple enough to be able to run on mobile devices (while being completely test framework agnostic).
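The queueing behaviour described above could be sketched roughly as follows. This is an in-memory illustration under stated assumptions, not TestSwarm’s implementation: the `JobQueue` class, its method names, and the `commit-1` label are all hypothetical.

```javascript
// Hypothetical server-side queue: one list of waiting runs per browser type.
class JobQueue {
  constructor() {
    this.pending = new Map(); // browser name -> array of queued runs
    this.results = [];
  }

  // Queue every suite of a commit for every target browser.
  addCommit(commit, suites, browsers) {
    for (const browser of browsers) {
      if (!this.pending.has(browser)) {
        this.pending.set(browser, []);
      }
      for (const suite of suites) {
        this.pending.get(browser).push({ commit, suite, browser });
      }
    }
  }

  // Called whenever a client of this browser type pings; null means no work.
  next(browser) {
    const waiting = this.pending.get(browser);
    return waiting && waiting.length ? waiting.shift() : null;
  }

  report(run, passed) {
    this.results.push({ ...run, passed });
  }

  remaining(browser) {
    return (this.pending.get(browser) || []).length;
  }
}

const queue = new JobQueue();
queue.addCommit("commit-1", ["core", "offset"], ["Firefox 2", "Opera 10"]);

// Only an Opera 10 client is connected at first; the Firefox 2 runs simply wait.
let run;
while ((run = queue.next("Opera 10"))) {
  queue.report(run, true);
}
console.log(queue.remaining("Firefox 2")); // 2

// Later, two Firefox 2 clients connect and each takes one of the waiting runs.
const runForClientA = queue.next("Firefox 2");
const runForClientB = queue.next("Firefox 2");
queue.report(runForClientA, true);
queue.report(runForClientB, true);
console.log(queue.remaining("Firefox 2"), queue.results.length); // 0 4
```

Because work is only handed out when a client asks for it, the server never needs to know how many clients exist. A client that vanishes mid-run would need a timeout to put its run back on the queue, which this sketch omits.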

Here’s how I envision TestSwarm working out: Open Source JavaScript libraries submit their test suite batches to the central server and users join up to help out. Library users can feel like they’re participating and helping the project (which they are!) simply by keeping a couple extra browser windows open during their normal day-to-day activity.

The libraries can also push manual tests out to the users. A user will be notified when new manual tests arrive (maybe via an audible cue?) which they can then quickly run through.

All of this help from the users wouldn’t be for nothing, though: There’d be high score boards keeping track of the users who participate the most and libraries could award the top participants with prizes (t-shirts, mugs, books, etc.).

The framework developers get the benefit of near-instantaneous test feedback from a seemingly-unlimited number of machines and the users get prizes, recognition, and a sense of accomplishment.

If this interests you then please sign up for the alpha.

There’s already been a lot of interest in a “corporate” version of TestSwarm. While I’m not planning on an immediate solution (other than releasing the software completely Open Source) I would like to have some room in place for future expansion (perhaps users could get paid to run through manual tests – sort of a Mechanical Turk for JavaScript testing – I dunno, but there’s a lot of fodder here for growth).

I’m really excited – I think we’re finally getting close to a solution for JavaScript testing’s scalability problem.

Posted: March 20th, 2009
