I’ve long been interested in the concept of A/B testing (also called split testing). It’s a simple concept that should sit well with most mathematically-inclined types: You have a baseline interface in which you adjust a single variable, at random, for each user that visits your application. After a given amount of time you should be able to see if certain variables affect how your users behave (either negatively or positively).
A product was recently released called SnapAds which allows its users (advertisers) to permute different variations of an ad and display different versions to users, based upon how well they perform over time.
But that’s not what I was interested in, specifically (even though it is a cool idea). The team that created this also created another product a while back that never saw a full release: Genetify. Genetify provides developers with a JavaScript library for doing any number of A/B tests on a site (tweaking CSS, JavaScript, or HTML elements), all trained over time using a genetic algorithm backend.
This means that no matter how many different A/B tests you have on a page the genetic algorithm will adapt to the input (users visiting the page and hopefully achieving some pre-defined goal) and slowly show a more-optimal page layout to the user.
Genetify provides a demo on their site showing the basics of how it works along with a simple text tutorial.
To get started with Genetify you will need to include the library in the head of your page along with a couple of CSS rules.
<script src="genetify.js"></script>
<style>
  .v { display: none; }
  .genetify_disabled { display: none !important; }
  .genetify_enabled { display: block !important; }
</style>
And then before the closing body tag on your site include the following:
<script>genetify.vary();</script>
Here are some examples of different ways in which a page layout can be changed using Genetify.
HTML Elements / CSS Classes
The easiest technique is one which allows you to simply toggle HTML using some inline CSS classes.
<div class="sentence">One way of saying something</div>
<div class="sentence v anotherway">Another way of saying something</div>
The first class specified ends up becoming the name of one of the “genes” which is used to train the genetic algorithm. Thus if the user completes a specified goal while the “anotherway” element is toggled then the algorithm is trained to recognize that showing the “anotherway” element might be more desirable and will show it more over time.
A goal can be recorded by specifying a goal name and a weight for completing the goal. You’ll need to call a JavaScript method that records the completion of a goal wherever in your code you think the goal was completed (such as the user signing up for something).
genetify.record.goal('signup', 100);
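To make the idea of "training" a little more concrete, here is a minimal sketch of how a recorded goal weight could bias which variant gets shown. This is not Genetify's backend logic (which isn't public at the time of writing); it only illustrates fitness-proportional selection, one common building block of genetic algorithms. The names createGene, recordGoal, and pickVariant are made up for this example, as is the starting score of 1.

```javascript
// Hypothetical sketch: NOT Genetify's actual algorithm.
// A "gene" maps each variant name to the goal weight it has accumulated.
function createGene(variantNames) {
  const scores = {};
  for (const name of variantNames) {
    scores[name] = 1; // assumed starting score so every variant can still be shown
  }
  return scores;
}

// Credit the variant the user was seeing when the goal was completed.
function recordGoal(gene, shownVariant, weight) {
  gene[shownVariant] += weight;
}

// Pick a variant with probability proportional to its score.
// `random` is injectable so the choice can be made deterministic in tests.
function pickVariant(gene, random = Math.random) {
  const names = Object.keys(gene);
  const total = names.reduce((sum, name) => sum + gene[name], 0);
  let threshold = random() * total;
  for (const name of names) {
    threshold -= gene[name];
    if (threshold < 0) return name;
  }
  return names[names.length - 1]; // guard against floating-point drift
}

// One signup (weight 100) while "anotherway" was visible.
const sentence = createGene(['original', 'anotherway']);
recordGoal(sentence, 'anotherway', 100);
```

After that single signup, pickVariant(sentence) would return 'anotherway' far more often than 'original', while still occasionally showing the original so it has a chance to prove itself.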
The next-most-common technique will likely be that of toggling CSS rules. Multiple rules are defined using a similar name but with the addition of a simple alphabetical name on the end which is used for categorization.
#navbar { color: red; }
#navbar_vA { color: green; }
#navbar_vB { color: blue; }
Note that you can have any number of rules – you aren’t limited to the traditional “A/B” style of testing where there are only two options – specifying any number of rules will continue to yield results.
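The naming convention is easy to reason about in code. The sketch below groups selector names by their base name using the _v suffix seen in the rules above. How Genetify itself reads the stylesheet isn't shown here, so treat groupVariants and the 'original' label as assumptions made purely for illustration.

```javascript
// Hypothetical helper: groups selectors that follow the "<base>_v<Name>" convention.
function groupVariants(selectors) {
  const groups = {};
  for (const selector of selectors) {
    const match = /^(.+)_v([A-Za-z0-9]+)$/.exec(selector);
    const base = match ? match[1] : selector;
    // The un-suffixed rule is labeled 'original' (a label chosen for this sketch).
    const variant = match ? match[2] : 'original';
    if (!groups[base]) groups[base] = [];
    groups[base].push(variant);
  }
  return groups;
}

// Three variants plus the baseline: more than a plain A/B split.
const groups = groupVariants(['#navbar', '#navbar_vA', '#navbar_vB', '#navbar_vC']);
```

Here groups['#navbar'] ends up holding four entries (the baseline plus A, B, and C), which is the "any number of rules" case described above.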
Finally Genetify provides the developer with the ability to toggle JavaScript variables.
highlight = function(elem){
  elem.style.borderColor = 'green';
};
highlight_vRed = function(elem){
  elem.style.borderColor = 'red';
};
I’m less excited about this particular technique – it’s kind of clumsy to clutter the global namespace with variables and to expect a changed output. I think a better technique would be to toggle a property value within the genetify object and check how that’s changed, instead.
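A rough sketch of what I have in mind is below. Nothing here is part of Genetify: createVariants and variants.get are hypothetical names, and the variant is chosen uniformly at random rather than being weighted by past results the way a trained backend would do it.

```javascript
// Hypothetical API: not part of Genetify.
// Each gene maps to a list of possible values; one is chosen per page load.
function createVariants(definitions, random = Math.random) {
  const chosen = {};
  for (const [gene, options] of Object.entries(definitions)) {
    // Uniform choice for illustration; a real backend would weight this.
    const index = Math.floor(random() * options.length);
    chosen[gene] = options[index];
  }
  return {
    get(gene) {
      return chosen[gene];
    },
  };
}

const variants = createVariants({ highlightColor: ['green', 'red'] });

// A single function reads the chosen value instead of being swapped out globally.
function highlight(elem) {
  elem.style.borderColor = variants.get('highlightColor');
}
```

The global namespace only gains one object, and the calling code makes it explicit which values are under test.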
Perhaps the biggest question that surrounds Genetify, right now, is over its longevity and openness. It seems like the project has been put on the back burner in favor of the team’s other project, SnapAds (and understandably so – since that has an obvious revenue stream). However, a recent comment by its creator over at Hacker News has fueled speculation: he’s looking for interest in the possibility of open-sourcing the codebase for anyone to use.
Right now Genetify is two components:
- The JavaScript frontend that does the A/B rotation of site components. Currently this file is only available on the Genetify demo site as a doubly-packed file (using Dean Edwards’ Packer). It wasn’t too hard to un-pack it and run it through a JavaScript beautifier in order to get some sane output. Of course that doesn’t make it any more “open source” – just developer-readable. You can now tweak the code to communicate with the server of your choosing, instead.
- The state of the Genetify backend is unclear at this point. It appears to only exist on the Genetify servers and it’s not clear if the backend will accept input from non-Genetify domains. If the intent of the team holds true then we should probably see this code become available soon.
I’m already quite excited about this utility. I think it shows a lot of promise for developers who want to roll their own A/B testing solutions. I hope the team comes through and releases an Open Source solution that developers can really start to hack on.
Greg Dingle (November 26, 2008 at 12:11 am)
Hi, I’m the author of Genetify. Thanks for the great write-up. You really dissected it well with the little information available.
Genetify was a solo project of mine before I joined the SnapAds team. I had wanted to make it into a product but the advertising market presented a much more lucrative opportunity. Now that I’ve received some interest in Genetify I’m very interested in reviving it and letting others help develop it. It is still much more elegant than Google’s Website Optimizer.
I’d appreciate any feedback on how to set up Genetify as an open-source project. Some big questions are: Where to host it? Google Code, GitHub, other? How to modularize the backend? Provide a single public service, the current minimalist PHP code, a Django app or what?
And by the way, I’m not happy with the JS integration either. I did it mostly so that Genetify would operate on all client-side code in the same way.
Greg
James Byers (November 26, 2008 at 1:10 am)
I vote for Google Code (nice and simple) and minimal PHP backend (likely easy to understand for most developers).
pd (November 26, 2008 at 3:45 am)
John I really appreciate your increased rate of posts lately, they are all very enlightening.
However I thought yourself, Mozilla, Jason Orendorff etc were going to be putting a lot more effort into Firebug? There doesn’t seem to have been any increase in the pace of Firebug releases.
Currently there are substantial bugs in Firebug such as when a User Agent style is ‘disabled’ it is actually deleted from the current browser session!
Regardless of how you spend your time, thanks for the great work you do in general.
pd
John Resig (November 26, 2008 at 8:38 am)
@Greg: Great to hear that. I think I’m going to side with James on this and vote for something simple like Google Code and PHP. Definitely drop me a line if you have any questions.
@pd: You should follow the weekly updates of the Firebug team – it’ll help to keep you more informed. The latest update shows our current status. And to clarify Jason isn’t working on Firebug, last I checked he was helping with JavaScript-engine related efforts.
Rob Howell (November 26, 2008 at 8:42 am)
Hi John – very interesting article. I really like the idea of setting up a page with a number of variations, then letting evolution automagically take care of which should eventually be used.
I’d love to try out this kind of code on a Jaxer setup – could then keep it written in JavaScript end-to-end. Also, would be very cool to see it integrated server-side into a decent CMS. Now we just need someone to write a decent CMS for Jaxer…
Gregory (November 26, 2008 at 1:45 pm)
Agreed with the Google Code suggestion.
Basically just put the code up – a community will adapt it from that point on. I might write a python backend for example, someone else might port it to .Net
OSS is like that :) The important thing is to get the code out there.
Rob Hudson (November 26, 2008 at 1:45 pm)
@Greg: I’d definitely be willing to help implement a backend in Django that talks to the Javascript API. Keep me posted where this goes. Thanks!
Skorgu (November 26, 2008 at 2:39 pm)
I’d actually vote for github myself, git makes parallel and offline development a lot easier. The backend can be whatever, as long as there’s a reference one there will be a Python, Ruby and probably Erlang and Bash clone inside a month.
pd (November 26, 2008 at 5:58 pm)
Thanks John, I appreciate your diplomatic response to my possibly prickly inquiry :)
Michael Terry (November 26, 2008 at 10:19 pm)
GitHub is a light years better alternative.
Paul Hanlon (November 27, 2008 at 10:01 am)
I’ll add too that it’s great to see more posts from John. How do you find such interesting stuff? That JS beautifier is excellent!
The genetify script is a great concept and looking at the code, it could probably be halved by JQuerifying it.
It looks like the really interesting work is in the PHP backend, and it’s great to see Greg wanting to share it.
Just for haw haws, any point in hosting it on Appjet. I seem to recall a very favourable posting from John about that not so long ago.
John Snyders (November 27, 2008 at 11:05 am)
Cool idea but I wish the genetify.txt gave more details on the GA on the back end. Other than genes it doesn’t use the usual GA terminology. What is the relationship between the goal and the fitness function? What genetic operators are used? When are new generations created?
Richard Fink (November 27, 2008 at 6:24 pm)
“Split tests” have been a part of effective advertising for, at least, 70 years or so.
Web technology makes it so much easier to do, but yet I haven’t read anything about it until now. Odd.
Thanks for mentioning this. Bookmarked!
Jesse Farmer (December 2, 2008 at 1:26 am)
Genetic algorithms and A/B testing aren’t the same thing, but this is still clever. SnapAds is the right product for this technology, though, rather than generic website optimization.
Why?
Genetic algorithms produce black box solutions. For ads all you care about is the bottom line, but answering the question “why?” is very difficult, if not impossible, with genetic algorithms.
And customer insight is arguably more valuable than a mere optimized webpage.
Jesse Farmer (December 2, 2008 at 1:31 am)
@Richard: If you Google “A/B testing” you’ll see lots of people think about this. I’ve even done a multi-part series on the topic: http://20bits.com/tag/ab-testing/
Tiago Serafim (December 2, 2008 at 6:04 pm)
@Jesse Farmer: really nice tutorials. Thanks for sharing.
Luke Stevens (December 2, 2008 at 9:19 pm)
This is definitely the future for web design (in the broadest sense). I wrote an article about it ~18 months ago: http://design2-0.com/articles/in-the-future-web-sites-will-design-themselves/ . A/B & multivariate testing certainly isn’t new, but it’s going to become very popular in the next couple of years I think.
I launched a personal project recently to demonstrate the concept in a very simple way by varying link styles with CSS, and measuring what gets the most clicks (you can see it in action and read about it here: http://newsified.com/about/ ). I’m currently using Google Website Optimizer to serve up CSS variations, collecting data with Google Analytics, and then manually crunching the numbers, which is not the most elegant solution, so I’d love to see what happens with Genetify.
The great thing with my set up though is I can slice & dice all my data within Google Analytics for things like visit time, bounce rate, geographic data, browser/OS combo’s etc etc. When Google finally release their announced Analytics API (the just out-of-beta event tracking looks promising too), I’d love to see products like Genetify hook into that.
Google has been doing a really good job evangelising A/B and multivariate testing for the sake of GWO, but GWO as a product is kind of stuck in landing-page land – you can bend it for other things (which is what I’m doing), but it’s kind of a kludge.
The biggest drawback in this kind of thing is the sheer volume of visitors and actions you need to get statistically significant results (which are the only kind of results :). For every additional variation you multiply your total combinations, and therefore the traffic needed/experiment length, which all becomes quite complex. I can see why SnapAds would be much more successful where you have more exposure to fewer variations in a tightly controlled setting, so I hope they do well with that.
It really is the future of design though. Why should I have layout (a) and not (b)? Why should I have ten headlines and not five, or twenty? All unknowable questions… until you test :) It is all about the data!
Greg Dingle (December 6, 2008 at 4:35 pm)
Ok, folks, I put some time into cleaning up genetify and writing docs. Anybody can install it now and try it out for themselves. Hopefully some developers out there will pick up on my initial efforts and genetify will grow to become a thriving open-source project.
http://github.com/gregdingle/genetify/wikis/home
Any feedback is appreciated.
Tiago Serafim (December 7, 2008 at 10:06 am)
@Greg Dingle: thanks for sharing. Your project motivated me to study GA. I´m starting with “A field guide to Genetic Programming”. Do you have any other recommendation?
Thanks again!
aimxhaisse (December 8, 2008 at 6:11 pm)
Hi,
A/B testing seems to be very promising ! I’ve recently seen on hacker news an example of implementation of genetic algorithm in flash : http://bit.ly/fLgb . This is very amazing and I have no doubt we’ll see more and more apps like this in the future.
Erik Vold (December 9, 2008 at 10:18 pm)
@Luke Stevens: I agree with you when you say “GWO as a product is kind of stuck in landing-page land – you can bend it for other things (which is what I’m doing), but it’s kind of a kludge.” it needs a loooooot of work.
@Greg & John: genetify seems to ignore SEO concerns; furthermore, it could be mistaken for cloaking (very bad). What have you got to say to that?
In a GWO A/B test, there are really two pages, so forget this example.
A GWO MVT (which you can use to do A/B testing too) is SEO friendly, because only the original content is visible to a robot.
Greg Dingle (December 12, 2008 at 3:59 pm)
@Erik: I don’t think SEO should be a concern because Genetify doesn’t do anything differently than a designer would do to randomly show one of a set of things with javascript–a legitimate feature. It may even be safer than varying things server-side because the document is always the same. There are a number of articles supporting the use of display:none in light of SEO. Here’s one: http://searchengineland.com/is-hiding-content-with-display-none-legitimate-seo-13643.php
Lars Trieloff (January 13, 2009 at 4:29 am)
Michael Marth has ported Genetify to use Apache Sling (or Day CQ5) as a backend which allows much closer integration into content centric applications or web content management systems.
There is an in-detail blogpost available here: http://dev.day.com/microsling/content/blogs/main/genetify.html and a git branch here: http://github.com/michaelmarth/genetify/tree/master
Grey (January 31, 2009 at 5:47 pm)
Now, wouldn’t the natural extension of this involve multivariate testing? E.g. not having only XOR relationships between choices, but also OR ones (i.e. both things at the same time).
That could occasionally lead to some strange things without using rules like “ok variate this text color for background colors A-M, but not N-Z”, but it sure would be fun to see the appearance of a website vary like that.
Of course, there will always be people who don’t like surprises and have a hardcoded set of options saved in e.g. a cookie or a database. These are less useful to the process, though. Given that genetic algorithms get better the more often you run them, there could be some kind of problem where a site cannot generate enough traffic due to constantly changing site design, though only with a lot of traffic will the design stabilize eventually.
Just some food for thought, or maybe just a lone voice in the void.
Matt Gershoff (February 4, 2009 at 5:41 pm)
Gregg,
Just looked over the Genetify website. Very cool. I think the way you are looking at the problem is in the right direction. For some reason the market seems to be in lock down with hypothesis testing. If all you care about is the optimal policy then it certainly is not clear that A/B or MVT is optimal and can be a huge waste of resources from a min regret perspective.
Sure, if you want to generalize about higher level structures and patterns then by all means test but otherwise you may spend too much time searching in low value areas of the policy space.
Have you compared your approach with just treating optimization as a simple n-armed bandit problem? My guess is that is what TouchClarity was doing and hence Omniture’s TestTarget.