The Netflix Prize

I don’t think I could possibly be any more giddy about something than I am about The Netflix Prize.

In short: Netflix’s vote prediction algorithm is typically off by about 0.95 stars when predicting your vote for a movie. If you can do 10% better, they’ll give you $1 million.
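To make that target concrete, here is a minimal sketch of the arithmetic. It assumes the "deviation" is a root mean squared error (RMSE) over predicted versus actual star ratings. The function name, the constants, and the sample ratings are all made up for illustration and are not part of the contest data.

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between two equal-length lists of ratings."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    squared_error = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared_error / len(actual))

# Approximate baseline quoted in the post (assumed here to be an RMSE).
BASELINE_DEVIATION = 0.95
# "10% better" means a deviation 10% lower than the baseline.
TARGET_DEVIATION = BASELINE_DEVIATION * (1 - 0.10)

# Made-up ratings, purely for illustration.
actual = [4, 3, 5, 1]
predicted = [3, 3, 4, 2]
score = rmse(predicted, actual)

print(f"score={score:.3f} target={TARGET_DEVIATION:.3f}")
print("beats target" if score <= TARGET_DEVIATION else "not there yet")
```

Under that assumption, the number to beat works out to roughly 0.855 stars.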

That’s awesome and all, but what’s really awesome is their amazing training dataset. This is every data miner’s wet dream: 100,000,000 votes, 17,000 movies, 500,000 users.

They have two tests that you can run: one against your known data, and one that you’ll submit to Netflix. As far as I can tell, your standing (that is, your current deviation) is made public. The lower your number, the higher your rank. Every year that the algo isn’t improved by 10%, $50,000 is paid out to the current leader.
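The first of those two tests, scoring against data you already know, could look something like the sketch below. Everything in it is an assumption for illustration: the `(user, movie, rating)` tuple layout, the helper names, the toy ratings, and the global-mean predictor, which is only a placeholder and not Netflix’s method or any technique described here.

```python
import math
import random

def split_ratings(ratings, holdout_fraction=0.1, seed=42):
    """Shuffle the known ratings and set a fraction aside for local scoring."""
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    shuffled = list(ratings)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * holdout_fraction)
    return shuffled[cut:], shuffled[:cut]  # (training, held_out)

def global_mean_predictor(training):
    """Placeholder model: always predict the average training rating."""
    mean = sum(rating for _, _, rating in training) / len(training)
    return lambda user, movie: mean

def local_score(predict, held_out):
    """Root mean squared error of the predictor on the held-out ratings."""
    squared = sum((predict(u, m) - r) ** 2 for u, m, r in held_out)
    return math.sqrt(squared / len(held_out))

# Toy (user, movie, rating) tuples with ratings from 1 to 5; not real data.
ratings = [(u, m, 1 + (u + m) % 5) for u in range(20) for m in range(10)]

training, held_out = split_ratings(ratings)
predict = global_mean_predictor(training)
score = local_score(predict, held_out)
print(f"local deviation on {len(held_out)} held-out votes: {score:.3f}")
```

Holding the scored votes out of training matters: a model scored on votes it has already seen will look better than it really is.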

Another thing that I find to be interesting: Netflix gets the score that they do without assuming anything about the movie titles, genre, actors, etc. They just do straight number crunching. I’m impressed.

I’ve already got some techniques that I wanna try. I’ve got a feeling that I’m overly optimistic at this point, and that I’m going to be highly disappointed when I see my first score. But first, I have to generate my test bed and get to work. This is so cool.

I don’t know what it is with me and large, nicely formatted datasets, but I don’t think there’s anything that can get me more excited.

Posted: October 4th, 2006

