|progress on the netflix front
||[Oct. 11th, 2006|08:39 am]
So, although djedi 's parents were here most of the last week, I managed to make some progress on the Netflix Prize. (probably at the cost of being a bit rude - sorry!) Most of the time I spent was building up a test framework so I could easily test rating schemes, so there wasn't a whole lot of note. The one big things was that I restored the data provided to us in a binary format, so it takes up 500 MB on disk instead of 2 GB, which means the loading time is down to 15-20 minutes instead of 40 minutes.
So the way it works is that there is a real set of ratings to be made that counts, and there is a "probe" set of ratings to be made. The "probe" ratings are all already known (they're included in the 500 MB of data), so it makes it easy to exclude that data, run your algorithm, and then see how good your data is. Last night was the first time I got to actually run some algorithms and see how they did on the probe data. The scoring method used is RMSE (root mean squared error), so lower is better. For comparison, Cinematch (Netflix's algorithm) scores 0.9514, if you get below 0.9419 (1% improvement) you can win $50,000, and if you get below 0.8563 (10% improvement) you can win $1,000,000. (here's the current leaderboard - note that someone qualifies to win $50,000 already!)
Just taking the average movie rating for all movies and applying that gives an RMSE of 1.13 or so. Taking the average movie rating for each movie and using that on a per-movie basis gives 1.05. The two other things I tried were first to take the average movie rating and average user rating and average them - this gives a modest improvement to 1.015. I then weighted them by the inverse of their standard deviation (since a higher standard deviation would mean there was more variability in that data and thus it would be less reliable), but that only improved it to 1.013.
So now I'm calculating the correlation between every pair of movies using the dot product. (I ran this overnight and it took around 7 hours, but I need to store the data in a different format so I started that before leaving for work). Once I have that data it should be fairly straightforward to apply that to all the other movies a user has rated and come up with a new rating. I think this might push me below 1.0, which would be really nice...and I might even submit that even though right now it needs to be below 0.9884 to make it on the leaderboard.
I also found a good paper that I'm reading through and I might try next, although it looks computationally really expensive.
Anyway, that's most of what's been on my mind lately. I like getting to play with raw data!