
I'd argue that "user happiness" isn't the goal for Netflix; long-term revenue is. Revenue is relatively easy to measure, and certainly easier than something nebulous like "user happiness." You can even test different recommendation algorithms and see which one maximizes long-term revenue.

Presumably Netflix knows that the recommendation algorithm has a significant impact on their bottom line, which is why they launched the Netflix Prize to outsource new algorithm development.

Now, Netflix can't give revenue data to third parties, and they also don't want to let third-party recommendation algorithms run on their system because an "average" algorithm will hurt their bottom line.

The question then becomes: which well-understood metric correlates best with long-term revenue?

Perhaps the answer is RMSE, which is why Netflix chose it. That doesn't seem totally implausible to me.
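For anyone who hasn't worked with it, RMSE (root mean squared error) is just the square root of the average squared gap between predicted and actual ratings. A minimal sketch, with made-up ratings and a hypothetical `rmse` helper (this is not Netflix's scoring code):

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between two equal-length rating lists."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty lists of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Illustrative data only: predicted vs. actual star ratings for four titles.
predicted = [3.5, 4.0, 2.0, 5.0]
actual = [4, 4, 1, 5]
score = rmse(predicted, actual)
print(round(score, 4))
```

Lower is better, and because the errors are squared, one badly wrong prediction costs more than several slightly wrong ones.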



You'd expect that. In the recommendations world that's called "business rules" and includes things from skewing results based on margins to not showing inappropriate recommendations (say, women's clothing to men).

However, I'm pretty sure that Amazon's recommendations don't do that, or don't do it much, anyway. Their "similar product" recommendations seem to be based on a very simple (and often mediocre-quality) pure counting correlation between purchases of two items. It's much harder to guess which algorithms are at work for personalized recommendations.
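To make "pure counting correlation" concrete, here is a rough sketch of what I mean: count how often two items show up in the same order, then rank by that count. The baskets, item names, and both helper functions are made up for illustration; this is my guess at the general shape, not Amazon's actual method.

```python
from collections import Counter
from itertools import combinations

def co_purchase_counts(baskets):
    """Count how many baskets contain each unordered pair of items."""
    counts = Counter()
    for basket in baskets:
        # sorted(set(...)) dedupes items and gives each pair a canonical order.
        for a, b in combinations(sorted(set(basket)), 2):
            counts[(a, b)] += 1
    return counts

def similar_items(item, counts, top_n=3):
    """Rank other items by how often they were bought together with `item`."""
    scores = Counter()
    for (a, b), n in counts.items():
        if a == item:
            scores[b] += n
        elif b == item:
            scores[a] += n
    return [other for other, _ in scores.most_common(top_n)]

# Hypothetical purchase history, one list per order.
baskets = [
    ["camera", "sd_card", "tripod"],
    ["camera", "sd_card"],
    ["camera", "bag"],
    ["sd_card", "laptop"],
]
counts = co_purchase_counts(baskets)
recs = similar_items("camera", counts)
print(recs)
```

Note that raw counts favor whatever is popular overall, which may be part of why this kind of recommendation often feels mediocre.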

At the end of the day, profit margins aside, there's a lot that goes into optimizing recommendations that can't be easily measured. How do you measure customer loyalty based on good recommendations? There have been a number of market research studies that indicate that recommendations do drive customer loyalty, but it's hard to say where the sweet spot is between skewing things toward higher margins vs. skewing things towards customer utility. About 80% of Amazon's visitors aren't there to buy stuff -- and that's great for them! They've become an information portal / window shopping location that happens to also sell stuff. Which is a great position to be in when somebody does think of buying stuff.

That Netflix uses RMSE for their contest doesn't bother me. What I think Greg is reacting to (and it's certainly my sentiment; again, this is really similar to something I'd been writing) is a blurring of stimulus and response. There's an assumption, if not in this subfield then certainly among those casually tracking recommendations advances, that RMSE is a good way of measuring a recommendations algorithm rather than just "the metric Netflix is using," when in fact it's a much more inexact science.



