Why Ken Jennings’s ‘Jeopardy’ Streak Is Nearly Impossible to Break (fivethirtyeight.com)
100 points by mblevin on May 6, 2015 | 57 comments


> So we (or, more precisely, FiveThirtyEight’s lead lifestyle writer Walt Hickey) ran a simulation that flipped a weighted coin with a 97.9 percent chance of landing on a Jennings win. Every time it did, we “flipped the coin” again. We did 1 million of these simulations.

This is just a binomial distribution. Did they seriously estimate the answer instead of solving it analytically?


Given that we stop after the first failure, technically this should be a negative binomial distribution (http://en.wikipedia.org/wiki/Negative_binomial_distribution) with p = 0.979 and r = 1. The mean number of wins is p / (1-p) = 46.62. Also, there is a p ^ 75 = 0.204 chance that someone of the same skill will beat the 74-win streak.


The waiting time distribution is actually the geometric distribution (with expectation 1/p = 1/0.021 = 47.6 games, where p here is the per-game chance of losing, so about what they got).
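
For anyone who wants to check the numbers in this subthread, here is a minimal sketch comparing the closed-form answers with the coin-flip simulation. It assumes every game is an independent win with probability 0.979 (the article's figure). The function names, the seed, and the trial count are my own illustrative choices, not anything taken from the article.

```python
import random

P_WIN = 0.979  # per-game win probability quoted from the article


def expected_wins(p_win: float) -> float:
    """Mean number of wins before the first loss (geometric, counting wins)."""
    return p_win / (1.0 - p_win)


def expected_games(p_win: float) -> float:
    """Mean number of games played, including the final loss."""
    return 1.0 / (1.0 - p_win)


def prob_streak_at_least(p_win: float, n: int) -> float:
    """Probability of winning the first n games in a row."""
    return p_win ** n


def simulate_streak(p_win: float, rng: random.Random) -> int:
    """Flip the weighted coin until the first loss; return the number of wins."""
    wins = 0
    while rng.random() < p_win:
        wins += 1
    return wins


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    trials = 100_000  # assumption: smaller than the article's 1 million
    mean_sim = sum(simulate_streak(P_WIN, rng) for _ in range(trials)) / trials
    print(f"analytic mean wins:  {expected_wins(P_WIN):.2f}")
    print(f"analytic mean games: {expected_games(P_WIN):.2f}")
    print(f"simulated mean wins: {mean_sim:.2f}")
    print(f"P(75+ wins):         {prob_streak_at_least(P_WIN, 75):.3f}")
```

The two "means" quoted above differ by exactly one because one counts wins only and the other counts games including the loss.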


There's the didactic argument where they did it analytically but explained the intuition of the analysis mechanically. There's also the oversimplification argument where they actually did some kind of much larger sampling from a trained decision tree but then boiled the mechanism down to something easily understood by casual readers more interested in "Jeopardy!" than statistics.


I'm not certain about 538's target demographic these days.

I'm gonna guess that they did solve it analytically, but for the sake of exposition and readers that might not know what a binomial distribution is, they bring up the coin-flip as an illustrative example.


It never occurred to me that optimized buzzer timing was such a big part of Watson's dominance. Kinda takes some of the romance out of it.


Something I never liked about Jeopardy is that you can't buzz in at any time.

I played on a trivia team at the high school level, and most questions were structured like:

"This American architect, born in 1867, is known for works such as the Robie House, the Guggenheim Museum, and Fallingwater."

So they typically feed you information from least to most well known, and you can buzz in at any time. This rewards players who know the most trivia. Everyone who played knows Frank Lloyd Wright did Fallingwater, but a very good player would know his approximate date of birth and would buzz in immediately on hearing 1867.

I understand that this might make Jeopardy less fun for the audience, but I feel it would be a better test of trivia instead of buzzer timing.


They made it that way on purpose, so it's more entertaining for the audience. If you could buzz in early, the players would almost always buzz in before Alex Trebek finished reading the question, which makes it less interesting for the audience.

With the wait-to-buzz-in rule, the audience gets a chance to understand and maybe think of the answer before the contestants speak.


You could allow contestants to buzz in as soon as possible, but have the system wait until Alex has finished reading the answer (remember, it's Jeopardy! so the contestant gives the question ;-P) before it indicates which player buzzed in first. Yes, there might be an audible clicking sound in the background while he's talking, but it would make the game more about speed of recall rather than hand-eye coordination.


I think that would just encourage contestants to buzz in immediately, since I imagine that the risk of penalty of not knowing a question pales in comparison to the reward of getting to answer for a strong contestant. You effectively eliminate the only downside of buzzing early since contestants would get to hear the entire question regardless.


The traditional College Bowl <TM> / Quizbowl <not-TM> technique is (a) to assign a 50% penalty to the early buzz with an incorrect answer, and (b) since it's a game with two teams, to disqualify the rest of your team from answering the question, leaving the opposing team with lots of time to think about it (and subsequently secure access to the 3x bonus question following every toss-up question).

Which is lots of fun and very competitive, but a little bit less tele-savvy.


Ack, we ran up against a "celebrity" reader in the final rounds of a Quizbowl tournament who didn't stop reading reliably when the other team buzzed in. I don't know how, but the other team was really good at taking advantage of this, buzzing in before the reader had given enough information to answer the question, and then answering after he'd finished the sentence. Our team got steamrolled.


In high school Quizbowl, I'd take advantage of the fact that most untrained readers have a hard time stopping mid-syllable, or even mid-word, so any time there was a question that hinged on a single word, such as a world capital question, I'd buzz in on the first letter of the word, and they'd usually get out 2 syllables before stopping, providing enough information.

Q: What is the capital of L-(BUZZ)-ithu-- [stops reading].

A: Vilnius.

But if they had had really good reaction time and stopped on L--, I'd have been hosed. Latvia, Lithuania, Liechtenstein, Libya, Liberia, etc...


The team that beat us was doing something more like:

Q: What is the capital of (BUZZ) Lithuania?

A: Vilnius.


Agreed that strong contestants would buzz in immediately.

One solution would be to display the text of the question slightly beforehand to the audience, but not the contestants.


> You could allow contestants to buzz in as soon as possible, but have the system wait until Alex has finished reading the answer

I think one of the reasons Jeopardy! is so popular is that it makes the audience feel smart. I'm not amazing at trivia, but I know most of the Jeopardy! answers and having the contestants wait to buzz in creates the illusion that I'm doing as well / better than they are. Even though I'm intellectually aware of the importance of buzzer timing, it still feels good to watch and give the answers before the contestants have a chance to demonstrate they know. Compare that to a trivia show like University Challenge in the UK, where not only are the questions remarkably hard, the contestants buzz in before I've had a chance to parse, let alone answer, the questions.

I think allowing contestants to buzz in early, even if Alex finished reading the question before they could answer, would make Jeopardy! a better trivia game but a worse TV show.


This is spot on. Jeopardy is 100% a TV show. I also have to agree with you about University Challenge - you are doing well if you can answer a quarter of the questions. I have always wondered who the audience for this show is, as it is more like a sporting event than a quiz show.


All this would do is change the timing of when the buzzer gets pressed. All of the players would start to buzz in as soon as Alex started reading, knowing that he would finish the answer and they would have a chance to think before he finished reading it.


"as soon as possible"? Because I guarantee if you did that, you'd have people buzzing in as soon as Alex opened his mouth.

Being first to buzz is a huge part of the game, and no matter when you turn the buzzers on, there will be people trying to buzz at that moment.


The problem with this solution is it would then become the optimal strategy to simply buzz as soon as the answer is being read in the hopes you know the question once the host has finished reading the answer. That would make the game far less interesting for everyone involved.


Yep. The reason why Jeopardy works so well is because the contestants are all trivia gods, so they all know the answer most of the time. The audience thinks that the difficulty is in the trivia, because that's more challenging for us, but it's actually in the buzzer press.


You know how annoying it is when someone tries to finish your sentence and they are completely wrong? I imagine Jeopardy without the rule to be unwatchable.


You mean not even slightly annoying? Seriously, it turns out the degree to which this bothers people varies enormously. I'm not sure how much it's personality vs subculture, but in my branch of reality, someone completing my sentence (even if they're wrong) is a mark of enthusiasm and engagement.

My wife, and my neighbor's husband, feel like it's the height of rudeness.

This is kind of the canonical example people can understand for the platinum vs the golden rule, as it happens.


I have no idea what I read, maybe Flow, maybe Think and Grow Rich, that mentioned when you're in tune with a presenter/speaker, you'll often find yourself finishing their thoughts. I used to subscribe to that HEAVILY and constantly look for it. But over time, I find when I'm making a point and taking my time to ensure that what I am saying is clear and no extra parsing language is needed, I do not need, or want, a person audibly trying to be engaged. Especially when they're wrong. It actually isn't that big of a deal, we just all have our thing.


I'd contend that the "signal when permitted" minigame makes Jeopardy! more difficult, too, because the part of your brain that arrives at the answer must yield to the part of your brain that tenses for a stimulus. I definitely have to switch "modes" between pulling up deeply buried knowledge and rapid reaction "stance", and I noticed this when I played a well-made copy of the show's rules complete with the signaling system.

There's another side, too, which is if folks could signal early, reading fast would become an even stronger optimization. It already is, but at least everybody gets an equal chance the way it is (even if they got the answer fairly quickly from reading the clue while ignoring Trebek).


These days, Jeopardy shows the full question as an overlay graphic immediately, before the host even starts reading it. So the audience would always know the full question even if the contestants don't.

On the other hand, I wouldn't be surprised if the target demographic of Jeopardy is often doing something else and listening without actually watching.


> I understand that this might make Jeopardy less fun for the audience, but I feel it would be a better test of trivia instead of buzzer timing.

Other game shows work that way as well, and it has its own degree of fun for the audience. For instance, the contestant buzzes in and answers after a few words of the question, gets the answer wrong, the host is grinning like a possum, and after they read the next part of the question, it makes the contestant's answer even more hilariously wrong in context.


University Challenge in the UK is a great example of this.


And as a bonus, the host may grin like a possum when the answer is correct, too. That happens on trivia questions, such as on winners of the Eurovision Song Contest. Contestants are supposed to be British students, know their Shakespeare and recognize 13th century music from four notes played, even if they are studying, say, analytic chemistry. You may get a snide remark "Aren't you ashamed to know that?"


This is a TV show - the entire point is to make it fun for the audience. The audience is what brings in ad dollars, which pay not only the winnings of the contestants, but also the costs to run the show, and of course, the profit.


With Jeopardy questions, you'd buzz in at "This American architect..."


That's what I did in quiz bowl in high school. Calculated risk, but one that's likely to pay off.


Jeopardy! used to work like this. Players would often buzz in immediately, hoping they would know the answer to the question (sorry, the "question" for the "answer" [1]). Since Alex Trebek still finished reading the clue, and most of the players knew most of the answers most of the time, this was a winning strategy. The producers changed to the current design in 1985. [2]

[1]: This has always struck me as an idiotic gimmick, especially since the wording rarely makes sense. No one would ever respond to the question "Who was George Washington?" with the answer "This man served as the first President of the United States."

[2]: https://en.m.wikipedia.org/wiki/Jeopardy!#CITEREFTrebekBarso...


Couldn't you make it that you can buzz in anytime, but there's no advantage for buzzing first during the question reading (if two people buzz, randomly pick one)?


I think there's more to it than that.

I actually played Watson on the Jeopardy set they filmed the Watson episodes on. That "set" was just the auditorium here at IBM Research in NY, and they left the set intact for about six months after taping. Colleagues of mine hosted an academic conference in our building during that time.

Since Watson was on everyone's mind at that point (and we were literally looking at the set for it during people's talks), my colleagues asked the head systems researcher if he could give a special talk at the conference. He graciously accepted, and offered to do a full demonstration. There was only one volunteer from the rest of the conference - I think that if you're not a native English speaker, and not familiar with American pop culture and history, Jeopardy questions can be mystifying - so I gladly volunteered to play.

To answer the obvious question: I lost. Badly. In fact, not only did I lose, I lost a full-cheating version of Jeopardy where the audience (participants from the conference) shouted out answers, and I buzzed in using answers I liked. Not only that, but some of the answers that the audience was shouting were taken from Watson, looking at the readout it showed of candidate answers. (Some of the people were able to immediately recognize the correct answer, even though Watson was unsure of it. But keep in mind Watson came up with those answers.)

I would have lost to even a decent Jeopardy player, so losing to Watson is kind of overkill. But I think I learned a few things about playing Jeopardy, and how the human experience differs from Watson's experience.

I spent almost all of my cycles when the host was asking the question trying to figure out when he would stop talking. That is, I spent the entire question-asking period timing my buzz. I could not think of the answer and time my buzz simultaneously. The few times I thought I did know the answer, I did not know the answer when I buzzed in. I knew I knew the answer. The difference is that I did not have the answer in mind when buzzing, just that given a few seconds - which you get, by the basic protocols of the game - I could come up with the answer. I think other human players are the same: they spend the question-asking period timing their buzz, and are going by an intuitive feel for whether or not they know the answer. The actual answer recall does not happen until after they buzz in.

Watson is inverted. It only buzzes in when it is confident of its answer. In that sense, the game is not just about Watson winning the buzzer timing. Watson definitely could win every buzzer, but then it would lose the game; it's not accurate enough to get every question correct. So if Watson won every buzzer, it would answer incorrectly many times, and ultimately lose.

With that in mind, I hope it restores some of the romance. During the test run, the category was "Presidential Rhymes" with the clue "Barack's pack animals". Watson answered "Obama's llama's" immediately. Me and the other contestant just stared at each other, slack-jawed. I had seen it done on tv, but there was something far more immediate and impressive, since the head systems designer was ten feet from me, and the Power7 systems just a few rooms away.
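
To make the "only buzz when confident" point above concrete, here is a toy sketch. It is not Watson's actual strategy, which this thread does not describe. It assumes a correct response gains the clue value and a wrong one loses it, ignores opponents entirely, and uses made-up confidence numbers; `expected_gain` and `should_buzz` are hypothetical names.

```python
def expected_gain(confidence: float, clue_value: int) -> float:
    """Expected score change from answering: +value if right, -value if wrong."""
    return confidence * clue_value - (1.0 - confidence) * clue_value


def should_buzz(confidence: float, threshold: float = 0.5) -> bool:
    """Buzz only when the estimated confidence clears the threshold."""
    return confidence > threshold


# Made-up confidence estimates for five $400 clues (illustration only).
confidences = [0.95, 0.80, 0.30, 0.10, 0.60]
value = 400

# Strategy 1: win every buzzer and answer everything.
always_buzz = sum(expected_gain(c, value) for c in confidences)

# Strategy 2: answer only the clues whose confidence clears the threshold.
selective = sum(expected_gain(c, value) for c in confidences if should_buzz(c))

print(f"always buzz:   {always_buzz:+.0f}")
print(f"buzz if > 0.5: {selective:+.0f}")
```

Under these assumptions, answering everything gives back most of what the confident answers earn, which is the sense in which winning every buzzer would hurt.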


That was a fascinating read, thank you.

Can you expand on why the Obama response was so shocking, or maybe impressive? I'm confused about what its significance is.


It's not that different from other questions. It had to know that Barack meant Barack Obama because of the connection with the presidential category; it had to know that llamas are pack animals; and it had to know that "Obama" rhymes with "llama", and that a category with rhyme in the name probably needs a rhyming answer. I think that's extraordinarily impressive, but standard compared to what it does for other answers.

We were just shocked because it answered before we could even think the question through, and it happened in front of us. It's an irrational thing, but being physically present for the act makes it more impressive.



Good read, thanks.

Can you comment on further Watson developments since that time?


No, as I am a spectator to the technology. I work on other things. (Parallel and distributed systems and languages for online stream processing.)


Do you work with IBM InfoSphere Streams?


Indeed I do. My contact information is on my webpage, which is in my profile.


Yeah, it definitely wasn't as impressive as it seemed on the surface. A much better test of Watson's performance would have been QuizBowl, where contestants can buzz in while the question is still being read.


They also restricted the clues compared to what they would normally have - no audio/video Daily Doubles. I think there might have been some other restrictions, but I don't recall the exact rules. I think they didn't allow any of the "special" categories they sometimes have.

They also did some other tricks, like giving Watson a parsed ASCII stream of the clue, rather than making it see the screen and parse it out. I think it also got an electronic feed of the "you can buzz in now" signal rather than making it parse out the same light the contestants see.


So, to be fair, they should have added a delay to the "you can buzz in now" signal to match the slowness of the human visual processing system. Without that, if all players knew the answer, Watson won every time.

Also, the ASCII feed should have matched human reading speed.


Some detailed info about the timing and the buzzers: http://www.kurzweilai.net/the-buzzer-factor-did-watson-have-...


> And when Jennings went up against IBM’s Watson in a special competition, the computer’s biggest advantage was not its knowledge base but rather an optimized buzzer timing

So the computer's most effective weapon against Jennings was a 555 timer? That's cute.


Only because the rest was already optimized.


> Here’s how we figured this out. We assumed that Jennings headed into Final Jeopardy with at most one other player in contention to win. Then we assumed that in order for Jennings to lose, this sequence of events has to happen: The game must not be locked up, Jennings must get the Final Jeopardy question wrong, and the other contender must get it right. We used his stats across all 75 of his games, not just his 74 wins, to better reflect his overall skills. The probability he doesn’t have the game locked up is 13.3 percent, the probability he gets Final Jeopardy wrong is 32 percent, and the probability the other contender gets it right is 49 percent. When we combine these probabilities, we see that Jennings only has about a 2.1 percent chance of losing.

This is incorrect and overstates his dominance because (1) it assumes Jennings will be ahead, something that won't always happen against Rutter/Jennings/Collins/Chu/Craig level players and (2) the lock up and opponent knowing Final Jeopardy conditions are not independent.

Pretty sloppy basic probability by 538, and it makes me think I'm not missing much by not subscribing (they didn't offer full RSS last I checked).
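
As a rough illustration of the arithmetic being criticized, the sketch below multiplies the three quoted figures as if they were independent, then shows how the result moves if the opponent's chance of a correct response is higher in games that are not locked up. The 0.65 conditional figure is invented purely to show the direction of the effect. It is not from the article or from Jennings's record, and `loss_probability` is a hypothetical helper name.

```python
# Figures quoted from the article (computed over all 75 of Jennings's games).
P_NOT_LOCKED = 0.133      # game is not locked up going into Final Jeopardy
P_JENNINGS_WRONG = 0.32   # Jennings misses Final Jeopardy
P_OPPONENT_RIGHT = 0.49   # the other contender gets Final Jeopardy right


def loss_probability(p_not_locked: float, p_wrong: float, p_opp_right: float) -> float:
    """Chance of a loss when all three events must happen, multiplied together."""
    return p_not_locked * p_wrong * p_opp_right


# The article's calculation: treat the three events as independent.
independent = loss_probability(P_NOT_LOCKED, P_JENNINGS_WRONG, P_OPPONENT_RIGHT)

# HYPOTHETICAL: opponents who keep the game close may be stronger, so use a
# made-up conditional probability P(opponent right | not locked) = 0.65.
P_OPPONENT_RIGHT_GIVEN_NOT_LOCKED = 0.65
dependent = loss_probability(
    P_NOT_LOCKED, P_JENNINGS_WRONG, P_OPPONENT_RIGHT_GIVEN_NOT_LOCKED
)

print(f"independent estimate: {independent:.4f}")
print(f"with dependence:      {dependent:.4f}")
print(
    "expected wins before a loss: "
    f"{(1 - independent) / independent:.1f} vs {(1 - dependent) / dependent:.1f}"
)
```

Even a modest dependence of this kind raises the per-game loss probability and shortens the expected streak, which is the sense in which the independence assumption could overstate dominance.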


When you say that they're not independent, do you mean that the probability that the person knows the Final Jeopardy question is higher because they've managed to keep Ken Jennings from running away with it (i.e. they're better than the average person Jennings has faced)?

Just checking my understanding.


I'm convinced that not only do we stand on the shoulders of giants, but we live among giants. What Ken Jennings accomplished is remarkable by any interpretation. How he did so probably even he can't explain.


Unless you're a computer.


Ken Jennings' 74 win streak is because...?

A. He cheated B. He's lucky C. He's a genius D. It's written


Seems harsh that so many people downvoted this, I dig the Slumdog Millionaire reference!


I believe that is because many people don't want the comment section on Hacker News to turn into the one on Reddit, which consists mostly of jokes, pop culture references...things that don't really contribute to the discussion.


Got it, thanks!


> things that don't really contribute to the discussion.

There may be a diversity in palate for this matter, even among the same people at different times.



