Watson vs. Humans: A Jeopardy Champ’s View

As an olde-time Jeopardy champ, I have a few choice thoughts on next week’s big contest pitting IBM’s computer Watson against two humans, former champs Ken Jennings and Brad Rutter.

I am also ready to predict the winner.

Let’s get to the thoughts. 

1. Yes, a computer can beat a human at Jeopardy!  Let’s not pretend it’s even a question any more. If it doesn’t happen here, it will happen a year from now, or three. That’s progress, wonderful progress. Way to go, IBM!

2. Don’t forget Brad Rutter. Ken Jennings gets all the press, and he deserves it. He’s a nice guy, too.

But inside Jeopardy! circles, Brad Rutter is just as respected a player. He beat Jennings head-to-head in the finals of the Ultimate Tournament of Champions in 2005, after all.

Rutter has also earned more at Jeopardy! than Jennings or anyone else: $3,255,102 and two Camaros. (For a few years they gave out cars to five-time champs.) Jennings is wayyy back at $3,022,700 and zero Camaros.

The show is pitching them as “Jeopardy’s two greatest champions,” which seems about right. In any case, don’t think this contest is just Watson vs. Jennings.

3. How will Watson handle Final Jeopardy? This seems like a place where the humans have the edge. 

Watson can do the wagering math, obvs.  But how will he be programmed to strategize in a two-game contest where scores are added together? Presumably his programming includes a chess-like program that can run millions of possible betting outcomes based on current scores, questions remaining, and so on. (Stephen Baker, author of an upcoming book on the Watson project, says the betting strategies were “one of the easier things” to teach the computer.)

But here’s the rub: humans act crazy in Final Jeopardy, and I’m not sure Watson will be able to account for that.  They may bet it all. They may bet zero when they should bet it all.

During the 1995 Tournament of Champions I can remember pondering, the night before, how to bet at the end of day one in a two-day match. I decided that only two kinds of bets made sense: very small or very large. If you bet a medium amount and got it wrong, you would effectively kill your chances on day two anyway, so that made no sense. If you’re going to go that far, you should go all the way and so win more if you get it right.  Bet small or bet it all, but don’t bet medium.

The next day, in the actual event, I naturally bet medium.  $3003 out of my $7100. In fact, all three of us bet medium that day. I got it right (thank you, Hippocrates!) and it turned the match.

My analysis was sound, but logic didn’t prevail in the moment — I went with my gut. (There’s a notion that launched a thousand Star Trek episodes.) Watson should have an edge with logic, especially in a two-day match, but I think it’s likely the humans end up confounding him with crazy betting instead.
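For the code-minded, here’s that night-before reasoning as a few lines of Python, using my real numbers (a $7100 score and the $3003 bet). Two things in it are made up purely for illustration: the $100 “small” bet, and a hypothetical $5000 “contention floor” below which I assume day two is a lost cause. Treat it as a sketch of the argument, not a real wagering model.

```python
def final_scores(score, bet):
    """Return (score if right, score if wrong) for a Final Jeopardy bet."""
    return score + bet, score - bet


def compare_bets(score, bets, contention_floor):
    """For each bet, list both outcomes and whether a miss keeps you alive."""
    rows = []
    for label, bet in bets.items():
        right, wrong = final_scores(score, bet)
        still_alive = wrong >= contention_floor
        rows.append((label, bet, right, wrong, still_alive))
    return rows


DAY_ONE_SCORE = 7100          # from the 1995 match described above
CONTENTION_FLOOR = 5000       # hypothetical: below this, day two is assumed lost
BETS = {
    "small": 100,             # hypothetical small bet
    "medium": 3003,           # the bet actually made
    "all": 7100,              # bet it all
}

if __name__ == "__main__":
    for label, bet, right, wrong, alive in compare_bets(
        DAY_ONE_SCORE, BETS, CONTENTION_FLOOR
    ):
        print(f"{label:>6}: bet ${bet}, right -> ${right}, "
              f"wrong -> ${wrong}, still alive if wrong: {alive}")
```

With that assumed floor, a wrong medium bet and a wrong all-in bet both leave you out of it, but the all-in bet pays $14,200 instead of $10,103 when you’re right. That’s the whole case against betting medium.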

Final Jeopardy answers also tend to be quirkier, more word-playish, and just plain weirder than regular answers. Watson will have more time to think about them, but he’ll have to think harder and the penalty for getting it wrong will be much higher.  He will also be forced to answer — he won’t be able to lay out if his confidence is low.  I say advantage: humans.

4. Watson may be better at the bottom of the board.  $2000 answers on Jeopardy! are “hard” for humans because the topics are typically farther from our mainstream.  We don’t talk about Imre Nagy and the Hungarian Revolution of 1956 on dinner dates very much, so he’s harder to remember. 

But Watson doesn’t go on dinner dates and his huge memory banks don’t “remember” things just because they’re common knowledge. Imre Nagy should be roughly as accessible to him as the capital of Colorado. (Detroit.) If so, that’s a big edge.

Jennings and Rutter are stone-cold trivia killers themselves, of course. If these are “standard” game boards, not tricked out for the occasion, they’ll be better-than-average at the bottom of the board, too. But I say: slight edge to Watson.

5. The buzzer is the elephant in the room. The buzzer will play a colossal hidden role in this match, as it always does, though most people at home don’t realize it.  And this may be a place where Watson has the edge.

A little backstory on how Jeopardy! contestants answer: Back in the Art Fleming days of Jeopardy!, contestants could buzz in as soon as they thought they knew an answer. At the buzz, Art would stop reading the clue aloud and the contestant would have five seconds to get the question.

It didn’t take contestants long to realize that if you knew a category well, you could buzz in as soon as the clue was revealed, read it super-fast to yourself, and then answer before five seconds ran out. A little risky, but worth it.

Jeopardy! eventually put the kibosh on this. Not because it wasn’t fair, but because it made things choppy and frustrating for home viewers. They wouldn’t get to hear the whole clue, couldn’t read it fast enough, and wouldn’t get time to guess for themselves before the contestant blurted it out.

Home participation is essential to the success of any game show, so this wouldn’t do.

The Jeopardy! solution was simple enough: contestants today can ring in only after Alex Trebek has read the entire clue aloud. In the studio — and not visible at home — is a special light that comes on as Alex finishes reading and alerts contestants that the buzzers are now “armed.” Once the light comes on, the first one to buzz wins the right to speak. 

Contestants who buzz in before the light comes on are punished by being locked out for a crucial split-second, which hurts.

Every contestant develops his own system for buzzing in. There’s a certain amount of anticipating that goes on, the way sprinters in the 100-meter dash try to “jump the gun” without actually triggering a false start.  Some contestants use thumbs, others (like Cheech Marin!) swear that fingers are faster. And so on. But everyone has to deal with the human frustrations of trying to beat the buzzer.

Now enter Watson.

IBM has posted a nice clear explanation of how the computer buzzes in.

“At exactly the moment the ‘Buzzer Enable’ light is activated, Watson’s system receives a signal that the buzzer is open… If his confidence is high enough, Watson may decide to buzz in. To do this, Watson sends a signal to a mechanical thumb, which is mounted on exactly the same type of Jeopardy! buzzer used by human contestants. Just like Ken and Brad, Watson must physically depress a button to buzz in…

The best human contestants don’t wait for, but instead anticipate when Trebek will finish reading a clue. They time their ‘buzz’ for the instant when the last word leaves Trebek’s mouth and the ‘Buzzer Enable’ light turns on. Watson cannot anticipate. He can only react to the enable signal. While Watson reacts at an impressive speed, humans can and do buzz in faster than his best possible reaction time.”

I’m not really buying this explanation.  Sure, humans can anticipate and get lucky once in a while, but that’s not a winning strategy. You just can’t “guess right” often enough.

The real question is how a computer fares against humans when both wait for the light to come on. And in that case, the computer seems to have a big edge. With his nearly speed-of-light circuits, Watson should be faster to receive the “buzzers armed” message and then faster to react and press the button. Human retinas and synapses and muscles are fast, but not as fast as computer circuits.

Emotionless Watson also won’t get frustrated when he gets beat to the buzzer, won’t get jumpy, and so on. To go back to the 100-meter dash analogy: who’s going to win the most races if one racer has the fastest possible start every time, while the rest of the field has a mix of superfast starts, normal starts, and false starts?
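Here’s a back-of-the-envelope Python sketch of that race. Every number in it is my own assumption, not anything published by IBM or Jeopardy!: a machine that always reacts 10 milliseconds after the light, humans who aim for the light itself with a 60-millisecond spread in their timing, a quarter-second lockout for buzzing early, and three players who all know every answer. It only illustrates the shape of the argument.

```python
import random

# Every constant here is an illustrative assumption, not a measured value.
MACHINE_REACTION_MS = 10.0   # assumed fixed delay after the light comes on
HUMAN_TIMING_SD_MS = 60.0    # assumed spread of a human aiming at the light
LOCKOUT_MS = 250.0           # assumed penalty for buzzing before the light


def human_buzz_time(rng):
    """Time (ms after the light) at which a human's buzz finally counts."""
    press = rng.gauss(0.0, HUMAN_TIMING_SD_MS)  # human aims at the light itself
    if press < 0:
        # Buzzed early: locked out, so the earliest valid buzz is after the penalty.
        return LOCKOUT_MS
    return press


def simulate_buzzer_race(n_clues=10_000, seed=0):
    """Count who buzzes first when all three players want every clue."""
    rng = random.Random(seed)
    wins = {"machine": 0, "human_a": 0, "human_b": 0}
    for _ in range(n_clues):
        times = {
            "machine": MACHINE_REACTION_MS,
            "human_a": human_buzz_time(rng),
            "human_b": human_buzz_time(rng),
        }
        winner = min(times, key=times.get)  # smallest valid buzz time wins
        wins[winner] += 1
    return wins


if __name__ == "__main__":
    print(simulate_buzzer_race())
```

Under these made-up numbers the machine takes most of the races, because a human wins only by landing in the sliver of time between the light and the machine’s reaction. Widen that sliver or tighten the human timing and the picture shifts, which is exactly why the real reaction times matter so much.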

Perhaps the Jeopardy! and IBM teams have built in some kind of human-like delay to solve this problem. But I haven’t seen it. In fact, Engadget has posted video of a practice round last month that seems to indicate the opposite.  

The in-studio screen shows Watson’s confidence in his answers (starting at 0:47 and again at 2:35 — very cool!). In this test game, every time the bar is green, indicating confidence high enough to answer, Watson gets in first.

There are times when he’s not confident and doesn’t answer, and times when we don’t see the confidence level, but every time we do see Watson at green confidence, he buzzes in first.

That’s a very tiny sample, yes. They could also have cherry-picked this moment to show us. But the buzzer is a huge part of the game, and it sure seems like Watson will have an edge there. 

Enough of an edge to maybe give him the win? Sure.

The buzzer, unfair? Hey, that’s Jeopardy!

6. The Prediction

I have no idea how Jennings and Rutter will do against each other. But as for humans vs. computer, here’s how I see this three-day, two-game match playing out.

I see Watson winning the most buzzes because of his electronic edge. Let’s call it a three-way split of 41 / 32 / 27 percent overall. I see Watson winning a bit more heavily lower on the board.

I see Watson leading both days going into Final Jeopardy.  

And finally, I see either Rutter or Jennings winning the match through the all-too-human stratagems of crazy aggressive betting and random moments of genius that turn the tide. Watson will struggle with Final Jeopardy. I predict final three-day tallies of $49,800 for the human winner, $36,000 for “second human,” and Watson in third with $28,300.

Better luck next time, Hal!

[ Update: Edited to reflect the correct format of two games over three days. Thanks, Robert K. Schmidt! ]


