I could have torn off my shirt, kicked over the monitor and yelled about where we were, but I probably would have lost my job. And I like my job.
I think most of the reluctance came from the fact that very few of the conference attendees were Americans and native English speakers. I'm terrible at Jeopardy, but I can still recognize correct answers a decent amount of the time. Since so many of the answers involve wordplay as well as American pop culture and history, some of my colleagues said they didn't even understand the correct answers when they were said. And these are very smart people.
I never knew the answer before buzzing in. Timing the buzz and thinking about what the answer is at the same time is really, really hard. Hard to the point that no one really does it - good players don't buzz in when they know the answer, they buzz in when they know they know the answer. That is, they buzz in when they know they can figure out the answer. Then they have a few seconds after a successful buzz to come up with the correct answer. I did this on the one answer I got completely on my own: I recognized I could figure it out, won the buzz, and thought it through afterwards. Watson doesn't do this: it only buzzes in when it is confident of the answer. There were several times during the game when it was not confident enough in its answer to buzz in.
Keep in mind that high-level Jeopardy play is all about the buzzer. When good players are playing against each other, they all know the answers. It's a game of who can buzz in first. Making Watson have human reaction time would actually not work. Human reaction time is about 250 milliseconds. According to Epstein, the best players can buzz in 20 milliseconds after they're allowed to. How is that possible? They anticipate when they will be able to buzz in based on how close Trebek is to the end of stating the question - their brain says "go" before they're allowed to buzz in. In order for Watson to play like that, it would need to do the same thing humans do: anticipate when it will be allowed to buzz in based on what is being said, and schedule the buzz accordingly. But that was outside the scope of the Watson project; it does not do any speech processing. It's possible to do, of course, and I would be interested in seeing it, but it would require a lot more work.
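To make the timing argument concrete, here is a rough sketch of the arithmetic. It is only an illustration: the function names and the "prediction error" parameter are my own assumptions, not anything from the Watson project or the real buzzer system. The 250 ms figure is the reaction time quoted above.

```python
HUMAN_REACTION_MS = 250  # typical human reaction time quoted in the text


def reactive_buzz_delay(reaction_ms=HUMAN_REACTION_MS):
    """Delay after the buzzer opens for a player who waits for the signal."""
    return reaction_ms


def anticipated_buzz_delay(prediction_error_ms, reaction_ms=HUMAN_REACTION_MS):
    """Delay after the buzzer opens for a player who commits early.

    The player says "go" reaction_ms before the moment they predict the
    buzzer will open. prediction_error_ms is how late (+) or early (-)
    that prediction turns out to be (a hypothetical parameter).
    A negative result means the press landed before the buzzer opened.
    """
    go_time = prediction_error_ms - reaction_ms  # relative to the buzzer opening
    return go_time + reaction_ms  # the press lands reaction_ms after "go"


# A player whose prediction is 20 ms late beats one who simply reacts.
assert anticipated_buzz_delay(20) < reactive_buzz_delay()
```

The point of the sketch is that the reaction time cancels out for the anticipating player: what matters is how well they predict the end of the question, not how fast they react.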
Watson and humans are doing different things. Humans spend the question-statement time figuring out when to buzz in, and then use the few seconds between a successful buzz and being asked to respond to figure out the answer. Watson spends the question-statement time figuring out the answer - it received the question in electronic text form - and only buzzes in when it is confident in its answer.
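The two strategies can be sketched as two different buzz decisions. This is a minimal illustration under my own assumptions: the function names and the 0.5 threshold are hypothetical, and the real DeepQA buzz logic is certainly more involved.

```python
def watson_style_should_buzz(confidence, threshold=0.5):
    """Buzz only if the answer is already computed and confident enough.

    confidence is the estimated probability (0.0 to 1.0) that the top
    answer is correct; threshold is an assumed cutoff, not Watson's.
    """
    return confidence >= threshold


def human_style_should_buzz(can_probably_figure_it_out):
    """Buzz on the feeling that the answer is reachable.

    The actual answer gets worked out in the few seconds after winning
    the buzz, so no finished answer is needed at buzz time.
    """
    return bool(can_probably_figure_it_out)


# Watson sits out a clue it is unsure about, even with a candidate in hand.
print(watson_style_should_buzz(0.32))   # False
print(watson_style_should_buzz(0.91))   # True
print(human_style_should_buzz(True))    # True
```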
A lot of the algorithm details Epstein talked about are in this paper: Building Watson: An Overview of the DeepQA Project.
They have done preliminary tests with medicine, and it's not quite there yet. But I think it will get there. We saw the confidence Watson had in answers on TV, but we did not see the evidence it used to arrive at those answers and that confidence. Used in a real setting, it can show that. It can give a list of possible answers, its confidences, and the evidence it used. Medicine is the first domain they're looking at, and I think it could be very useful. Because it gives its evidence, humans can always reason through it and see if they agree. If they do, great. If they don't, they ignore it.
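To picture what "answers, confidences, and evidence" might look like to a user, here is a small sketch. Everything in it is hypothetical: the Candidate class, the field names, and the placeholder data are mine, and it says nothing about how Watson actually stores or scores anything.

```python
from dataclasses import dataclass, field


@dataclass
class Candidate:
    answer: str
    confidence: float                              # 0.0 to 1.0
    evidence: list = field(default_factory=list)   # supporting snippets


def rank(candidates):
    """Return candidates ordered from most to least confident."""
    return sorted(candidates, key=lambda c: c.confidence, reverse=True)


def report(candidates):
    """Format each answer with its confidence and the evidence behind it."""
    lines = []
    for c in rank(candidates):
        lines.append(f"{c.answer} ({c.confidence:.0%})")
        for item in c.evidence:
            lines.append(f"  - {item}")
    return "\n".join(lines)


# Placeholder data only, to show the shape of the output.
candidates = [
    Candidate("answer B", 0.20, ["passage 3 mentions B"]),
    Candidate("answer A", 0.70, ["passage 1 supports A", "passage 2 supports A"]),
]
print(report(candidates))
```

The useful part is the evidence list: a person reading the report can check the supporting passages and decide whether to trust the top answer or ignore it.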