Did anyone else check out the first part of the "humans vs a computer" stunt yesterday?
During the first set of questions, I thought this was the greatest advertisement for IBM ever. (Even better than Detlef Schrempf.) Watson was smoking the past champions badly. It didn't look like it'd be a fight at all. There were a lot more chinks in the armor in the second segment and Brad came back to tie him. It's a game.
All the talk was about understanding human wordplay and figuring out clues, but a lot of Jeopardy always comes down to anticipating the end of the clue and pressing the button at the earliest possible instant. If you didn't watch, Watson does have to press the same button as the humans, but it's not shown on camera. Also, he's reading the clue from a text file, not actually hearing Trebek, which makes the timing trickier. He was beating the humans to the buzzer early, but they got on runs later.
Watson almost always knew the right answer, but it was surprising how completely off its second and third choices were. For a category where the answer is a decade (number), the second/third could be random words. It also appeared Watson could not hear/read the responses of his opponents, because he repeated a guess once.
I wish they would've gotten to the game a bit quicker; I probably would watch a special about the computer (and it seems they might have done one), but I was there for the answers in the form of a question. Things should pick up for the next two days.
That is a lot of computer right there. The shot of Trebek in the server room made me think someone will use that shot in about 10 years to show how "All of this computing power now fits in a device the size of a USB stick."
Definitely was intrigued in the second half of the first round when Watson could not determine a clear answer from its three highest scores. I recall seeing some of the answers resulted in low scores for all three question options Watson listed.
I will be interested to see if any strategies emerge as the game progresses. Looked like the guys were going for the high $ questions early on, couldn't necessarily detect a strategy for Watson yet.
"As you may have read in Robert Parker's Wine Newsletter, 'Donaghy Estates tastes like the urine of Satan, after a hefty portion of asparagus.'" Jack Donaghy, 30 Rock
Loved the actual competition, hated that there was so little of it. If you're a regular Jeopardy viewer, you've seen a lot of the development stuff already. (Though, clearly they knew the audience would be made up mostly of people who WEREN'T regular viewers.)
I still think it's fishy that Watson found the Daily Double right away. There's no money advantage to it, but it does take the DD away from his opponents.
I was really boggled by Watson's thinking on some questions.
I thought Watson would destroy the Decades and Alternate Meanings categories, because there is really no cute contextual trickery. They were just straightforward, fact-based clues.
Also, I don't believe Watson buzzed in on the Voldemort clue, but its first choice was Harry Potter. It looked like it was doing simple word association. Made me wonder if Watson was fed the category names?
I wish Jeopardy would do more things like this and drop the dumbass college weeks. When Ken Jennings was dominating, they would interrupt the runs for weeks at a time.
Originally posted by thecubsfan Watson almost always knew the right answer, but it was surprising how completely off its second and third choices were. For a category where the answer is a decade (number), the second/third could be random words. It also appeared Watson could not hear/read the responses of his opponents, because he repeated a guess once.
Well, his guess and Ken's were slightly different. Ken said "the twenties", Watson said "the nineteen twenties". We know that that's the same thing, but perhaps Watson doesn't. We shall see.
I also wonder whether Watson can be asked to be more specific. When he said "leg", a "more specific" prompt seemed appropriate.
Originally posted by thecubsfan It also appeared Watson could not hear/read the responses of his opponents, because he repeated a guess once.
I read somewhere (can't find it at the moment) that Watson does not factor in answers given by its opponents. It is a feature the IBM folks are hoping to incorporate at a later date.
Edit: This isn't the article I initially read (arstechnica.com), but it does discuss this issue:
During a commercial after Watson's decade gaffe, Welty noted that the team thought the ability to process other players' wrong answers would be unnecessary. "We just didn't think it would ever happen," Welty said, laughing.
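For what it's worth, here is a rough sketch of what "factoring in" an opponent's wrong answer might look like, including treating "the twenties" and "the nineteen twenties" as the same thing. This is purely my own illustration, not how Watson works: the function names, the normalization rules, and the confidence numbers are all made up.

```python
import re

def normalize(response):
    """Reduce a response to a rough canonical form (hypothetical rules)."""
    text = response.lower().strip().rstrip("?").strip()
    text = re.sub(r"^(what|who) (is|are) ", "", text)  # drop the question framing
    text = re.sub(r"^the ", "", text)                  # drop a leading article
    text = text.replace("nineteen ", "")               # "nineteen twenties" -> "twenties"
    return text

def pick_response(candidates, ruled_wrong):
    """Return the highest-confidence candidate not already ruled wrong.

    candidates: list of (answer, confidence) pairs (made-up values below).
    ruled_wrong: responses from other players that the host rejected.
    """
    rejected = {normalize(r) for r in ruled_wrong}
    for answer, _conf in sorted(candidates, key=lambda c: c[1], reverse=True):
        if normalize(answer) not in rejected:
            return answer
    return None  # nothing left worth buzzing in on

# Made-up confidences, for illustration only.
candidates = [("the nineteen twenties", 0.6), ("the nineteen tens", 0.3)]
print(pick_response(candidates, ruled_wrong=["What are the twenties?"]))
```

With the opponent's miss passed in, the sketch skips its top choice and falls through to the next one; with an empty ruled_wrong list it would repeat the rejected answer, which is basically what we saw on the show.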
We seriously need Teapot's response on this. I saw where all the past players played against Watson in the practice. As a 5 time winner, I wonder if Teapot was involved, and whether he just can't say or something until after tomorrow.
Clearly, Watson kicked it in today.
We'll be back right after order has been restored here in the Omni Center.
That the universe was formed by a fortuitous concourse of atoms, I will no more believe than that the accidental jumbling of the alphabet would fall into a most ingenious treatise of philosophy - Swift
Was I the only one who was somewhat freaked out when Watson said "I'm not sure, but I'm going to guess..." on the Daily Double question? I know that's probably just what they programmed it to say when none of its top answers reach the threshold of certainty, but it was just a weird thing to hear from a computer.
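If my guess about a certainty threshold is right, the logic could be as simple as the sketch below. To be clear, this is only an assumption on my part: the function name, the 0.5 threshold, and the sample scores are invented, and I have no idea what IBM's real rule is.

```python
def daily_double_response(candidates, threshold=0.5):
    """Phrase a forced response based on the top candidate's confidence.

    candidates: list of (answer, confidence) pairs; threshold is hypothetical.
    On a Daily Double the player has to answer, so there is no "stay silent" branch.
    """
    best_answer, best_conf = max(candidates, key=lambda pair: pair[1])
    if best_conf >= threshold:
        return f"What is {best_answer}?"
    # No candidate cleared the bar, so hedge out loud.
    return f"I'm not sure, but I'm going to guess: what is {best_answer}?"

# Invented scores, for illustration only.
print(daily_double_response([("answer A", 0.32), ("answer B", 0.20)]))
print(daily_double_response([("answer A", 0.85), ("answer B", 0.05)]))
```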
"The object of persecution is persecution. The object of torture is torture. The object of power is power. Now do you begin to understand me?"
Originally posted by AWArulz We seriously need Teapot's response on this. I saw where all the past players played against Watson in the practice. As a 5 time winner, I wonder if Teapot was involved, and whether he just can't say or something until after tomorrow.
Clearly, Watson kicked it in today.
The games they used to tune the machine's performance were played against progressively higher-caliber opponents. Originally, they used IBM employees, and when they enlisted Sony's participation, they used losing contestants, one-day champions, multi-day champs, and so on, with the final tuning matches using ToC semifinalists.
I imagine those who played in the tuning matches had to sign all kinds of NDAs.
Originally posted by Peter the Hegemon I also wonder whether Watson can be asked to be more specific.
It can. I have heard that Alex originally accepted Watson's response of "leg" but was overruled by the judges, and they retaped that part of the episode. (Something similar happened in a game I played.)
Originally posted by SchippeWreck I still think it's fishy that Watson found the Daily Double right away. There's no money advantage to it, but it does take the DD away from his opponents.
That's actually the most likely box for a Daily Double on the board. Neutralizing the DDs is actually pretty smart strategy for somebody who's pretty sure they can win the buzzer war (and, as we've seen, that certainly applies for Watson).
Originally posted by thecubsfan That was weird. The wager amounts were amusing. But it turned into a thrashing today.
It certainly did, to the point where I wonder if the 3-day schedule was devised after the taping so that people who watched the first day wouldn't think it was a total blowout and give up on it.
Gotta wonder about the Final J! answer, though. It seems to me that it would be fairly easy to program for category names--if the category is "US Cities", eliminate any answer that is a non-US city. (Although note that the category COULD be US Cities and have an answer that is something other than a city name.) I wonder if they intentionally did not program that--I guess that must be the case. The program is pretty damn impressive, but it clearly has some gaps--for now.
Originally posted by A Smarter Planet blog How could the machine have been so wrong? David Ferrucci, the manager of the Watson project at IBM Research, explained during a viewing of the show on Monday morning that several things probably confused Watson. First, the category names on Jeopardy! are tricky. The answers often do not exactly fit the category. Watson, in his training phase, learned that categories only weakly suggest the kind of answer that is expected, and, therefore, the machine downgrades their significance. The way the language was parsed provided an advantage for the humans and a disadvantage for Watson, as well. “What US city” wasn’t in the question. If it had been, Watson would have given US cities much more weight as it searched for the answer. Adding to the confusion for Watson, there are cities named Toronto in the United States and the Toronto in Canada has an American League baseball team. It probably picked up those facts from the written material it has digested. Also, the machine didn’t find much evidence to connect either city’s airport to World War II. (Chicago was a very close second on Watson’s list of possible answers.) So this is just one of those situations that’s a snap for a reasonably knowledgeable human but a true brain teaser for the machine.
Also, Watson's potential approach to the question illustrated how different a human's approach might be. I started out by trying to name cities with two airports, and then disqualifying the names of airports that didn't fit the clue. I got to an answer quickly that way.
Also, as a viewer, let me tell you how stupid I feel when my answer guesses aren't even one of the top three Watson considers. Now I'm three times as wrong.
They showed Watson's clicker trigger, and I wonder how likely it is that such successful competitors might lock themselves out by buzzing too early. And let me again state how much I despise that rule.
"To be the man, you gotta beat demands." -- The Lovely Mrs. Tracker
I must say I'm unimpressed by Watson, because the company where my wife works has a very similar product, but at a fraction of Watson's cost to build, and free for anyone to use.
Originally posted by A Smarter Planet blog How could the machine have been so wrong? David Ferrucci, the manager of the Watson project at IBM Research, explained during a viewing of the show on Monday morning that several things probably confused Watson. First, the category names on Jeopardy! are tricky. The answers often do not exactly fit the category. Watson, in his training phase, learned that categories only weakly suggest the kind of answer that is expected, and, therefore, the machine downgrades their significance. The way the language was parsed provided an advantage for the humans and a disadvantage for Watson, as well. “What US city” wasn’t in the question. If it had been, Watson would have given US cities much more weight as it searched for the answer. Adding to the confusion for Watson, there are cities named Toronto in the United States and the Toronto in Canada has an American League baseball team. It probably picked up those facts from the written material it has digested. Also, the machine didn’t find much evidence to connect either city’s airport to World War II. (Chicago was a very close second on Watson’s list of possible answers.) So this is just one of those situations that’s a snap for a reasonably knowledgeable human but a true brain teaser for the machine.
The thing is, the Final Jeopardy categories are NEVER tricky. They are straightforward: U.S. Cities, 20th Century Literature, Presidential Cabinet Members, etc. It seems like Watson should be programmed to give more weight to the Final Jeopardy category.
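To show what I mean by "more weight", here is a toy sketch. It is not IBM's scoring: the function, the weights, and the base scores are all invented. The only thing taken from the blog post is that Toronto was on top and Chicago was a close second. The idea is a soft bonus for answers that fit the category, not a hard filter, with the bonus turned up for Final Jeopardy.

```python
def rerank(candidates, fits_category, category_weight):
    """Re-rank answers after adding a bonus to those that fit the category.

    candidates: dict of answer -> base score (invented numbers below).
    fits_category: set of answers judged to match the category name.
    category_weight: bonus size; near zero means the category is mostly ignored.
    """
    scored = {}
    for answer, base in candidates.items():
        bonus = category_weight if answer in fits_category else 0.0
        scored[answer] = base + bonus
    return sorted(scored, key=scored.get, reverse=True)

# Invented base scores; the real ones were not published in this thread.
candidates = {"Toronto": 0.14, "Chicago": 0.11}
us_cities = {"Chicago"}

print(rerank(candidates, us_cities, category_weight=0.0))  # category ignored
print(rerank(candidates, us_cities, category_weight=0.1))  # Final Jeopardy: trust the category more
```

With the weight at zero, Toronto stays on top; with even a modest bonus, Chicago moves ahead. A soft bonus like this also leaves room for the odd category whose answer isn't literally a city name.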
(edited by SchippeWreck on 16.2.11 1156) "It's magic! We don't need to explain it!"