AlphaGo

While I have been reading instead of playing, the most exciting news in computer gaming has been Go. Chess-playing computer programs have gradually moved from “plays a standard game pretty well” to “almost competitive with a good human” to “consistently beats world champions,” reaching the end of that progression about a decade ago. Go, by contrast, has long been held out as a game at which computers will have trouble making gains: the search space for a 19×19 board is huge, moves have long-term consequences that make evaluating individual moves difficult, and play has generally been seen as more intuitive and so less open to computational brute force.

A year ago, the best Go program was competitive against a good amateur player. In the last six months, Google DeepMind’s AlphaGo has leaped to “best in the world,” beating the European champion 5-0 and now beating the world champion 3-0 with two games to go.

[Image: frame from a manga. Two young men face a computer. One says, "But they say it'll be another hundred years before a computer can beat a human at Go." The one at the keyboard replies, "I don't need a hundred years."]

There are three things I would like to note here. First, the speed of that jump is ridiculous. Go has long been one of those “at least a decade away” computing problems, like the ones that have been forecast as “20-30 years away” for the last 20-30 years and are still 20-30 years away today. AlphaGo is the first computer program to beat a professional player without a handicap, and then it went on to beat the world champion. That is going from “can’t beat a professional player” to “beats the top professional player” in one step. This is not the gradual progress we saw with computers and chess over decades; this is an escalation in power levels that would make anime blush.

Second, this is not simply a matter of Google having massive computing power to throw at the problem. The chess world champions play on supercomputers and evaluate trillions of positions per second. The world champion version of AlphaGo uses a distributed computing network, but DeepMind also has a single-computer version that beats the distributed version about a quarter of the time. We will see if the human world champion gets one win in the series, but this suggests that a much less powerful version of AlphaGo would still be a top player.

Third, and what I find most interesting: humans seem to be fairly bad at evaluating how AlphaGo is doing. AlphaGo optimizes for probability of winning, not its current score or a projected score at the end. So the human analysts are commenting on how the computer seems to be making mistakes, that it is not capturing territory, and oh look gg the computer has somehow gotten itself into an unassailable position. One of the reasons computers have been bad at Go is that a single move now can have subtle implications 50 moves later; AlphaGo has made the jump to where its subtle moves look like mistakes to observers until it wins. It is probably not the case that the computer was playing a close game and pulled ahead in the late game. It seems more likely that the computer was steadily pulling ahead but in a way that is not obvious until the late game. Here is Eliezer Yudkowsky exploring this point at length. Bonus thought: human commentators were probably assuming that AlphaGo would lose, so odd-looking moves were probably mistakes rather than subtle brilliance; in light of consistent wins, I am curious if the human commentators will now look more closely at its moves for hidden strengths, rather than starting with the frame “this is another lousy computer Go program.”
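For anyone who wants the "probability of winning, not score" distinction spelled out, here is a toy sketch. This is emphatically not how AlphaGo works internally; the move names and the final-score margins are numbers I made up, standing in for whatever a real program would get from simulating the rest of the game. It only shows how the two objectives can disagree about the same position.

```python
from statistics import mean

# Hypothetical data: for each candidate move, the final score margins
# (positive = we win) from eight imagined playouts of the rest of the game.
# These numbers are invented for illustration only.
CANDIDATE_MOVES = {
    "grab_territory": [30.5, 25.5, 20.5, -1.5, -2.5, -0.5, 28.5, -3.5],
    "solid_defense": [1.5, 2.5, 0.5, 1.5, 3.5, 0.5, 2.5, -0.5],
}


def expected_margin(margins):
    """Average final score margin: what a 'maximize my score' player wants."""
    return mean(margins)


def win_probability(margins):
    """Fraction of playouts won, no matter how large the margin was."""
    return sum(1 for m in margins if m > 0) / len(margins)


def best_move(moves, score):
    """Return the name of the move whose playouts rate highest under `score`."""
    return max(moves, key=lambda name: score(moves[name]))


if __name__ == "__main__":
    print("Maximize margin:   ", best_move(CANDIDATE_MOVES, expected_margin))
    print("Maximize win chance:", best_move(CANDIDATE_MOVES, win_probability))
```

With these made-up numbers, the score maximizer picks the flashy territory grab, which wins big half the time and loses narrowly the other half. The win-probability maximizer picks the quiet move that wins by a point or two almost every time, which is exactly the kind of move that looks like a mistake if you are judging by territory.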

: Zubon

Bonus thought 2: when I see Eliezer referencing “Path to Victory” in that post, I cannot help but see him referencing Worm, which he has read and commented on before.
