Theory workshop
One of the most interesting avenues of computer science is that of programming a computer to play a game against a human opponent. Examples abound, with the most famous that of programming a computer to play chess. But no matter what the game is, the programming tends to follow an algorithm called minimax, with various attendant sub-algorithms in tow.

First, a definition: a two-player zero-sum game is one played between two players where the players play alternately, the whole game is visible to both and there's a winner and a loser (or there's a draw). It's zero-sum because if the game is played for money, the loser pays the winner and overall there's no loss of money. (A bit like energy in a reaction: no money is created or destroyed.)

One of the simplest two-player zero-sum games is noughts and crosses, where the players alternately place Xs and Os in ...
The algorithm
Analysing noughts and crosses with the minimax algorithm is pretty standard in game theory, so I'll discuss a different game called Nim to illustrate minimax and its variants. Nim is interesting because it's easily understood, fairly unfamiliar and simply modelled. Plus, there are no draws in Nim, so the whole winner/loser thing is much simpler: someone always wins. But who?

In Nim, the players face three piles of stones with, say, five stones in each pile. Each player takes it in turn to play by removing from a single pile anything from one stone to the entire pile. The loser is the one who is forced to remove the final stone from the final pile, leaving all three piles empty. (Another way of looking at it is that the winner is the first player to be faced with three empty piles.)

For example, suppose our two players are named Max and Minnie. Max starts (he always does, not being a gentleman) and decides to remove all the stones from pile one. Minnie then removes all but two stones from pile two. Max thinks for a while, then removes all but two stones from pile three. Minnie resigns, because no matter what she does, Max will win. (If she removes one stone from a pile, Max removes both stones from the other, and she's left with the final stone. If she removes both stones from a pile, Max removes one stone from the other, leaving her with the final stone.)

Figure 1: The first few levels in the noughts and crosses game tree.
Traversing nodes
Games such as Nim are modelled as game trees. You start off with the initial state of the game as a node, the root of the tree. From this node, each possible move is modelled as a link to another node, which stands in for another state or position of the game.

So, for example, in noughts and crosses, the root node is the empty grid. Traditionally X starts and there are three possible moves: the centre, a corner and the middle cell along an edge (all the cells are equivalent to one of those three). So, the initial root node has three links to other game states. Each of those new nodes has different possible moves for O, as shown in Figure 1. You can imagine going further and drawing more levels.

Nim's tree is more complex. The initial state has 15 possible links, corresponding to removing one, two, three, four or five stones from each of the three piles. Each of these 15 possible states of the game then has up to 14 possible links to other states for the second player, and so on. You can imagine that the number of game states (that is, nodes in the game tree) explodes pretty quickly.

If you happened to have a big enough piece of paper, it would be possible to map out the entire game tree for the version of Nim that I described. For the leaf nodes of the tree (that is, the nodes with no links coming out of them), you would be able to identify the loser of the game for the path taken through the tree to each particular leaf. Figure 2 shows a particularly daft path through the tree where the players take all the stones from each pile in turn (not exactly an insightful game, but nevertheless a possible one under the rules).
Figure 2: An allowable but idiotic game play for Nim, resulting in Max losing. The nodes run 5, 5, 5 (Max to play), 0, 5, 5 (Minnie), 0, 0, 5 (Max) and 0, 0, 0 (Minnie).
The loser is Max, because he takes all the stones from pile three in the third move.

We can assign a value to each leaf node to indicate who wins (or loses). To make sure we don't get completely confused, we assign a monetary value from the viewpoint of the first player, Max. Let's say the winner of the path to the leaf receives 1 and the loser has to pay out that amount, so if the winner is Max, the value of the node is 1, while if the winner is Minnie, the value is -1 (since Max has to pay that amount to her).
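The branching figures and leaf values described above can be checked with a few lines of Python. This is only a sketch: the tuple representation of a position and the names `moves` and `leaf_value` are assumptions made for illustration, not anything prescribed by the article.

```python
from typing import Iterator, Tuple

State = Tuple[int, ...]  # stones left in each pile (assumed representation)


def moves(state: State) -> Iterator[State]:
    """Yield every position reachable by taking 1..n stones from one pile."""
    for i, pile in enumerate(state):
        for take in range(1, pile + 1):
            yield state[:i] + (pile - take,) + state[i + 1:]


def leaf_value(max_to_play: bool) -> int:
    """Value of the all-empty position, from Max's viewpoint.

    Whoever faces the empty piles wins, because the other player was
    forced to take the final stone.
    """
    return 1 if max_to_play else -1


start = (5, 5, 5)
print(len(list(moves(start))))                         # 15 links from the root
print(max(len(list(moves(s))) for s in moves(start)))  # at most 14 at the next level

# The daft game of Figure 2: each move empties one whole pile.
path = [(5, 5, 5), (0, 5, 5), (0, 0, 5), (0, 0, 0)]
moves_made = len(path) - 1
max_to_play = moves_made % 2 == 0   # Max plays moves 1, 3, 5, ...
print(leaf_value(max_to_play))      # -1: Max took the final stone
```

Representing a position as an immutable tuple keeps each node of the tree a plain value, which makes it easy to generate child positions without disturbing the parent.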
[Figure 4: tree diagram for the simplified Nim game; nodes are numbered from 5 down to 0, each marked W or L, on alternating Max and Minnie levels.]

Claude Shannon
In 1950, Claude Shannon published a paper called Programming a Computer for Playing Chess, which was the first such paper to consider this particular game tree. In it, he estimated the number of nodes in a game tree for chess at about 10^120, which meant, as he put it, that "a machine operating at the rate of one [node] per micro-second would require over 10^90 years to calculate the first move". Shannon's paper was remarkable because it contained the insight of an evaluation function for the strength of a position, and using it to be able to calculate node values to several levels deep.

Expectiminimax
Games such as Nim and chess have outcomes that are solely dependent on the skill of the players, or, presumably, their access to a well-written minimax analyser. Games such as backgammon are different, because their outcomes also depend on a randomisation factor such as the roll of the dice. The minimax algorithm has been expanded to suit such games (leading to the expectiminimax algorithm) by including what are known as chance nodes that incorporate the expected value of the randomisation agent (for backgammon, this would be the dice).

Player one
Let's imagine that we set up the entire game tree from the viewpoint of Max, the player who makes his move first. Each game position corresponds to a node in the tree, and if you think about it, a whole level of the tree will correspond to a given player. So, the root of the tree is what Max is faced with at the very start of the game: five stones in each of the three piles, and 15 possible game positions to leave for Minnie. What does Max choose to play in this situation?

What he should do is analyse all possible moves from the bottom up and assign a value to each node as he works his way up the tree, according to the amount he could win on that node if he played optimally.

Let's take a look at a made-up example, shown in Figure 3. Here, the root node shows a game position from which Max must play. There are two possibilities: playing the left-hand option goes to a game position that he's already worked out means he wins 1; playing the right-hand option goes to a game position where he loses 1. (Remember, all payouts are from Max's viewpoint.) I don't know about you, but I'd choose the first play. This means that the current game position also has a value of 1. For every game position where it's his turn to play, Max would choose the option that would maximise his winnings.

Minnie, who is just as perceptive as Max, would, of course, choose plays that would result in the best result for her and ignore all the others. So she would always choose a play that maximised her winnings, which, from Max's perspective, means minimising his.

If you had the entire tree, you could work out a value for each node working from the bottom up. If it was a Max node (that is, Max had to play from it), it would have a value that was the maximum of the child nodes. If it was a Minnie node it would have a value that was the smallest (the minimum) of the child nodes.

This, in essence, is the minimax algorithm: build the tree, work out the value of each node using an alternate minimise/maximise constraint, and the value of the root is the value of the entire game for player one (Max, as we called him).
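The maximise/minimise rule just described can be sketched on a tree that has already been built. In this sketch a tree is assumed to be either an integer payout (a leaf, valued from Max's viewpoint) or a list of child trees; that representation, the function name `minimax` and the second example tree are all illustrative assumptions rather than the article's own code.

```python
from typing import List, Union

# A leaf is a payout from Max's viewpoint; an inner node is a list of children.
Tree = Union[int, List["Tree"]]


def minimax(node: Tree, max_to_play: bool) -> int:
    """Return the value of `node` for Max, working from the bottom up."""
    if isinstance(node, int):
        return node  # leaf: the payout is already known
    # The levels alternate, so the children belong to the other player.
    values = [minimax(child, not max_to_play) for child in node]
    return max(values) if max_to_play else min(values)


# Figure 3: Max chooses between a position worth 1 and one worth -1.
figure3 = [1, -1]
print(minimax(figure3, True))   # 1

# A hypothetical two-level tree: Minnie replies to each of Max's moves.
deeper = [[1, -1], [1, 1]]
print(minimax(deeper, True))    # 1 (Minnie's nodes are worth -1 and 1)
```

The same function gives Minnie's choice at her own nodes: called with `max_to_play=False` on the Figure 3 tree, it returns -1, the smaller of the two child values.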
... (and destroy the stuff you don't need when you're done). In essence, since a tree is defined recursively, you calculate the minimax value by calculating the maximum (or minimum) of the minimaxes of all the child trees. Remember that the levels alternate between maximising and minimising (sometimes you look at it from Minnie's viewpoint instead of Max's).

Figure 4 shows a very simplified Nim game (one pile of five stones, you can remove one, two or three stones each play), fully expanded into a game tree. The number inside each node is the number of stones left in the pile after the move, and the letter alongside each node is the minimax value for Max (W = win, L = lose). Note that the value of the game is L; that is, Max will always lose (if you like, this simplified Nim is always a win for the second player).

Although the minimax algorithm is always guaranteed to find the best play for Max, there is a big problem. The game tree can be huge: mind-bogglingly huge. Consider chess, the classic archetype of a two-player zero-sum game. At each game position there could be something like 30 possible moves. Since each chess game is made up of about 80 plays (40 back-and-forths), it would mean that the lowest level of the tree would have something like 10^118 nodes. (Note that in tournaments it's rare for a game to go to checkmate; the losing player is likely to resign well before then.) As a comparison, there are around 10^80 atoms in the observable universe, meaning that, in essence, there's no possible way for a computer to map the entire chess game tree. So what can we do?

The first optimisation is to limit the depth to which we evaluate the game tree using the minimax algorithm. Since we may not actually reach a leaf node in doing this, we make use of an approximation function (a heuristic) to approximate the value of the node or game position. Of necessity, this value is not going to be accurate, but it will enable us to apply the minimax algorithm without having to evaluate all the nodes down to the leaves. The better the heuristic, the better the chances of devising a winning game play and the more accurate our minimax values will be.
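As a rough illustration, here is a recursive minimax for the simplified one-pile Nim of Figure 4, with a depth limit and a heuristic fallback. It is a sketch under stated assumptions: the function names are invented for this example, and the heuristic simply returns 0 ("no idea") as a deliberately crude placeholder; a real program would put its game knowledge there.

```python
def heuristic(stones: int, max_to_play: bool) -> int:
    """Placeholder estimate for a position we decline to search further.

    Returning 0 is an assumption made for this sketch only.
    """
    return 0


def minimax(stones: int, max_to_play: bool, depth: int) -> int:
    """Value for Max of a pile of `stones`, searching at most `depth` plays."""
    if stones == 0:
        # The previous player took the final stone and so loses.
        return 1 if max_to_play else -1
    if depth == 0:
        # Not a leaf, but we have run out of depth: approximate.
        return heuristic(stones, max_to_play)
    values = [
        minimax(stones - take, not max_to_play, depth - 1)
        for take in (1, 2, 3)
        if take <= stones
    ]
    return max(values) if max_to_play else min(values)


print(minimax(5, True, depth=5))  # -1: deep enough to reach every leaf, Max loses
print(minimax(5, True, depth=1))  # 0: cut off after one play, heuristic only
```

With a depth of five the search reaches every leaf of this small tree, so the result matches the L shown at the root of Figure 4. With a depth of one it never sees a leaf, and the answer is only as good as the heuristic.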
Limiting depth
In our recursive algorithm for minimax, we'll need to limit the depth of the recursion instead of allowing the recursion to reach the leaves. The simplest way to do this is to pass a depth parameter to the recursive minimax function and decrement its value at every recursive call. At the lowest level of the recursion, we use the heuristic function to calculate the minimax value of the current game position.

Now, the resulting minimax value at the root of the game tree is only going to be an approximation. The deeper we allow the partial minimax algorithm to go, the more accurate its value will be (because we're more likely to find leaf nodes in our traversal), but the longer the traversal will take. We have to strike a balance between accuracy and the time taken to calculate the minimax value (and hence the move to play).

Once it's our turn again to make a move, we should recalculate the minimax value at our new game position, making it, in effect, the root of the current state of the game. Every move would be made after a new minimax calculation based on the current game state.

In many chess programs that run on standard PC hardware, the depth of the minimax search is limited to some six full-width levels: around a billion possible game positions. Any more than that and the time taken to analyse the game positions would be far too long to be practical. For example, analysing positions at a rate of a million per second, six full-width levels would take about a quarter of an hour.

Julian M Bucknall has worked for companies ranging from TurboPower to Microsoft and is now CTO for Developer Express. feedback@pclus.co.uk

291 February 2010
Chess programs
Computer chess is one of the most fertile avenues of game research. Modern chess programs running on standard hardware use sophisticated pruning techniques to cull unprofitable areas of the game tree, use advanced heuristics to evaluate a game position and have large databases of standard opening games (so that they don't have to analyse an opening game at all) and common endgames (so that they can more easily force checkmates without having to analyse a multitude of game positions towards the end of a game).
Figure 3: A simple choice in a game tree, to calculate the minimax value of the root node. (Max's root node leads to two Minnie nodes, labelled Win 1 and Lose 1.)