Skip to content

154. Recreating the Threes game app – part 3

July 15, 2014

For me, this is why I started this project. I want to know whether I’m making the best moves, so I’m going to get my program to play out future moves to see what works best.

If you want to see how to include some simple AI into board games, this might interest you. (I love to see a game playing itself).

What I want to do

Given a board with tiles, I want to know which of the available moves is best – it is trivial to figure out which single move maximises the score, but it may be that if you made a different move now, then it increases your score three moves from now.

So I want to look ahead a few moves, and see what maximises my chances of a big score.

Threes doesn’t make this easy, because

  1. the score can jump dramatically if you combine two large numbers, so it doesn’t increase gradually, as in many games
  2. there is a random element, since new tiles are added randomly, and we don’t know what they are going to be

The second point means that we can’t simply play out every combination of moves going into the future, because there is a random new tile every move, and if we don’t know what it is, we can’t use this approach.

This was the problem facing the people who tried to write a backgammon program many years ago – how to deal with the randomness caused by throwing dice? They used an approach known as “monte carlo” (after the casino) simulation. It was so successful, it has been used many times since, in games of chance.

Monte carlo simulation

It’s a fancy name, but really, very simple. I’ll demonstrate with the “Monty Hall” problem:

Suppose you’re on a game show, and you’re given the choice of three doors: Behind one door is a car; behind the others, goats.

You pick a door, say No. 1, and the host, who knows what’s behind the doors, opens another door, say No. 3, which has a goat (he never opens the one with the car). He then says to you, “Do you want to pick door No. 2?”

Is it to your advantage to switch your choice?

The answer (“always switch”) is difficult for many people (even mathematicians)  to believe, but a computer program is a good way of proving it, like so.

function setup()
    sims=10000  --number of simulations
    SwapWins=0 --number of wins if we always swap
    DontSwapWins=0 --number of wins if we never swap
    for i=1,sims do
        --host secretly chooses door 1-3 with car
        Car=math.floor(math.random()*3)+1
        --contestant makes a choice 1-3
        Contestant=math.floor(math.random()*3)+1
        --host picks a door with a goat to open
        --this can be any door that doesn't contain the car and is not
        --the contestant's choice
        for i=1,3 do
            if Contestant ~= i and Car ~=i then
                OpenDoor = i --open this door
                break
            end
        end
        --figure out who wins if we swap, or don't swap
        --it's easiest to figure out who wins if we don't swap
        if Contestant == Car then
            DontSwapWins=DontSwapWins+1
        else --if not-swapping loses, then swapping wins
            SwapWins=SwapWins+1
        end
    end
    print ("Don't swap",DontSwapWins/sims)
    print("Swap",SwapWins/sims)
end

If you run this, you’ll see you win 2/3 of the time by switching. Try it on your friends, many of them are likely to guess wrong!

However, the point is that by playing out a lot of games, we can get a feeling for what move is likely to work best, even if we don’t know exactly what will happen in the game we are playing. All we can do is maximise our chances of doing well.

And monte carlo simulation simply means using random numbers as a substitute for rolling dice, choosing playing cards, or choosing random tiles in Threes.

Thinking about Monte carlo simulation in Threes

We start with a board containing tiles, and knowing what the next tile is going to be. We can also figure out which of the four possible moves are valid.

Then we can play each of those moves in turn and see what score we get. We don’t need any random numbers to do that.

But suppose we want to look three moves ahead. If (to keep it simple) we are always able to move in any direction each time, then for our first move, there are four possibilities. For our second move, there are also four possibilities for each of the results from the first move, ie 4×4=16 in all, and similarly,  our third move has four moves for each of the 16 results from move two, ie 4×16=64 in all. So we need to calculate the score for 64 different results after move three.

Don’t forget we also need to choose new tiles to add to the board after each move. We know which tile will be added after move 1, because it is showing, but we need to randomly choose a tile for moves two and three.

And this is why we need to do this more than once, because maybe a new tile with 1 makes a big difference, but 2 or 3 are really bad. If you only run through once, and then get a different tile when you actually play the moves yourself, you will do very badly, but if you run through the 64 moves over and over again, you can figure out which move gives the best result on average.

At this point, please note something important. If we simulate moves to a depth of 3, it doesn’t mean we are trying to choose the next 3 moves. We are always just trying to choose the next best move (ie just one move), but we look ahead to see which (one) move has the most effect in future.

Obviously, the more simulations, the better, but you need to balance it against depth, because Codea has limited power, and you don’t want this to be too slow.

For example, the default in my program is a depth of 3 and 50 simulations. This requires (4^3) * 50 = 3,200 moves (assuming all four moves can be made each time).

If you double the simulations, you need (4^3)*100 = 6,400, or double the previous number.

If instead, you just went one move deeper, you would need (4^4)*50 = 12,800, or 4x the previous number. So going deeper is quite expensive! Clever programmers find ways to get around this, by stopping if the score is obviously bad, but it is difficult for us to do that, because the score in Threes can jump dramatically, and if you can combine two big numbers either this turn or next turn, you don’t want your program to be greedy and always do it as soon as possible. So I haven’t put any of that clever stuff in.

A side note – the number of choices per move (up to four in our case) is called the branching factor. Chess has a branching factor of about 35, while Go is about 250. That explains why computers can’t beat humans at Go yet!

And one thing is for sure – if you are going to do monte carlo simulation, you’d better make your program run as fast as possible!!!

Implementing simple monte carlo simulation

I want to include two options in my program

  • get a hint for my next move
  • play the game out to the end (“autoplay”)

So I’ll include parameters that let the user vary the depth and number of simulations, a button to get a hint, and a button to play the game to the end.

I’ve added a function to give hints

--this returns the best move to make next, ie 1,2,3 or 4
function GetHint()
    h=RunSims()  --still to be written, see below
    --return move (1,2,3 or 4) and put text in Hint textbox
    --(moveText is a table with text for each of the 4 moves)
    if h then Hint= moveText[h] else Hint="No hint available" end
    return h --the simulations will use this move
end

And now the function that does the simulations, see numbered comments underneath

function RunSims()
    simulating=true --[1]
    local bestTest,bestScore=0,0
    currNextTile=NextTile --[2]
    m=CheckAvailableMoves(board) --[3]
    for j,i in ipairs(m) do --[4]
        local testScore=0
        for s=1,SimsPerMove do --[5]
            maxTest=0
            --we need to start each sim at the current position
            NextTile=currNextTile --reset next tile to the one currently on screen
            local b=CloneBoard(board) --[6]
            local mov=MoveCells(moves[i],b) --make the first move
            TestMove(b,1,mov) --[7]
            testScore=testScore+maxTest --[8]
        end
        if testScore>bestScore then --keep track of which of the initial moves has the best score
            bestScore=testScore
            bestTest=i
        end
    end
    NextTile=currNextTile --[9]
    simulating=nil
    --return the best move
    if bestScore>0 then return bestTest else return nil end
end

[1] -as you will see, we’ll use this to stop MoveCells from setting the game state to END. If you are simulating several moves ahead, sometimes you may run out of moves, and MoveCells would change the game state if we didn’t tell it not to.

[2] – We’ll be doing lots of simulations which will use all our previous functions, changing the value of the next tile over and over. We need to put the current value (of the next tile) back when we’re done, so store it safely before we start

[3] – get the available choices for the first move

[4] – iterate through these moves. We are ultimately choosing between these options.

[5] – do it for each simulation

[6] – make a copy of the board table, because making moves will change it, and we don’t want to mess up our original board table. CloneBoard is a simple function that loops through and copies all the table values to a new table

[7] – this is the really important function that keeps making moves until we reach the right depth

[8] – when TestMove finishes, it will have run through all the moves to the desired depth, and calculated the maximum score at that depth for all the different results (eg if depth is 3, then there are up to 64 different results). We add this score to the total, so that at the end, we have the total of the maximum score from all the simulations

[9] – put everything back the way it was, turn off the simulation, and return the best move

Move recursion

When it comes to things like games “recursion” is a valuable technique. It simply means a function that calls itself over and over.

Here is a simple example, that adds the numbers from 1 to any number we choose.

function AddNumbers(n,max)
if n<=max then
sum=sum+n
AddNumbers(n+1,max)
end
end

If we define sum=0 and then run AddNumbers(1,20), then first time through, it tests that 1 is less than 20, and adds 1 to sum, then runs AddNumbers again, passing through 2 and 20. Again, 2 is less than 20, so it gets added to sum, and AddNumbers gets called again. This continues until n reaches 20, when it stops “recursing”.

So how have I done it in Threes? It is a little more complex, because for each future move (ie if we have depth three, then I have 3 future moves, including the first one) I need to test all four possible moves.

There isn’t much code, so perhaps I’ll just show you. As usual, see the numbered notes underneath.

function TestMove(b,d,mov) --[1]
    for n,i in ipairs(mov) do  --[2]
        local b1=CloneBoard(b) --[3]
        local m=MoveCells(moves[i],b1) --[4]
        if #m>0 and d<MoveDepth then  --continue if valid moves>0 and we haven't reached the target depth
            TestMove(b1,d+1,m)
        else  --[6]
            local s=GetScore(b1,true)
            if s>maxTest then maxTest=s end
        end
    end
end

[1] – b is copy of board so far, d = depth, mov is list of valid moves

[2] – iterate through the moves provided

[3]–make a copy of the board passed to the function

[4]–make a move using this copy of the board

[5]–keep going (ie call TestMove again) as long as we have valid moves and haven’t reached the target depth

[6]–..otherwise get the score and test it against the best so far

Other changes

I had to make a small change to MoveCells at the end, so that if we are simulating, we don’t end the game if we run out of moves, and we return a list of valid moves.

I also added code to the end of DrawBoard, so that if the auto variable is set to true, then we are autoplaying, and it runs GetHint again, ie keeps the program working on one move after another.

One more thing

I included one more option, to increase the “score” for purposes of choosing the best move, if there are blank tiles. The reason is that you stop when you run out of blanks, so the more blanks you have, the better.

I thought about how best to do this, and decided I would multiply the normal score by 16 / non-blank tiles, so if you have 2 blanks, then you multiply the normal score by 16/14.

This is an option in the parameters, which you can try.

Finally

There are many ways to write any program, and I don’t pretend to have found the best approach. However, I hope all this has been of some small help to you.

The final code is here.

 

 

Advertisement

From → Games, Programming

Leave a Comment

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: