Making Chess in Pico-8 - #3: Chess Engine Basics

Share this post:

Share on Bluesky Share on Twitter Share on Facebook

I'm Krystian. I make games and teach game design. I recently wrote a Chess Engine from scratch in Pico-8. In this series of posts I will walk you through the process and share with you some resources I relied on.

Chess Engine Basics

Writing a Chess AI is a good opportunity to take a plunge into the Chess Engine community. People have been writing Chess Engines pretty much since the dawn of the computer age. One could even claim a Chess Engine was the first computer game ever made. So over the decades a wealth of knowledge and best practices has been accumulated. A good starting point to take advantage of this would be the Chess Programming Wiki. However, I found that particular page quite daunting. A newcomer like me was quickly overwhelmed with technical jargon. And while I was able to pick up individual insights, I was struggling to put them together into a cohesive image. Does it even make sense to implement Alpha-Beta Pruning in my case or is this something for high-end engines? Which of the many board representations is the best one for the restrictions of Pico-8? It's hard to answer these questions without ever seeing how a Chess Engine comes together.

I decided to look for more guided sources on this topic and one that has been immensely helpful was Bluefever Software with two tutorials on how to write Chess Engines in C and Javascript respectively:

Following these tutorials was immensely helpful for me. I would like to take the rest of this article to walk you trough some of the insights I had applying those lessons to Pico Checkmate.

Board Representation

The programming of any strategy game begins with considerations of the map structure or board representation as it's called in Chess Engines. Before we can even think about programming any of the gameplay, we must come up with a way to store all the information about any given game state in the computer's memory. We need to consider which information a given game state consists of. We also need to consider how we want to access all that information later on.

I guess Chess Engines are bit of a special case here. I wrote my own strategy games in the past. Because the rules of those games were often still in flux as I was writing them, the approach on how to implement a map structure differed dramatically from how Chess Engines do board representations. My games needed to have a flexible map structure so I can easily add and remove rules and game mechanics as I go. The rules of Chess are fixed and it pays off to optimize for them.

Chess is a tile-based game. It is played on an 8x8 board. Tile-based games usually use some kind of system to store information about the various tiles. Pico-8 itself has a built-in tile-based map system. You would could read a sprite number associated with a given tile like this:

mget( x, y )

The variables x and y indicate the coordinates of the tile in question. In a Chess game these would be numbers between 1 and 8. This is similar to how Chess notation identifies individual squares. It's a straight-forward approach and I used similar systems for most of my games. So I was surprised to see most Chess Engines going for a simple one-dimensional array like this

board( pos )

Instead a pair of coordinates, all of the squares would be sequentially numbered and each square would be only addressed by a single number like this:

Taking this approach has a significant impact on the complexity of most of your functions. For example, in order to loop trough all squares on the board, you can just use a singe loop instead of two.

-- Coordinates
for x=1,8 do
 for y=1,0 do
  board( x, y )
 end
end
-- One-dimensional
for x=0,63 do
 board( x )
end

Functions require fewer arguments. There are fewer variables. Also, the operations can be possibly performed a tiny bit faster, which is a big deal if some operations are called hundreds of thousands of times.

In some cases it can be necessary to translate from a tile number to an actual coordinate and vice-versa. For example, the user might click on a square and you might have to figure out which number square it is based on the mouse coordinates. But that's trivial and it turned out to be astonishingly rare.

Another disadvantage of the one-dimensional approach is that it may be difficult to catch if a piece is about to leave the board. Here, Chess Engines like to use another, quite ingenious trick. You see, Chess Engines actually work with a little bigger board that looks like this:

As you can see, they add some additional squares on the margins of the actual Chess board. These off-board squares are set to be "occupied". So whenever a piece is moved, you do not perform any laborious checks if it would leave the board. Instead you just roll with it and simply make sure if the destination square is not "occupied" - which includes off-board squares. The additional padding on the top and bottom is because a Knight is able to jump 2 squares deep.

With the board set up, moving a piece requires adding a single number to its position. The following is an array that describes how ALL of the Chess pieces can move - by simply listing numbers that are added or subtracted when they move.

poffset={
 -- Pawns are special
 {},
 -- Knight
 { -21, -19, -12,  -8,  8,  12,  19,  21},
 -- Bishop
 { -11,  -9,   9,  11},
 -- Roook
 { -10,  -1,   1,  10},
 -- Queen
 { -11, -10,  -9,  -1,  1,   9,  10,  11},
 -- King
 { -11, -10,  -9,  -1,  1,   9,  10,  11}
}

There is a little extra work required for the Bishop, Rook and Queen since they can slide move multiple times in a direction. Also, Pawns are special since they move and attack differently. But as you can see, this is the groundwork for an extremely compact and efficient way to manipulate the board.

Makemove, Takemove and Minimax

So far we've been discussing ways to manipulate the board state. Now let's move on to the actual Chess Engine. The basic idea behind classical Chess Engines is a Minimax algorithm. First, the Engine will list all possible moves and try all of them. It will look at the resulting board state and make a judgment of how desirable that board state is - expressed by a point score. Picking the best move is then a matter of picking the move with a highest score. However, that usually doesn't result in a smart engine. Sometimes a given position may look good superficially, but then it turns out it allows the opponent to make a devastating counter-move. Capturing a Queen may be great but it isn't that great if it means you'll get checkmated in return. So in a second step, it's a good idea for the engine to go deeper - the Chess Engine will not only consider the various board states, it will also look at what moves would be available to the opponent in response. It will play from the perspective of the opponent and will try to come up with the best counter-move. The score of a given move becomes the least-powerful resulting counter-move. But even this doesn't necessarily result in a very smart AI. And as the Inception meme goes...

A modern Chess Engine will typically be able to repeat this type of search multiple levels deep - essentially "thinking" all of its choices multiple moves down the road. The only really limiting factor is the processing power. Depending on the board state, the number of available moves at any stage can be huge, pushing the total number of positions to consider into the hundreds of thousands, even millions very quickly. If this was all too dry, here is a good video explaining the process in perhaps a more intuitive way:

The actual implementation of this algorithm is an interesting programming challenge. The description suggests using a recursive approach - creating a function that calls itself. You check all moves of a board state by listing all of the moves, creating the resulting board states and then running that same function on the those board states. Which is all good and well, but I kept worrying that I would run into memory issues. After all, if we are calling the same function hundreds of thousands of times and each instance of the function has to remember its own board state, wouldn't we be out of memory in no time? Especially on Pico-8?

Chess Engines do a little trick here which I found fascinating. Instead of each instance of the search function manipulating its own copy of the board, the entire Chess Engine uses only one board for all of the instances of the search function. The individual instances just have to make sure to undo their moves after they're done with it. As such, at the heart of a Chess Engines are two functions. One makes a given move. The other takes a given move back. Makemove and Takemove.

We arrive at a search function which looks more or less like this:

function search (depth) {
    if depth == 0 then
       return boardscore()
    end
    local moves = generatemoves()
    local bestscore = -32000
    for move in all(moves) do
       makemove(move)
       local score = -search(depth-1)
       takemove(move)
       if score > bestscore then
           bestscore = score   
       end
    end
    return bestscore
end

So in plain English, we define a function called "search". This function keeps track of how deep we want to search. If a function ever finds itself being called with depth 0, it simply evaluates the board. Otherwise, it will list all of the available moves. It will then go down the list of the moves. It makes a move. It then wants to find out how good that move is. So it recursively calls itself on the resulting board state - with depth reduced by one. Immediately afterwards it takes back the move. It continues going down the list remembering the best result. At the end, it returns that result.

This specific variant of Minimax is called Negamax. It lends itself well for Pico-8 due to its compact form. Note how the function negates the score of its recursive call before processing it. That's because a score is calculated from the white player's perspective. A very positive score is good for the white player. A very negative score is good for the black player. As we search down the decision tree, we alternate between picking the best score for white and the best score for black. Which is why we have to flip the sign back and forth.

Also, note how all the board manipulation is done by other functions. Those function all manipulate a single, global board state. There is just one chess board in our program and all of the hundreds of thousands of instances of this function have to share it. This is made possible by the makemove and takemove functions. Each function leaves the board in the same state it started out in.

Evaluation

So at this point you might be wondering how the boardscore() function works. How does the AI judge how good a given board state is? The main idea is to count the material. You assign different values to the various pieces and count how many of them are on the board. Chess players already do this. They use the unit of Centipawns to judge a position.

A Pawn is worth 100 Centipawns.
A Knight is 300 Centipawns
A Bishop is 300 Centipawns
A Rook is 500 Centipawns
A Queen is 900 Centipawns

But of course, this is only useful if we are comparing positions with different amount of pieces. What if two positions have the same amount of pieces? Well, Chess Engines also use tables to slightly tweak a piece's Centipawn value depending on its position. A Knight in a corner might lose 50 Centipawns because that position restricts the Knight's movement option and therefore its usefulness. Here is a Piece-Square Table for a Knight from the Chess Programming Wiki.

Note that this is a very subjective part of the Chess Engine. As you've perhaps seen in the article on Centipawns, there are different schools of thought on the value of the various Chess pieces. The Piece-Square Tables are even more subjective and different developers might tweak those tables to make their Chess Engines play with different styles.

Additionally, there are a lot custom tweaks possible at this point. Some Chess Engines use different Piece-Square Tables depending on the stage of the game. Some Chess Engines give the first Bishop a higher value as a Bishop pair may be considered more valuable. This is the area where Chess Engine developers can go wild to test out their theories.

Chess by Any Other Name

And this ties neatly into one of the realizations I had when researching this topic. Currently, even desktop Chess Engines can easily out-perform human professional Chess players. This is a relatively new development as I still vividly remember Kasparov playing against Deep Blue. While that match was still pretty close, the computers silently gained the upper hand in the meantime. This may sound depressing at fist glance - Human ingenuity is becoming obsolete against the brutal processing power of algorithms. But we should be wary of that kind of luddite fatalism. In fact, the development of Chess Engines can be considered a different way to play Chess. As you can see above, the development of Chess Engine often requires a solid understanding of high-level Chess theory. Meanwhile Chess Engines have become excellent tools for professional players to study strategies. Just as CGI didn't "solve" Filmmaking, neither do Chess Engines ruin the game of Chess for humans.

This wraps up the basics of a simple Chess AIs. Of course a Chess Engine like this one is neither efficient nor smart. So in the next installment of this article series I will discuss some advanced techniques on how to squeeze additional performance out of our little Chess Engine. Until then, feel free to continue the discussion in the comments section.