Pawn Captures

The next moves we need to consider for pawns are captures. Now we all know that pawns can only capture diagonally. In fact, they can only move diagonally if they are able to catch an enemy piece. How can we model this with a bitboard?

The white pawns bitboard looks like this. We’ll draw it laid out as a board for now.

In order to check for valid captures we need to move the pawns forwards diagonally in both directions. This is easy to achieve with bitboards again using the shift operator. Instead of shifting by 8 to move directly forwards we will shift by 7 to capture to the right and 9 to capture to the left.

But there’s a problem with this. What happens to the pawn on the A file when we capture to the left using the shift? It gets moved onto the 4th rank!

There is an easy solution to this. It is never valid for an A file pawn to capture to the left so we can just mask these out before performing the shift.

Likewise we need to the same for the capture to the right, masking out the H file pawns.

So now we’ve generated bitboards for the potential moves we need to check which ones are valid. To do this we need to mask out any move that doesn’t result in our pawn taking an opposing piece. Where do we find a bitboard of opposing pieces? We have to store it in our BOARD structure.

Like with the pawns, we store two bitboards – one for each side. These need to be updated along with pieces, all and pawns for every move on the board to keep them all in synch.

Now we simply need to bitwise ‘and’ the opposing side with our potential captures to see which ones actually result in a capture:

What is left is bitboards for capture_left and capture_right containing bits set only where a diagonal move would capture an enemy piece.

Let’s add this to our move generation function.

Bitboard: Moving Pawns

We’ve seen in the previous part how to actually move some pawns. This is fine for the opening position because we know that all the white pawns can move one or two spaces forwards – there are definitely no pieces in the way preventing this. How can we apply this later in the game though where the pawns are not all on one rank and might be blocked by other pieces?

The answer lies with more bitboards. A pawn bitboard is all very well on it’s own for generating candidate moves for the pawns but in order to validate the moves we need some more information.

Consider a white pawn sitting on C2, blocked in by white’s own knight. We’ll ignore all the other pieces for now and assume the board is empty except for these two pieces.

What does the white pawn bitboard contain?

Notice the single, solitary bit set on the second rank. There is no sign of the knight in this representation. So, if we want to generate the moves for the pawn we can start by shifting the board left by 8 to advance the pawn:

Perfect. Our pawn has moved forward a rank. But there’s a problem isn’t there? We’ve just moved the pawn onto the cell containing the knight. The bitboard has no idea the knight is there. We’ve generated a move for the pawn but not a valid move.

So how do we validate the move? We could iterate through the moved pawns bitboard and for every moved pawn we could see if a piece exists on the cell:

This isn’t bad but again requires us to spend CPU cycles iterating through the bits of the bitboard. There is a more efficient way of doing that which we will see later but, like the last time we removed the need for an iteration we can solve this with another bitboard. But what bitboard? What we need is something that can efficiently tell us where every piece on the chess board is.

Like the pawns bitboards we need to update this new bitboard every time a piece is moved. We set a bit in the bitboard if any piece resides in the relevant cell and clear it when a piece moves out of the cell. So, going back to our knight and our pawn in their original positions, the ‘all’ bitboard will contain the following:

Using a bit of bitwise magic we can move the pawn as before but then validate the move by checking it against the ‘all’ board.

How does this work? Let’s take a look at the bits.

If you perform a bitwise ‘and’ of white pawns << 8 and ~board.all you get this:



Everything is zero in the resulting bitboard indicating there are no valid moves. That’s the result we wanted.

We can take this a stage further and look at pawn double-moves too. Consider this board:

We have the same pawn/knight as before but in addition we have a pawn on the E file that can make a valid double move and a pawn on the G file that can move once but not twice. The white pawns bitboard looks like this:

The ‘all’ bitboard looks like this:

If we now calculate the shifted pawn bitboard for a single move then and it with ~board.all like before we get this:

Note there are only two pawns left thanks to the knight we looked at before. There are two valid single pawn moves. Now let’s try to move forward again by shifting by another 8 bits.

Now and this with ~board.all again and you get:

Just one pawn left – one pawn which can make a valid double move.

Let’s put this into code. We will start writing a function that calculates all the valid moves for a given position and start implementing it with the pawn moves we’ve just looked at. The function will not be complete yet as we need to do something with the bitboards of valid moves that we generated however it is a start.

What is a bitboard?

Well, I don’t know where that year went. I didn’t mean to leave it so long between the last post and this one but life happens! I’ll try not to leave it so long this time…

So now we have our implementation of move generation using an 0x88 board. It works but we’ve made no attempt to optimise it for speed. Now we have two choices:

  • Analyse the performance of the 0x88 board move generation and attempt to optimise it making it as fast as possible
  • Reimplement the move generation using a ‘bitboard’ to compare the relative speed.

Ideally we would perform both of the above so we can compare an optimised 0x88 board against an optimised bitboard but as well as looking at performance I’m interested in looking at the complexity of implementing a bitboard based move generator as well as performing the optimisation. So, we’re going to throw away the current 0x88 board code we’ve looked at and see how to write it all over again using a bitboard.

Now, I’ve read a number of descriptions of what a bitboard is and how they can be implemented along with some ideas on optimisation. I’ve deliberately steered away from looking at anyone else’s implementation though because I want to see what I can come up with on my own.

If you take the name “bitboard” literally then what have you got? An entire chess board state stored as single bits in an array? Well, that won’t work will it. If you have one bit per square then how do you know what piece is in each square? How do you know what side it is? How do you know if castling or en-passant is available? No, the idea of a bitboard is (at least as far as I’m thinking) that you use bit arrays for storing information that can be easily optimised using bit operations.

Consider the white pawns in their opening position. All eight of them on the second rank. When we want to work out the valid moves from the opening position we know the pawns can all move forward one or two places. How do we do this with an 0x88 board?

Well, here’s a naiive implementation:

The implementation literally iterates through every cell on the board looking for white pawns then stores potential moves for each pawn. OK, there are very simple ways to make this more efficient. In fact, our previous code was more efficient than this. However it will still end up in a loop for each pawn.

Now let’s look at an implementation using a bitboard.

Hmm… There are a few things to explain here. The first is the representation of the bitboard itself. The first line above defines a BITBOARD as a uint64_t. Basically, the entire board representation is stored in a 64 bit number; 64 bits for 64 cells in a chess board.

As I said previously though you cannot store then entire state of a chess board in a single bitboard; 1-bit per cell is not enough to store all the information we need. So, we’ve got a function that returns a bitboard representation of all the white pawns on the board. Where a white pawn stands a bit is set. Where a cell is empty or contains a piece that is not a white pawn the bit is clear. The board is arranged as following in the 64-bit number:

MSB is the most significant bit, LSB is the least significant bit. You can see all the bits on rank 2 are set indicating where the white pawns live. If we rearrange the above putting each block of 8 bits on top of each other then we get this:

You can see how the above matches the cells on a chessboard.

So how does this help us? Looking again at the valid moves for these pawns starting from the opening position then we know the pawns can all move forward one or two places. We need to shift all the 1’s in the board above up by one or two rows. How do we do that with a bitwise operation? We shift the bits by the number of cells in a row multiplied by the number of rows; 8 * 1 or 8 * 2.

So looking at the flat representation the shift does the following:

So it’s easy to see that shifting the board left by 8 bits moves all the pieces on the board up by a row. This is a very efficient way to generate this type of move and removes the need to iterate over the cells of the board for each piece, as long as the get_white_pawns() function does not need to perform the iteration.

That brings us onto the representation of the board itself. We’ve seen that we cannot represent the whole chess board with a single bitboard so how do we represent it efficiently? How about this?

OK, I don’t see any bitboards in there but let’s go with it. How do we implement get_white_pawns() using the above?

Oh dear. That’s just what we wanted to avoid – iterating through all the pieces in order to locate/move the white pawns. Instead of doing this, how about we store the positions of the white pawns in our BOARD structure? And while we’re at it, let’s store the black pawns too. We’ll put them in an array of 2, one for white one for black.

This means our function is simplified to this:

In fact, we can probably just lose the function altogether and just access the structure member directly.

This all looks much more efficient. However, there is one catch; We need to remember to keep the pawn bitboards consistent with the pieces array in the structure. Whenever we move a piece we need to update board.pieces[] but we also need to check if it is a white/black pawn and updated the relevant bitboard. Is this more efficient that just iterating through all the pieces in the board? We’ll see.

0x88 Perft

Now the 0x88 board move generator is complete we can check if it generates the correct moves by implementing a perft function.

The perft function counts all valid moves to a certain depth from the current board position. This count can be checked against a table of known results to see if the move generator is behaving correctly.

The perft function is split into two parts. The perft() function initialises a timer and the counter and calls the recursive perft_s() function to generate the results which it then prints along with the time taken to calculate the result.

The perft_s() function is very simple. It starts by generating all pseudo valid moves for the current board layout.

It then iterates through the moves and ensures that making the move does not result in leaving the king in check – this turns the pseudo legal moves into actual legal moves.

If we’ve reached maximum depth then we count the move at this point. If not at maximum depth then we recursively call the perft_s() function to make all the moves for the opposition with the new board layout.

To use the perft() function I just populated the main() function with some test code.

I added the run_perft() function to provide a neater output when calculating to multiple depths. The main() function defines a fen string for the starting position then calculates the perft results at multiple depths (in the case above, from perft(1) to perft(6)).

This results in the following output.perft

Checking this against the results from the chessprogramming wiki here we can see that the move generator is working correctly, at least for the initial position.

I have been through all the perft positions in the wiki and the move generator creates the correct results for each one.

That concludes the 0x88 board move generator. Next we will start looking at a bitboard implementation to see how much of a speed improvement we can achieve.


Move Generation: Castling

Let’s finish this off so we can get to the more interesting bitboard code and some optimisation.

The last piece of the move puzzle is castling. Here is the code.

This basically checks for availability of castling by looking at the castling_availability flags in the board structure then checks to see if the conditions are right for castling.

The conditions that are checked are:

  1. Ensure the squares that the king and rook pass through are empty
  2. Ensure that the king is not in check in any of the squares it passes through or ends up in

In order to check that the king is not in check in any of the squares we call the is_cell_range_attacked() function.

This works by making a dummy move to each of the squares being checked then generates moves for all enemy pieces to see if they attack the piece on the square being checked.

That concludes the move generation code for the 0x88 board. Next time we’ll throw together a simple perft function to check that the move generation code is working correctly and give us an idea of the performance.

Move Generation: Other Pieces

Last time we looked at the pawn moves in detail which resulted in a fair amount of code due to all the special cases required by pawn moves (en-passant, double moves, can only capture diagonally). This time we will cover the moves of all the other pieces.

With the exception of castling, none of the other pieces have any ‘special’ moves or rules so we can handle them by using an offset table. The table contains a set of values for each piece, one for each possible direction the piece can move. To move the piece in a particular direction by one step you just add the value to the current cell location.

For example, looking at the values for the rook offset table:

This is a bit more obvious if we show the values for each direction on a chess board where CELLS_PER_RANK is 16 for an 0x88 board:


So, to go up 1 rank (go North) we add 16 to the current cell. To go down 1 rank we add -16 to the current cell.

This works well for the king as we can specify the offsets for all 8 directions it can move and that fully describes the possible locations for the king. But how do we handle sliding pieces that move until they bump into another piece or the edge of the board? Simple – we just keep adding the offset for the current direction in a loop until we reach a stopping condition:

Putting this all together we can write the get_valid_moves_table() function that handles the moves for all non-pawn pieces, sliding and otherwise.

The function is called with the cell_index containing the piece to find the valid moves for, a pointer to the relevant piece offset table and a flag that specifies whether the pieces is sliding (single == FALSE) or not (single == TRUE).

Looking back at the get_valid_moves() function we can see how we have now defined the moves for all the pieces with the exception of castling which we will cover next time.


Move Generation: pawns

Now we have the ability to make and unmake moves it’s time to generate the list of valid moves from the current board position.

For this implementation, like the rest of the code so far, I’m going for simplicity. Therefore this will be a pseudo-valid move generator. It will quite happily generate moves that will leave the king in check. We’ll need to check for this when making the move and filter the invalid moves out at this stage.

The valid moves are going to be stored in an array of CHESS_MOVE structures which we defined in the last entry as:

For each board position there is a maximum number of moves that can be played. So instead of dynamically allocating and reallocating the move list as we add valid moves, I just pass an array that is guaranteed to be big enough.

So, how big should it be? Well, let’s imagine that every pawn has been promoted to a queen and can move to the maximum amount of squares a queen can move to (impossible, I know but it doesn’t matter if we over estimate). That’s 8 * 28. Add that to the maximum moves available to each other piece (Rook : 14, Knight 8, Bishop : 14, King : 8, Queen : 28) and you end up with 332. We can define this in our code:

Now we can just generate an array containing this many CHESS_MOVE instances and we can guarantee we will not overflow it. We haven’t taken promotions, en-passant captures or castling in our maximum move count but we have easily over-estimated so we will not be overflowing our array.

So now we have our array, we can pass it to the function that generates the moves, along with an integer that is initialised to zero that keeps track of how many valid moves we’ve added to the array.

This function looks for all valid moves from a single cell. To generate all valid moves for a position the caller needs to iterate through all cells populated by pieces of the side to move. Each piece type is handled by calling a function that generates the moves for the individual piece.

Let’s take a look at how the pawn moves are generated.

This is fairly self explanatory albeit a bit long winded. There are a couple of things that maybe require further explanation.

Firstly, if the destination rank is 2 (for white) or 5 (for black) after the pawn has moved then it is an initial pawn move. Therefore we add an extra move advancing the pawn one step further if the cell is empty. This handles the initial pawn double move.

Secondly when a pawn advances or captures we check whether we have reached the end rank. If so we add the move multiple times to represent the possible pawn promotions.

Lastly we check to see if the pawn can move diagonally to the board en-passant cell and if so we add a move to represent this.

Note the use of the macros to populate the moves. This is just to save some typing. Here they are:

Next time we’ll take a look at the moves for all the other pieces.

Making a Move

Now we have our board representation we need a way to make and unmake moves. As before, this is going to be a very naive implementation that is more about readability than efficiency.

We’ll start with the interface. We need two functions. One for making a move and one for undoing the last move.

So, what do we store in the move structure?

Not a lot! We just need to keep track for the cell a piece is moving from and where it is going to along with the piece a pawn that reaches the end should be promoted to. One important thing to note here is that the move/unmove code assumes that all moves are valid; it performs no validity checking whatsoever.

So let’s start to implement the move function. The most obvious thing we have to do is to move the piece. Recall from the last post that we created a global variable holding the board state – the move function will directly update this state.

If only it was that simple! We have a number of other things to consider when moving pieces:

  • Capturing en-passant
  • Castling
  • Promotion

Let’s deal with these in order.


The move function needs to perform three functions related to en-passant:

  1. For any other move the current en-passant cell needs to be cleared.
  2. If a pawn is advanced two places then the board structure needs to be updated to keep track of the en-passant cell
  3. If a pawn moves to the current en-passant cell then the enemy pawn needs to be removed from it’s cell

Here are 1 and 2.

And 3.


Castling is another special case with two functions.

  1. If the board indicates castling is available and the king is moved to a castling destination then the rook needs to also be moved
  2. Moving the king or either rook removes castling ability

This gets a bit messy (and could easily be improved – remember this is not the final code…)

Yuck! Magic numbers. I’ll fix them later. This is a lot of code but it’s fairly straightforward. If the king is being moved from it’s home position to a castling location then move the rook as well. We don’t need to check for castling availability because we assume that the move being passed to us is valid.

Removing castling availability is straightforward.

More magic numbers… We don’t need to check whether the piece we are moving is the king or a rook as if it is not then the castling availability will have already been removed so we will not be changing any state.

Pawn Promotion

Finally we have pawn promotion. This one is easy.

That completes the move function. So how do we undo a move? In the interests of keeping this simple we are going to add a new field to the CHESS_MOVE structure.

We’ve added the ‘board’ member. Before we make a move we will copy the entire board state into this member then we can just copy it back when we undo the move. So, at the start of the move function we add this:

This leaves us with the simple task of implementing the unmove function:

Well, that’s it for this instalment. Next we’ll start taking a look at the code required to generate all the legal moves from a starting position.

0x88 Board

In this post we’ll start looking at my reference chess board implementation. Remember, I want to keep this simple and maybe not spend too much time thinking about optimisation or even optimisable code. This is just a learning experience that will ease the future development of a much more efficient version.

The simplest implementation would seem to be just to define a structure that describes the contents of a square (I call them ‘cells’ in my code) and then define an array of 8×8 of these or simply:

These could be arrange any way you please but it makes sense that cells[0] is A1, cells [1] is A2, cells[8] is B1, etc.

However, a slightly larger, more complicated board array actually makes a lot of the chess rules engine code much easier. Therefore for this implementation I’m going to be using the common 0x88 board format. Much has been written about this format so I’m not going to duplicate it in my words here but basically the board is laid out as in the diagram below:


What you’ll notices is that the indices for the cells follow a simple pattern; all indices for a particular file share the same low nibble and all indices for a particular rank share the same high nibble. Further, the top bit of each nibble is clear for all valid cells so we can easily check whether an index is valid using the following C code:

So now we can declare some structures that represent the data in a cell and create the board array.


Now we have our board structure we can start writing some functions.

Starting with a simple one just to clear all pieces from the board:

And here’s a useful function for debugging that prints an ASCII representation of the board:

Notice that the function above uses a macro ‘CELL_INDEX’ to convert from file and rank to an index. I’ve generated a number of useful macros like this:

And finally for now, a very scrappy, partial implementation of a FEN string parser just so I’m able to populate the board.

Putting all this together with a simple main() function containing the following code results in an 0x88 board that can be printed to the screen.


The Plan

I think we’re ready to start putting some chess into the Ircos project :) .  I have a plan…

I’m going to try to make this chess engine project an exercise of optimisation. I’m going to try not to rush into getting a fully working chess engine as soon as possible and instead I’m going to concentrate on each stage trying to make it as fast as possible. With that said, I’m going to start with the rules of chess and the chess board implementation.

As I’ve never written a chess engine before, I can assume that I’m going to make design mistakes :) . Therefore I’m going to write a reference chess board implementation first of all which will serve a few purposes:

  1. I can make all the mistakes in this implementation and learn how and how not to write a board implementation
  2. I can throw-away the code at the end and use my newly acquired knowledge to write a better, more efficient version
  3. I can use the reference implementation to verify the validity of any subsequent (likely more complicated) board implementation
  4. The reference implementation can be used as a benchmark to show how well the optimisation is progressing

So, for the reference implementation I want to keep it as simple as possible so it can be developed quickly and will minimise the chance of bugs creeping in.

In the next instalment we will take a look at this reference implementation.