Hello people, in this post we will try to improve the performance of our Minimax algorithm by applying Alpha-Beta Pruning. Please read my post on Minimax algorithm if you haven’t already.
Alpha-beta pruning is based on the Branch and bound algorithm design paradigm, where we will generate uppermost and lowermost possible values to our optimal solution and using them, discard any decision which cannot possibly yield a better solution than the one we have so far.
Idea of Alpha-beta pruning
We will maintain two additional parameters α and β. α is the best score achievable by the max player so far and β is the best score achievable by the min player so far. These values will be passed down to recursion calls via arguments.
We will try to use α and β to prune our search tree by skipping choices which can’t possibly give us a better solution. Let us understand this with the help of an example. Look at the sketch below –
Let us assume we have the above game tree formed by two agents (max and min) playing a hypothetical game. Here, the Max agent tries to maximize the score and Min agent tries to minimize the score. Now, for a normal Minimax algorithm, we would traverse all these nodes, but this time we will send the values of α and β.
The topmost Min node makes the first call. Initially, the values of α and β are null. Here val is the value which will be returned. Now what will happen next? Min has two possibilities above and the call goes to the first possibility, which is the first Max node in the above diagram. It passes on the values of α and β, which both happen to be null for the moment. What will Max do there?
The choices for Max are 2 and 4. Since Max always maximizes the score, it will choose 4. So we update the value to be returned to 4. Notice that the value of α = 4. α denotes the best possibility for Max so far. It doesn’t play a big role here, but you must have an idea on when to update α and β. Now, we return value 4 from the Max node.
Hmm, now the Min node sees that the first possible decision will give it a score of 4. Okay, so now β = 4. Remember, β is the best possible decision for Min node so far. Now what does Min do? It looks at the next possibility. It passes on values of α and β. α is anyway null, but β = 4.
After reading 6, val = 6 and α = 6, because it is the best solution so far. Now, what if the values for the choices ahead returned a value lesser than 6? Then obviously Max would choose 6 since it is the highest. Okay, what if the values for the choices ahead returned a value greater than 6, say X? Then obviously Max would choose X, since it is greater than 6. But, even if it did, will it affect the decision of Min on top? No! Because between 4 and X, Min would obviously choose 4!
So no matter what the next value Max encounters, it cannot affect the decision of Min. How did Max node know Min already has a choice which yields 4? From the value of β!
Read the above case again and again if you didn’t understand it. If you’ve understood it then you’ve learned Minimax algorithm with alpha-beta pruning! 😉
So, we break further computation in Max, and return 6.
The pruning happens for these cases –
- val ≥ β in a Max node
- val ≤ α in a Min node
Now let’s try to write the pseudo-code for Minimax algorithm with alpha beta pruning. Before we do that, first try to write down the pseudo-code for a regular Minimax algorithm. If you could, that’s awesome! 😀 If not, take a look at the pseudo-code in my post on Minimax Algorithm, because I will only make slight modifications in that.
We know that pruning happens only in the above stated two cases. So the only modifications we need to make to our existing Minimax algorithm pseudo-code is that,
- Add the parameters alpha and beta to the procedure.
- Add the conditions to update alpha and beta.
- Add conditions for pruning.
The final pseudo-code goes like this –
(score, move) maxTurn(game, depth, alpha, beta): if game is in terminal state: return (score(game, depth), none) max = (none, none) foreach emptySpace in game: game[emptySpace] = X currentMove = minTurn(game, depth + 1, alpha, beta) if currentMove.score > max.score: max = (currentMove.score, emptySpace) if currentMove.score >= beta: return max if currentMove.score > alpha: alpha = currentMove.score game[emptySpace] = none // reverting change return max (score, move) minTurn(game, depth, alpha, beta): if game is in terminal state: return (score(game, depth), none) min = (none, none) foreach emptySpace in game: game[emptySpace] = O currentMove = maxTurn(game, depth + 1, alpha, beta) if currentMove.score < min.score: min = (currentMove.score, emptySpace) if currentMove.score <= alpha: return min if currentMove.score < beta: beta = currentMove.score game[emptySpace] = none // reverting change return min
So, now instead of 1 if condition, we have 3 if conditions in our methods. The order of the new conditions can be interchanged, I like to write it this way.
Now that you have the pseudo-code, can you use that to tell which nodes get pruned for the example below?
Now you are more than capable of writing the code for Minimax algorithm with alpha beta pruning. It is just a matter of a few conditions if you have already written the code for Minimax algorithm. Try to code it, you can refer to my code if you get stuck (it is a modification of the code in my post on Minimax algorithm) –
Congratulations! You AI just got hell a lot faster! Keep practicing! Happy coding! 😀