About Ultimate Tic Tac Toe
Learn more about the game on Wikipedia.
How to Play
Objective: Win three small boards in a row (horizontally, vertically, or diagonally) on the large 3x3 meta-board.
Setup: The game is played on a 3x3 grid of smaller 3x3 Tic Tac Toe boards, creating 81 total cells.
Gameplay:
- The first player can place their mark (X) in any cell on any of the nine small boards.
- After the first move, each player's move determines which small board the opponent must play in next.
- The position of your mark within a small board corresponds to which small board your opponent must play in. For example, if you play in the top-right cell of any small board, your opponent must play somewhere in the top-right small board.
- If a player is sent to a small board that is already won or full (tied), she may play in any open cell on any available board.
- To win a small board, get three of your marks in a row within that board (just like regular Tic Tac Toe).
- Once a small board is won, it counts as that player's mark on the meta-board. The cells in a won board can no longer be played in.
- The game ends when a player wins three small boards in a row on the meta-board, or when all small boards are decided (resulting in a tie if neither player has three in a row).
Notes on the AI
Board Representation: Internally, the game state is represented using a bitboard consisting of two u128 integers. This compact representation allows for extremely fast move generation and game state evaluation.
Search Algorithm: The AI uses Monte Carlo Tree Search (MCTS) to find good moves. MCTS works by running many random simulations from the current position to the end of the game. The algorithm builds a search tree incrementally, balancing exploration of new moves with exploitation of moves that have performed well in previous simulations.
Unlike traditional minimax algorithms, MCTS does not require a hand-crafted evaluation function. This makes it particularly well-suited for Ultimate Tic Tac Toe, where the branching factor is high and positional evaluation is complex.
Client-Side Execution: All simulations run entirely in your browser using WebAssembly. This means if you have a faster machine, more simulations will be run in the allotted time, resulting in a stronger AI opponent.
Endgame Minimax: When the game gets sufficiently close to the end, the AI switches from MCTS to minimax mode. In this mode, no simulations are run - instead, the AI calculates the optimal move directly. If the AI can win, it will play the move that guarantees victory in the fewest number of steps. If the AI knows it's in a losing position, it will play the move that delays the loss as long as possible.
I've found that simply using a minimax search with alpha beta pruning only works when the game is rather close to the end. Minimax can only search about 16-20 moves in a minute. So if the AI was using minimax from the start, it would be horribly slow, and be making essentially random moves anyway until almost the end of the game.