A Scala 3 framework for game-playing tree search and reinforcement learning.
Gambit provides a generic, typeclass-driven infrastructure for game-playing strategies. It supports minimax with alpha-beta pruning, Monte Carlo Tree Search (MCTS), and reinforcement learning (MENACE), with fully worked implementations for TicTacToe and Connect Four.
The framework depends on Visitor for its graph and tree traversal engine.
Gambit is published to Maven Central. Add the following to your build.sbt:
libraryDependencies += "com.phasmidsoftware" %% "gambit" % "1.1.0"Requires Scala 3. Depends on Visitor 1.6.0, which is pulled in automatically as a transitive dependency.
State[P, S]— typeclass for a game state: legal moves, goal detection, heuristic evaluation, andisFirstPlayerToMoveGame[S, M, Pl]— typeclass for game mechanics:start,moves,applyMove,nextPlayer,currentPlayer,winnerPlayer[S, M, Pl]— trait for player strategy:chooseMove,gameOverGameRunner[P, S, M, Pl]— generic game execution driven byStateandGamegivensGameResult[Pl]/MatchResult[Pl]— per-game and per-match result typesAlphaBetaPlayer[P, S, M, Pl]— generic minimax player with alpha-beta pruning and move ordering; configurable depthMCTSPlayer[P, S, M, Pl]— generic Monte Carlo Tree Search player with UCB1 and tree reuse between movesTournament[P, S, M, Pl]/Contestant[S, M, Pl]— round-robin league with 3-1-0 scoring and a formatted league table
TicTacToe/Board— compact bitboard representation with rotation, transposition, and a rich heuristic hierarchyTicTacToeOps— low-level Java bit manipulation (rotate, transpose, play, render)- Six player types:
RandomPlayer,HeuristicPlayer,MenacePlayer,PerfectPlayer,AlphaBetaPlayer, andMCTSPlayer given tictactoeGame— wires TicTacToe into the genericGametypeclassTicTacToeDemo—@mainprogram playing home/away matches between all player types
A reinforcement learning player based on Donald Michie's 1960 matchbox machine.
Each board position is a Matchbox of weighted beads per legal move; beads are
added after wins and removed after losses.
Matchboxes— registry with D4 symmetry reduction (~8× learning speedup)MenacePlayer— records move history and back-propagates results
Builds a complete minimax score map over the TicTacToe game DAG on first use via
Visitor's Traversal.dfs(DfsOrder.Post). Each of the ~5,000 reachable positions
is evaluated exactly once (DAG, not tree). Never loses; always draws against itself.
Connect4— column-major bitboard with gravity, sentinel-bit win detectionConnect4State—State[Connect4, Connect4]with window-scoring heuristic- Four player types:
RandomPlayer,HeuristicPlayer,AlphaBetaPlayer, andMCTSPlayer given connect4Game— wires Connect Four into the genericGametypeclassConnect4Tournament— round-robin tournament between all player typesConnect4Demo—@mainprogram playing home/away matches between player types
Run from sbt with sbt run and choose TicTacToeDemo or Connect4Demo. Each
program plays home-and-away matches between all player type pairs, printing the
board state and heuristic score after each move, e.g.:
HOME: Perfect (X) vs MCTS(500) (O)
==================================================
Move 1 (X plays cell 4 — row 1, col 1):
...
.X.
...
[heuristic=7.0]
...
Result: X (Perfect) wins!
Run the Connect Four round-robin tournament from sbt:
sbt "runMain com.phasmidsoftware.gambit.examples.connect4.Connect4Tournament [gamesPerPairing] [seed]"
Both arguments are optional. gamesPerPairing defaults to 6; seed defaults to
System.currentTimeMillis() so repeated runs produce different results unless a
seed is specified. Sample output:
Connect Four Tournament (6 games per pairing)
Player P W D L GD Pts
--------------------------------------------------
1. AlphaBeta(d=6) 60 48 1 11 +37 145
2. AlphaBeta(d=4) 60 45 1 14 +31 136
3. MCTS(i=500) 60 37 2 21 +16 113
4. MCTS(i=200) 60 31 1 28 +3 94
5. Heuristic 60 16 1 43 -27 49
6. Random 60 0 0 60 -60 0
Full Scaladoc at: https://rchillyard.github.io/Gambit/latest/api/
To regenerate and publish:
sbt ghpagesPushSite
GamePlayingDesign.md— design of the game-playing framework: typeclasses, alpha-beta, MCTS with tree reuse, MENACE, Connect Four tournament, heuristic conventions, and future directions
- Actor-based parallel rollouts for MCTS (Akka/Pekko)
- Heuristic rollouts for stronger MCTS play
- MENACE self-play, X/O symmetry, and persistence
- Chess —
given chessGame: Game[ChessState, ChessMove, Boolean] - Bridge —
given bridgeGame: Game[BridgeState, Card, Seat]; four-player; MCTS with determinization for hidden information