AlphaGOLAD Zero: Mastering the Game of Life and Death with Self-Play

Abstract

Monte Carlo tree search, neural-network-based policy and value evaluation, and self-play are three techniques widely used in reinforcement learning. Inspired by the recent AlphaGo Zero paper, we design, construct, train, and evaluate an agent that plays the Game of Life and Death (GOLAD) using a combination of these techniques. GOLAD is a two-player game based on Conway’s Game of Life (GOL) in which players can manipulate their cells after each simulation step. Our agent obtains positive results against a random agent on a small board, but challenges remain in transferring it to larger boards and in playing against more sophisticated opponents.
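
For readers unfamiliar with the underlying game, the following is a minimal sketch of one simulation step on a two-player board. It uses the standard Conway rules (a live cell survives with two or three live neighbours; an empty cell with exactly three live neighbours becomes live). The names `EMPTY`, `P1`, `P2`, and `step`, the non-wrapping board edges, and the rule that a newborn cell belongs to the player owning the majority of its three live neighbours are assumptions made for illustration; they are not taken from the text above. The sketch also leaves out the players' own moves.

```python
from collections import Counter

# Hypothetical cell encoding: 0 = empty, 1 = player one, 2 = player two.
EMPTY, P1, P2 = 0, 1, 2


def step(board):
    """Advance a two-player Game of Life board by one generation."""
    rows, cols = len(board), len(board[0])
    nxt = [[EMPTY] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Owners of the live cells among the (up to) 8 neighbours.
            # Assumption: the board does not wrap around at the edges.
            owners = [
                board[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c) and board[rr][cc] != EMPTY
            ]
            if board[r][c] != EMPTY and len(owners) in (2, 3):
                # Survival: the cell keeps its current owner.
                nxt[r][c] = board[r][c]
            elif board[r][c] == EMPTY and len(owners) == 3:
                # Birth. Assumption: the new cell goes to the majority owner,
                # which is unique because three neighbours cannot tie.
                nxt[r][c] = Counter(owners).most_common(1)[0][0]
    return nxt


if __name__ == "__main__":
    blinker = [
        [EMPTY, EMPTY, EMPTY],
        [P1, P1, P2],
        [EMPTY, EMPTY, EMPTY],
    ]
    for row in step(blinker):
        print(row)
```

Under these assumptions, the horizontal row in the example turns into a vertical column owned entirely by player one, since both newborn cells have two player-one neighbours and one player-two neighbour.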