Gomoku's winning strategy

Gomoku's winning strategy - 35 moves

When playing according to standard Gomoku rules, Black needs no more than 35 moves to win. The article presents to your attention the complete winning strategy and the corresponding game algorithm.

Demonstration of the complete solution - here - you can play and find the longest options. The program always wins and spends no more than 35 moves on it. The source code of the application, the decision itself and examples of parties at the end of the article.

I will not dwell on the rules and tactics of the game. The topic was discussed in detail at habr here , as well as decision algorithms here and here .

Small retreat

Before the era of smartphones, tic / tac-toe "five in a row" (Gomoku, Renju) was one of the most popular time killers in school. It was more interesting to consider combinations of the development of the national economy of North Africa or the classification of clover flowers.

In the fall of 1985, the girls from our 10 "b" were taken from a math lesson. We, the remaining six children, were most likely to have informal communication with a mathematics teacher on abstract topics. The teacher entered the class in silence, handed out a piece of paper to everyone in the cell and began writing the names of those present on the board. We depressed, was planned independent work or a quiz. But the list on the board turned into a standings and the championship rules were announced to us. Each with each series of five games. The prize for the winner is the top five in the magazine. According to the results of the tournament I was lucky to win, but the game did not end there. The teacher promised to put the five to all the guys if the winner wins him in a row all five games in the series. The right of the first move is given to the winner. Contrary to our teacher's statement that, on such a condition, if you play correctly, you can win both 10 and 100, and in general any number of games in a row, victory seemed to me an incredible luck.

9 years later, in 1994, Dr. Lewis Victor Allis stated that he had proof of this hypothesis in an article by Go-Moku and Threat-Space Search . The author did not publish his winning strategy, which allows him to verify the evidence. However, in his 1996 book, Searching for Solutions in Games and Artificial Intelligence , he gave a general description of the algorithms. In conclusion, the procedure for checking the completeness of a winning strategy is specifically mentioned, which is based on the correctness of the implementation of the “sequence of threats” search algorithm and the analysis of the opponent’s counterplay options.

The examples of solutions given in the article and in the book with the “correct moves” of rivals, which are part of a winning strategy, demonstrate the weakness of the algorithm used.

For example, in the figure, the solution of the program for the standard Gomoku rules. If White responds to the 10th move with g9, Black responds with j10 and then j8, then the game ends with 29 moves instead of 45. Next, the program “did not notice” the combination of Black’s “sequence of threats” twice in 17 moves after the 16th and after 26- White's move. And in the end, if White makes the 36th move f12 instead of j12, then they will hold on at least until the 49th move. To be fair, in this example, all Black’s moves do not leave White a single chance to complete the game in his favor.

On the Internet, I found several references to similar work of finding a winning strategy. Unresolved is the question of the optimality of the solutions found. What is the minimum number of moves Black needs to win?

So, having some free time, modern computing resources and paying tribute to children's hobbies 33 years after the memorable school championship, I set the task of finding the optimal winning strategy for Black in Gomoku.

Digitizing position on board

The recording of the game is quite primitive. On the field only 225 cells. Accordingly, each cell is encoded by 1 byte. To record a batch of 35 moves, only 35 bytes are required. But such a record is poorly suited for evaluating a position due to two reasons: the same position can be obtained in a different sequence of moves and the symmetric positions are not taken into account.

Achieving the goal of the game - building five stones in a row - can be done in one of four directions: vertically, horizontally, and in two diagonals. Thus, we can represent any position as a set of lines. Horizontal and vertical lines are 15 cells long and diagonal lines are from 1 to 15 cells long. Each turn changes the value of 4 lines at once in different directions.

The task of assessing the position is to determine all significant figures for each line. For simplicity, each cell line is described by 2 bits. The first bit is filled when a white stone is set, the second bit is a black stone. Each line contains no more than 15 cells and is encoded in a 32-bit integer. Thus, the search for figures on the line is reduced to a comparison of the numerical value of the line through a sliding window with a bit pattern of the figure.

In the example shown in the figure, the position is described by 26 lines. Respectively encoded with 104 bytes, while the normal batch entry requires only 17 bytes.
It is not difficult to guess that all symmetries - turns and mirror images - are obtained by simply changing the number (shuffling) and the direction of the lines. To identify the position and quickly search collections, this principle implements a 32-bit hash function, which gives different values only for asymmetric positions.

The use of symmetries significantly reduces the number of considered positions. For example, the number of options for the second move is reduced from 224 to 35.

When searching for solutions and combinations (this will be discussed below), the calculated positions make up the vertices of the multi-layer graph. Vertices are grouped into layers according to the number of filled cells. The moves make up the edges of the graph connecting the vertices of the adjacent layers. When searching for failed moves are discarded, the edges are removed and part of the vertices loses connectivity with the main branch. Therefore, after the calculation steps, garbage collection is performed (or the graph is rebuilt from the vertex).

In the development process, several coding algorithms were considered, but the one described above showed the highest position estimation speed.

We evaluate the position

An important factor in assessing the position is how significant figures are built rivals.

Five - if such a figure is found on the board, the game is over. For the standard Gomoku rules, no six, seven, etc. give a win. Therefore, the five, as, indeed, all other figures, requires the absence of their stones on the adjacent cells in the line.

The open four is 6 cells long, the middle four are occupied by stones of the same color, the outer ones are necessarily empty. Well, as for the five, your stones are absent in the neighboring cells. A very strong figure means winning even on someone else’s turn.

Four - the length of 5 cells, one (any) of the five cells free. Gives a win on his turn. It creates a threat and compels the opponent to make a move in a free cell if he does not have his own four. Gives 5 points in the position rating under protection.

An open triple is 6 or 7 cells long, the outermost cells are necessarily free. For 6 cells, three out of four medium ones are occupied by stones of the same color, one is free. For 7 cells, three medium ones are occupied by stones of the same color. The figure in its turn becomes an open four, if the opponent does not have a four or an open three. On someone else's course, it creates a threat and forces the opponent to close the top three or set up his four in response. The 6th cell three has 1 upward stroke and 3 closing ones. The 7th cell three has 2 upward moves and only 2 closing ones. Gives from 2 to 4 points in the ranking position.

Three , or three closed - the length of 5 cells, any three of which are occupied by stones of the same color. A triple on his turn can be turned into a quadruple and is used during attack and defense, creating a threat more than the open three of the opponent. Gives 1 point in the ranking position.

An open (perspective) two - from 6 to 7 cells long. When attacking, it is transformed into an open three. Gives 1 or 2 points in the position rating.

Fork - at the same time two or more threats that can not be closed in one move. There are 3x3 forks (two open triples), 3x4 (open three and four) and 4x4 (two open fours. Forks give a win if the opponent has no greater threat - a four or an open three for a 3x3 plug, or the opponent cannot close the fork consistently, creating big threats - a sequence of fours for a 3x3 plug.

A combination is a continuous sequence of threats and defenses against more significant threats from an opponent, leading to a positive result for the player. Combinations are attacking (or winning) and defensive.

An attacking or winning combination is successful if any defensive or attacking moves of the opponent have been answered in return moves leading to a win. The attacking combination ends with the installation of a fork, which the opponent cannot close.

The protective combination, on the contrary, is successful when the opponent ceases to create threats, or the limit of moves for calculation is exceeded. The protective combination consists of moves to protect or create a greater threat to the opponent.

When evaluating a position, a winning combination is searched. If successful, we won. Otherwise, if there are no threats from the opponent - the state is neutral. If there are threats from an opponent, we search for a protective combination. If successful, the state is neutral, in case of failure, we lost.

Since the number of options for attacking and forced retaliatory moves is rather limited, it is permissible to search for combinations to a sufficiently large depth. During the initial construction of the optimal strategy, the allowed depth of the search for combinations was set at more than 25 moves. When recalculating the solution for the implementation of the javascript position estimation algorithm, the allowable search depth was reduced to 17 moves.
When calculating the optimal strategy, the depth of search for a winning combination from above was additionally limited by the target maximum number of moves.

We are looking for a solution

So, we rated the given position as neutral and choose what the next move will be. Our behavior in this case depends on whether we are attacking or defending. For the attacking side, the complete solution will be a sequence of moves in which for any opponent’s return move the position is evaluated as winning (a winning combination is found) or contains the following own move in the solution. It is worth noting that to calculate the optimal strategy, the attacking side is always black, the defensive side is white.

The attacking side needs to find only one move, leading to the fastest victory. In the conditions of lack of resources, the attacking side artificially limits the number of options for busting, investigating first of all the moves leading to the position with the highest rating score. After finding any solutions, in the direction of the longest of them, the attacker expands the range of options, exploring less rated positions in order to achieve a reduction in the length of the decision.

It is enough for the defending party to find one single move that does not lead to the victory of the opponent within a given limit of moves. All free cells can be used for enumeration.
To reduce the number of protection moves, we use the “skip move” algorithm. Skip the defensive move and look for a winning offensive combination. If successful, possible moves of defense may be limited to moves that affect the success of the combination found. On average, at each step, this allows you to reduce the search area by 4-6 times. It is necessary to take into account that among the ignored moves there may be longer branches of the solution. Therefore, to find the best solutions, we use the “skip move” algorithm only during the primary search.

Calculate the strategy

All components are ready, we put the first black stone in the center of the field, launch a search for a solution and ... After a few hours, my laptop’s resources run out and we have to admit defeat “in battle, but not in battle”.

To tell the truth, I had computational power at my disposal with a hundred and fifty Xeon cores and a terabyte of free RAM. But, dented that in the mid-nineties, Allis had only 10 SUN SPARCstation 2 each with 128 MB of RAM, felt remorse for unsportsmanlike behavior and decided to limit the amount of RAM on the java-machine to 1GB and allocated only 1 stream for the task processor. It could somehow compensate for my GHz compared to its MHz. Plus, he promised himself at the end of his work to transfer the algorithms to javascript under the browser.

So, the search for strategies needed to begin with the solution of debut etudes. A detailed description of the debuts for the game Renju in Russian can be found in the well-known books of the Sagar “From the debut to the middlegame” and Mikhail Kozhin and Alexander Nosovsky “Ringing of the stones” here . The books are already 20 years old, but since then there has been little such literature. The more recent collection of Dmitry Yepifanov "Tiger in a cage" of 2015, unfortunately, is not available in electronic form.

The search for optimal solutions for the debut was carried out according to the following algorithm. At the first step, a preliminary calculation was carried out without limiting the length of the lot. Then, the longest solutions were optimized: replacing the found combinations with shorter solutions for the final steps and searching for shorter branches of solutions for all intermediate moves. Optimization was performed before reaching the target limit for all branches of the solution. Then the target limit was decreased and an attempt was made to optimize to the new value.

With the 3rd vertical debut shown in the figure, there were no problems. It turned out a complete set of solutions. The most difficult positions after the 4th move i8 and j10 as a result were solved in 31 moves. Then the target limit was set at 35 moves per game.

From the diagonal for the decision traditionally chose the 7th debut. The most difficult position occurs after the 4th move of g9. Solutions of admissible length were found for 6 moves g8 and g7.

But for this variant of the 6th move on j9 I could not manage to find a solution shorter than 33 moves. It was almost a disaster. Out of desperation, I tried solutions for the alternative 5th move, as well as all other types of diagonal debuts. Debuts were decided to the end, but nothing shorter than 39 moves per game could not be found.

Returning to the initial 7th diagonal debut, he redid the algorithm for generating sentences for attacking moves. As a result, moves leading to positions with a rating score from the third ten, suddenly began to give results and reduce the length of the solution path. The variability of the calculation with such an amount became quite large. With a solution depth of 12 moves, there were more than 2 million positions (excluding positions when searching for combinations). The continuation rested on the allocated 1GB of RAM. Thus, in order to check the solution before the final fork, in some cases it was necessary to separately resolve positions from the 18th move.

After the 7th diagonal debut was decided in a given 35 moves, you could celebrate the victory - the fight for the center was won. Ahead, there was still a large amount of routine computational work, calculations of “non-optimal” white moves for completeness of the strategy. Of the total optimal strategy, the response to such moves as a result was 80%. Fortunately, they were automatically solved completely on a preliminary calculation after the 2nd move, and all this volume was added to the optimal strategy in a couple of days.

So, solutions for all 2 moves found. We put the first black stone in the center of the field, start the search for a solution and do not even have time to feel the importance of the moment - the initial position is solved in 35 moves. The graph of the optimal winning strategy is built.

We check ourselves

The next step is to test the solution. Completely turn off the intelligence of the defending side. After each black move, White goes to any empty cell of the board. The position obtained after White’s move must either be found in the decision graph or evaluated as winning for the number of moves that do not exceed the longest branch in the initial position. When evaluating each position, we check the winning combination found by White’s admissible moves before Black draws the final piece - five in a row.

The check was performed several times to complete. The final error-free run in single-threaded mode took 14 hours. In the course of the check, errors were discovered and corrected as a result of differences in the depth of search for combinations, assumptions of skipping, duplication of symmetrical positions.

Answer the question - whether the solution in 35 moves is the most optimal. According to my research for a number of the longest branches of the vertical debut, it is possible to find more optimal solutions with a length of 33 turns. But for the diagonal after the 6th move on j9, a lot of time was spent on finding a solution of 33 moves, the variation for Black expanded to 50 moves at each step and to no avail. Strictly prove the absence of a decision in 33 moves fails, the time allocated for the project came to an end and the decision was made to stop at the target limit of 35 moves.

Convert from java to javascript

Publication of the solution to the problem requires clarity. To use the solution directly in the browser it took:

Reduce the search depth combinations in the evaluation of positions to 17 moves. This resulted in a 2-3 times increase in the number of pre-calculated moves of the optimal strategy.
Convert the binary format of the decision graph to a JSON sequence of moves. This format is more convenient in javascript and visual.
Convert java classes to javascript modules, except for the search for solutions. Here, in the web-interface, replace calls to rest-services with local functions.

List of application classes:

Board - party management on the board, visual interface
Vertex - the vertex of the decision graph, inherited from the position
Edge - the edge of the decision graph, the course of connecting positions
Layout - position, contains a collection of lines
Line - a line in predetermined directions, contains a collection of figures
Figure - figure, determines the type and the beginning of the figure on the line
Pattern - patterns of shapes to compare when searching

The complete JSON solution can be downloaded from gomoku.json .

Sources in the repository on GitHub .

For clarity, here are examples of the longest games obtained in the demonstration by clicking Next.

Diagonal Debut:

Vertical debut: