Teaching a computer to play the game of Go is difficult

Go is a 4,000-year-old game in which each player tries to surround the opponent's groups of stones in order to capture them. The winner is the player who controls the most territory on the board when both players pass. The game presents a challenge for artificial intelligence (AI) researchers because of its complexity.

It is difficult to assess the quality of a particular Go move. Traditional AI strategies that work well for games like chess don't work as well for Go, because they rely on an algorithm called "Minimax". Minimax searches for the move that leads to the best outcome for you, assuming the opponent always replies with the move that thwarts you the most.
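
To make the idea concrete, here is a minimal sketch of Minimax in Python. The GameState interface (legal_moves, play, is_over, evaluate) is hypothetical, invented here for illustration; note the call to evaluate(), which is exactly the piece that is hard to supply for Go.

```python
# A minimal Minimax sketch for a generic two-player game.
# GameState is a hypothetical interface, not a real library.

def minimax(state, depth, maximizing):
    """Return the best achievable score, assuming the opponent
    always replies with the move that is worst for us."""
    if depth == 0 or state.is_over():
        return state.evaluate()  # needs a good evaluation function
    if maximizing:
        return max(minimax(state.play(m), depth - 1, False)
                   for m in state.legal_moves())
    else:
        return min(minimax(state.play(m), depth - 1, True)
                   for m in state.legal_moves())
```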

The downside of Minimax is that it needs a move-evaluation function to work properly. Go is so complex that no good way of evaluating its moves has been found, so Minimax performs poorly there. If Minimax is paired with a method that does not require a good move evaluator, however, it works much better.

A mathematical technique called Monte-Carlo is a good fit for Go. Monte-Carlo (in its simplest form) tries random moves, plays games out to the end to find out which moves lead to wins, and chooses the best move based on those results. In more sophisticated versions, Monte-Carlo also selects moves using previous knowledge about that move, or about similar moves, accumulated through its playing experience.
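
The simplest form can be sketched in a few lines. The game interface here (legal_moves, play, is_over, winner, current_player) is the same hypothetical one as above; the idea is just to finish many games with random moves and keep the move with the best win rate.

```python
import random

# Pure Monte-Carlo move selection: for each candidate move, play many
# random games to completion and pick the move that wins most often.

def monte_carlo_move(state, playouts_per_move=100):
    best_move, best_win_rate = None, -1.0
    for move in state.legal_moves():
        wins = 0
        for _ in range(playouts_per_move):
            s = state.play(move)
            while not s.is_over():  # finish the game with random moves
                s = s.play(random.choice(s.legal_moves()))
            wins += s.winner() == state.current_player()
        win_rate = wins / playouts_per_move
        if win_rate > best_win_rate:
            best_move, best_win_rate = move, win_rate
    return best_move
```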

A variant of Monte-Carlo, called Objective Monte-Carlo, evaluates moves differently. Objective Monte-Carlo picks a move if it gives the player a good chance of improving its position, a method that does not require a good move-evaluation function. Based on the quality of the chosen move, the computer assigns it a score called "Ob". The AI player then adjusts this score according to whether it wins or loses. In other words, when Ob is high, the AI explores riskier moves; when Ob drops, it plays safer moves.
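
The article does not give the exact formula behind Ob, so the sketch below is only an illustrative assumption: each move's score combines its observed win rate with an exploration bonus, and a risk parameter that grows after wins and shrinks after losses scales that bonus up or down.

```python
import math

# Illustrative scoring in the spirit of the "Ob" score described above.
# The weighting is an assumption, not the actual Objective Monte-Carlo
# formula from the paper.

def ob_score(wins, visits, total_visits, risk):
    win_rate = wins / visits
    exploration = risk * math.sqrt(math.log(total_visits) / visits)
    return win_rate + exploration

def pick_move(stats, total_visits, risk):
    # stats maps each move to (wins, visits); every move visited once.
    return max(stats, key=lambda m: ob_score(*stats[m], total_visits, risk))

def update_risk(risk, won, step=0.1):
    # A win raises the appetite for riskier moves; a loss lowers it.
    return risk * (1 + step) if won else risk * (1 - step)
```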

The move-selection method chooses the next move with odds that increase with the move's assessment. This strategy, called backpropagation, weights each move by its performance and by the total number of games played with it. Objective Monte-Carlo, with its specialized backpropagation function, is more efficient than Minimax coupled with Monte-Carlo.
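
A hedged sketch of what such a backpropagation step might look like: after each playout, the result is propagated back up the tree, updating each node's win and visit counts so that later selections can weight moves by performance and by how often they have been tried. The Node class is a hypothetical minimal tree node, not code from the source.

```python
# Backpropagation: walk from the playout's leaf back to the root,
# updating the statistics that the selection step reads.

class Node:
    def __init__(self, parent=None):
        self.parent = parent
        self.wins = 0
        self.visits = 0

def backpropagate(leaf, won):
    node = leaf
    while node is not None:
        node.visits += 1
        if won:
            node.wins += 1
        node = node.parent
        won = not won  # the result flips for the opponent's nodes
```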

Objective Monte-Carlo is a strategy that works even for games other than Go. Because it does not require a way of evaluating moves, it can be applied to games for which a move-evaluation function is hard to build. And because it does not need to explore every possibility, it scales more easily. This could make it a much better alternative to other AI strategies for other games.

