Sequential Environment Review (Subject: Solitaire Poker)

[Sequential Environment Review], [First Prize], [Second Prize], [Honorable Mention 1], [Honorable Mention 2], [Honorable Mention 3]

Sequential Environment Review:

The subject of this year for the sequential environment category, solitaire poker, is a combinatorial optimization problem with large search spaces. Since fully examining all possible settings is almost impossible, an approximation method is required. Probably due to the relative difficulty of the subject, only eight programs were submitted in this category, seven of which worked without problems. Various interesting ideas can be found in the submitted programs. The average level of the applicants was much higher than last year, affording the committee members much enjoyment of the judgement process.

Applicants used one or more combination of the following methods.
- Heuristics to find high-point hands
- Full search in spaces narrowed by heuristics
- Improvements by hill climbing methods
- Improvements by simulated annealing methods

Evaluation was made observing how each candidate improved the scores of the solutions as computation proceeded, for a card set artificially selected to achieve a high score and several sets randomly generated. Depending on the algorithms employed, submitted programs showed different behavior: Some obtained high scores quite rapidly, while others started slowly but made steady improvements. Final judgements were made taking into account both the time required to achieve reasonable scores and the final scores obtained.

[First Prize] : Mr. Yokoyama Daisaku
Department of Information and Communication Engineering, The Faculty of Engineering, The University of Tokyo

The program winning the first prize employs a simulated annealing algorithm based on exchanging the positions of two cards, starting from an initial state decided by heuristics. Initial temperature of 30,000 points, with a cooling rate of 0.95 for every 100 exchanges is used. When the temperature falls by 100 points, it restarts annealing at the initial temperature.

This program obtained good scores for all the card sets we measured. For card sets that can only achieve low scores, it scores slow in attaining reasonable scores, but gradually raises its score, finally reaching the top score among the entries. For sets that can achieve high scores, this program reaches reasonable scores quite early. It is finally outstripped, however, by some others.

The heuristics used for deciding the initial state are good for finding high-point hands, but not for low-point hands. This may be the reason of its slow startup for low-score sets. The annealing method seems to work fairly well for all tested card sets. The reason for being overtaken ultimately in some card sets may be that the 100 point threshold for restarting the annealing process is a little too large. However, this threshold setting should be profitable in regularly scoring good points.

Room for improvement may be in deciding the constants for annealing. For example, gradually lowering the threshold for iteratively restarting the annealing process might provide finer tuning.

[Second Prize] : Mr. Kurita Hideaki
Tokyo National College of Technology

A program using a hill-climbing method won the second prize. The initial state is given by simple heuristics, and a hill-climbing search is done based on exchanging the positions of two cards. Candidate exchanges are decided randomly. A distinctive feature of this program is that ten such hill-climbing searches from different initial states are tried concurrently, and the best among them is output as the score. When the score of one search becomes much lower than the best score so far (more than 100,000 points, which is written as a constant in the program), that search is abandoned and a new search is initiated from the configuration with the best score. The initial states are designed so that straight flushes can be found easily, and the magic number ten of concurrent searches is natural under the heuristics.

This program is slower than other prize-winning programs in finding initial configurations with reasonably high scores. This may be because simple heuristics for the initial states is not enough; it considers only straight flushes and no others. Also, ten concurrent searches take more time than a single search. However, for card sets that can only attain low scores, this program marks the highest scores within the measurement period. This program also marked good scores for card sets with high scores. Concurrent hill-climbing searching appears to reduce the risk of becoming stranded at local maxima. Another strong point of this algorithm is that it can easily be extended to a parallel version.

Using better heuristics for the initial state may significantly improve scores in the early stages. The number of concurrent searches, currently depending completely on the heuristics used, may be tuned for better results.

[Honorable Mention]

As the three other programs have both merits and demerits, they all are chosen for honorable mentions.

Honorable Mention 1: Mr. Tsuruoka Yoshimasa
Department of Information & Communications Engineering, The Faculty of Engineering, The University of Tokyo

Honorable Mention 1 uses a hill-climbing method, starting from a initial state decided by heuristics, using two sets of exchanges of card positions (four cards are moved) as the basic operation. It tries all the possible combinations of two exchanges, except that it won't try a second exchange when when the first exchange lowers the score too much (more than 22,000 points).

The heuristics for the initial state used in this program can result in relatively high scores early on. If we had run the programs for only short time periods, this one might have obtained the first prize. However, compared with other methods such as simulated annealing, this simple hill-climbing search seems to be more easily trapped at local maxima. Its score does not improve much over an extended period, sometimes allowing other programs to overtake it by a large margin. However, its basic operation of two exchanges seems to have significantly lowered the possibility of being caught in local maxima when compared to one-exchange operation schemes.

Honorable Mention 2: Mr. Kono Youichi
Others (Programmer)

Honorable Mention 2 attempts simulated annealing using an exchange of two card positions as the basic operation, starting from a random initial state. The initial temperature is 3,0000 points and cooling rate, fixed in the program, is 0.85. To set this number, the author carried out many experiments, which are described in detail in the attached documents.

The initial state is decided randomly without any heuristics. This makes it slower than others in attaining reasonably high scores. The annealing process seems to work well. The final scores within the measurement time for most tested card sets closely approach those of the program winning the first prize.

Honorable Mention 3: Mr. Hiruta Kazuhiro
School of Electronic Engineering, Faculty of Engineering, Kyoto University

Honorable Mention 3 decides high score hands for all the rows using heuristics first, and then searches all the possibilities of exchange of columns within rows. It is a full search within a search space limited by heuristics.

The heuristics for deciding rows looks good, obtaining reasonable scores within short periods of time. Considering only very shortcomputation time, such as one second, this program is one of the best. However, because the search space is completely limited by heuristics, its score is gradually overtaken by other programs. Since the problem did not mention any concrete time limits, the program should have tried some other methods for improvement after the search within the limited search space.

Sequential Environment Review