I’ve been working on “the big rewrite” of my C chess engine lately. I mentioned some time back that it was just barely capable of playing a legal game of chess. At the time it did a depth-first search to a fixed depth of 4. Since then, I’ve added a timing algorithm. The idea is that we don’t necessarily want to search to a fixed depth; we want to figure out how much time we can afford to spend on the next move, and then search as deeply as we can in that time, because generally the deeper the search tree, the stronger the move.
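To make that concrete, here is a minimal sketch of the kind of allocation I mean. The function name and the “divide the clock by an estimated number of moves remaining” heuristic are my own illustration, not necessarily what Prophet does:

/* Illustrative time allocation: budget a slice of the remaining
 * clock for the next move.  Times are in centiseconds, matching
 * the log below.  The divisor is a guess at how many moves are
 * left in the game. */
long allocate_time_cs(long clock_remaining_cs)
{
    long budget = clock_remaining_cs / 30;  /* assume ~30 moves left */
    return budget > 1 ? budget : 1;         /* always search a little */
}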
The problem is, you don’t really know ahead of time how deep you’re going to be able to search in the allotted time. The approach taken by Prophet (and virtually every other chess program that I know of) is to use iterative deepening. First, search to depth 1, then depth 2, then depth 3, and so on.
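In code, the driver is just a loop around the fixed-depth search. The sketch below is illustrative only; the Move type, search(), and time_up() are hypothetical placeholders, not Prophet’s actual interface:

typedef int Move;                          /* placeholder move encoding */
#define MAX_PLY 64
#define NO_MOVE 0

extern int  search(int depth, Move pv[]);  /* fixed-depth search; fills pv */
extern int  time_up(void);                 /* has the budget expired? */

Move think(void)
{
    Move pv[MAX_PLY];
    Move best = NO_MOVE;

    /* Iterative deepening: search depth 1, then 2, then 3, ...
     * until the clock runs out.  Each completed iteration leaves
     * us with a playable best move. */
    for (int depth = 1; ; depth++) {
        search(depth, pv);
        if (time_up())
            break;        /* iteration may be incomplete; keep old best */
        best = pv[0];
    }
    return best;
}

A real driver can be less conservative than this sketch: as noted below, the previous principal variation is searched first at the next depth, so even an aborted iteration’s move is often playable (as with the depth 6 abort in the log below).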
In the output below, there is one line each time Prophet changes its mind about the ‘principal variation’ (what it thinks the optimal line of play is given that each side is trying to win), and one line at the end of each iteration. The score represents ‘centipawns’ — so 507 == “5.07 pawns.” Time is recorded in centiseconds. It may seem odd to use centiseconds as opposed to milliseconds or just seconds, but that’s the XBoard protocol. In this example Prophet completed the depth 5 search and started but did not fully complete the depth 6 search:
DEPTH  SCORE  TIME  NODES    PV
1      219    0     2        a8b8
1      224    0     14       c7c6
1      231    0     15       d7d6
1      234    0     16       d7d5
1      629    0     20       a6c4
1      629    0     40       a6c4
2      214    0     41       a6c4 d1d4
2      214    0     187      a6c4 d1d4
3      319    0     364      a6c4 e2c4 f4g2
3      527    0     1183     a8c8 c4a5 a6e2
3      532    1     4127     c7c6 c4a5 a6e2
3      1024   1     7877     d4e2 c3f3 a6c4
3      1029   2     8924     f4e2 c3d3 a6c4
3      1029   2     9173     f4e2 c3d3 a6c4
4      614    3     18415    f4e2 c3d3 a6c4 d3d4
4      614    6     23954    f4e2 c3d3 a6c4 d3d4
5      922    22    139575   f4e2 c4a5 c5b6 c3e3 b6a5
5      922    47    383672   f4e2 c4a5 c5b6 c3e3 b6a5
6      507    242   1472162  f4e2 c4a5 c5b6 c3e3 b6a5 d1d4
# Aborting search depth=1, ply=5 on time...
move f4e2
At first glance this may seem like an awfully inefficient solution. After all, search time grows exponentially with depth. If we knew we were likely to search to depth 6 (or 10, or whatever), wouldn’t it be more efficient to just skip the shallower searches? Surprisingly, no. That is, not if your program uses knowledge from each search to guide the next one.
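To make the overhead concrete, suppose every node has a uniform branching factor $b$ (an idealization; real trees are messier). The iterations up through depth $d-1$ together visit about

$$\sum_{i=1}^{d-1} b^i = \frac{b^d - b}{b - 1} \approx \frac{b^d}{b - 1}$$

nodes, only a fraction $1/(b-1)$ of the final depth-$d$ search. At an effective branching factor of $b = 4$, that is roughly 33% extra, and the improved move ordering the early iterations provide typically saves far more than it costs.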
At the moment the only a priori knowledge that Prophet uses is the principal variation itself. When Prophet begins a depth N search, the principal variation from depth N-1 is tried first. This quickly establishes tight alpha/beta bounds, which helps “cut off” useless branches of the search tree. It also avoids a nasty dilemma when time runs out mid-search: depth N-1 said move A was best, and depth N currently says move B is best but hasn’t gotten around to searching move A yet. What do you do? Searching A first means the dilemma never arises: whenever depth N prefers B, B has actually been compared against A.
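In code this can be as simple as a bonus in the move-ordering function. The sketch below reuses the placeholder Move type from the earlier sketch and invents a prev_pv array; neither is Prophet’s actual interface:

#define PV_BONUS 1000000   /* outranks every other ordering heuristic */

/* Score a move for ordering purposes at the given ply.  The move
 * matching the previous iteration's PV gets the top score so it is
 * searched first and establishes tight alpha/beta bounds early. */
int order_score(Move m, int ply, const Move prev_pv[], int prev_pv_len)
{
    if (ply < prev_pv_len && m == prev_pv[ply])
        return PV_BONUS;
    return 0;   /* placeholder: captures, history heuristic, etc. */
}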
The next improvement will be to add some hashing. Hashing will significantly improve the “sharing of information” from one search to the next. The main challenge here is producing a good hash key (signature). The key should be nearly unique: there should be very little chance of two distinct positions producing the same key. Additionally, the keys generated should be uniformly distributed over the key space. And it’s very important that computing the key be fast!
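The standard way to get all three properties is Zobrist hashing: assign one random 64-bit number to each (piece, square) pair and XOR together the numbers for the pieces actually on the board. A minimal sketch, assuming a simple array board representation (not Prophet’s actual one):

#include <stdint.h>
#include <stdlib.h>

/* One random 64-bit number per (piece, square), 6 piece types per
 * side; filled once at startup.  A real engine also keeps numbers
 * for castling rights and the en passant file. */
static uint64_t zobrist[12][64];
static uint64_t zobrist_side;

static uint64_t rand64(void)
{
    /* Illustrative only; use a proper 64-bit PRNG in practice. */
    uint64_t r = 0;
    for (int i = 0; i < 4; i++)
        r = (r << 16) ^ (uint64_t)(rand() & 0xFFFF);
    return r;
}

void zobrist_init(void)
{
    for (int p = 0; p < 12; p++)
        for (int sq = 0; sq < 64; sq++)
            zobrist[p][sq] = rand64();
    zobrist_side = rand64();
}

/* board[sq] holds a piece index 0..11, or -1 for an empty square. */
uint64_t compute_key(const int board[64], int white_to_move)
{
    uint64_t key = 0;
    for (int sq = 0; sq < 64; sq++)
        if (board[sq] >= 0)
            key ^= zobrist[board[sq]][sq];
    if (!white_to_move)
        key ^= zobrist_side;
    return key;
}

Because XOR is its own inverse, the key never has to be rebuilt from scratch during the search: making a move just XORs the moving piece out of its from-square and into its to-square, which is what makes the scheme fast enough to run at every node.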
That said, I think I’m going to put a temporary hold on the development of the engine itself and work on building a bit of infrastructure. I’ll describe what I have in mind in a separate post, soon.
How do weighted randomized searches compare? Specifically, I’m wondering about a randomized algorithm that exponentially favors short moves. Of course DFS is a good way to go; I’m not at all suggesting that is a bad route.
It seems to me that the unpredictability of randomisation is ideal for the offensive component: doing something that the other player did not consider. For defense, though, there are clear and significant drawbacks, namely the risk of missing a very obvious move.
Suppose each level of depth reduces the likelihood of random choice (weight) by 5x. Then the search space would not be fully enumerated, but with very high probability you’d finish the first level of moves, hopefully protecting against the major defensive flaw of randomization. Anyway, I’m just curious whether this strategy is at all reasonable, or what similar approaches have been taken.
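If I understand the proposal, something like the following sketch captures it (purely hypothetical; the names are made up and the 5x decay constant is just the one from the comment). The first level of moves is always expanded, and each level below it is 5x less likely to be sampled:

#include <stdlib.h>

/* Probability of expanding a node at a given ply, decaying 5x per
 * level: ply 1 -> 1.0, ply 2 -> 0.2, ply 3 -> 0.04, and so on, so
 * the first level is (effectively) always fully enumerated. */
static double expand_probability(int ply)
{
    double p = 1.0;
    while (--ply > 0)
        p /= 5.0;
    return p;
}

/* Flip the weighted coin for one candidate branch at this ply. */
static int should_expand(int ply)
{
    return (double)rand() / RAND_MAX < expand_probability(ply);
}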