Several months ago I decided to get a little more rigorous about how I test changes to Prophet. With previous versions of Prophet, I would make a change, run some test suites consisting of a few hundred or a thousand tactical positions, play a few games online to convince myself the change was good, and that was it. That doesn’t really cut it anymore, though. Fruit and the other engines that have followed suit have shown the importance of having a solid testing methodology that accurately measures how effective changes are. So, to that end, I found some sparring partners and took my first measurements:
Rank | Name | Elo | games | score | oppo. | draws |
---|---|---|---|---|---|---|
1 | matheus | 101 | 7360 | 67% | -20 | 16% |
2 | prophet2 | 86 | 7360 | 64% | -17 | 14% |
3 | lime | 35 | 7360 | 56% | -7 | 12% |
4 | gerbil | -25 | 7360 | 46% | 5 | 15% |
5 | prophet3-20160903 | -54 | 8000 | 41% | 12 | 17% |
6 | elephant | -138 | 7360 | 28% | 27 | 11% |
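
As a rough sanity check on how to read these columns, the score percentages can be reproduced from the ratings. The sketch below assumes that `oppo.` is the average rating of an engine's opponents and that the standard logistic Elo formula is a close enough approximation of whatever model the rating tool uses; the function name `expected_score` is made up for illustration.

```python
def expected_score(elo: float, opp_elo: float) -> float:
    """Expected score (0..1) under the standard logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** ((opp_elo - elo) / 400.0))


# Rows copied from the table above:
# (name, Elo, average opponent Elo, reported score)
rows = [
    ("matheus", 101, -20, 0.67),
    ("prophet2", 86, -17, 0.64),
    ("lime", 35, -7, 0.56),
    ("gerbil", -25, 5, 0.46),
    ("prophet3-20160903", -54, 12, 0.41),
    ("elephant", -138, 27, 0.28),
]

for name, elo, opp, reported in rows:
    predicted = expected_score(elo, opp)
    print(f"{name:20s} predicted {predicted:.1%}  reported {reported:.0%}")
```

Under those assumptions the predicted scores land within about a percentage point of the reported ones, which suggests the Elo gap between an engine and its average opponent is what the score column summarizes.
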
Then Christmas happened, and I found myself with a little spare time to work on Prophet. I implemented bitboards, and then magic bitboards, and a few other speed optimizations. Suddenly it looked like I needed some new sparring partners!
Rank | Name | Elo | games | score | oppo. | draws |
---|---|---|---|---|---|---|
1 | prophet3-20170118 | 91 | 24436 | 68% | -42 | 17% |
2 | matheus | 65 | 10648 | 58% | 5 | 20% |
3 | prophet2 | 23 | 10649 | 51% | 10 | 16% |
4 | lime | -24 | 10649 | 44% | 17 | 13% |
5 | gerbil | -81 | 10643 | 35% | 24 | 14% |
6 | elephant | -191 | 10648 | 21% | 39 | 10% |
Here is the new pool I ended up with. I like this a lot, as Prophet3 sits right in the middle, with engines both above and below it. That seems like a pretty good cross section. As Prophet3 continues to improve, I’ll just add stronger engines to the mix and drop the weakest ones off the bottom.
Rank | Name | Elo | games | score | oppo. | draws |
---|---|---|---|---|---|---|
1 | myrddin | 84 | 4811 | 63% | -7 | 18% |
2 | tcb | 46 | 4810 | 57% | -4 | 18% |
3 | Horizon | 30 | 4812 | 55% | -2 | 17% |
4 | jumbo | 19 | 4972 | 53% | -2 | 18% |
5 | prophet3 | 7 | 12733 | 51% | -2 | 22% |
6 | madeleine | -26 | 4972 | 46% | 4 | 20% |
7 | matheus | -36 | 4812 | 44% | 4 | 20% |
8 | Beowulf | -67 | 4812 | 39% | 7 | 20% |
9 | prophet2 | -71 | 4812 | 39% | 7 | 18% |