Prophet4 finally has a proper testing rig! A few weeks ago, I purchased a Dell Alienware system: an 8-core (16 logical) AMD Ryzen 7 5800 with 32 GB of RAM and an AMD Radeon RX600XT graphics card.
This replaces the single-core laptop I had been using. This is pretty exciting, as it should allow P4 testing to go at up to 8x the speed it did before. Just to break the machine in, I ran the first-ever gauntlet with P4. Here are the results:
Rank | Name | Elo | Games | Score | Draws |
---|---|---|---|---|---|
1 | plisk 0.2.7d | 103 | 29785 | 66% | 19% |
2 | tjchess 1.3 | 81 | 29787 | 63% | 22% |
3 | jazz 840 | 52 | 29786 | 58% | 21% |
4 | myrddin 0.87 | 44 | 29786 | 57% | 21% |
5 | Horizon 4.4 | 15 | 29785 | 52% | 19% |
6 | jumbo 0.4.17 | -5 | 29783 | 49% | 21% |
7 | p3-20181124 | -8 | 29786 | 48% | 25% |
8 | p4-20210407 | -86 | 17211 | 36% | 32% |
9 | prophet2_ja | -94 | 16000 | 35% | 18% |
10 | tcb 0052 | -182 | 29785 | 23% | 15% |
This closely matches the results I obtained a few years ago when I announced the release of Prophet3 20180811. One notable exception is TCB, which seems to have done much worse on this machine, for whatever reason.
So, it seems P4 is already on par with P2 (-86 vs. -94), and is within 78 Elo or so of P3 (-8). This is encouraging, as the P4 rewrite is still not complete. I’m feeling pretty confident that when it is, it will be at least as strong as P3, and then the real work of improving can begin. And, with the fast feedback from a proper testing rig, I don’t think that should be all that difficult. My goal is to achieve +100 Elo over P3 before doing a release.
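For a rough feel of what those Elo gaps mean in head-to-head terms, here is a minimal sketch using the standard logistic Elo model. The function names are just for illustration, and this is not how the ratings in the table were produced: rating tools fit the whole pool of games at once, so a simple pairwise calculation like this will not reproduce their numbers exactly.

```python
import math


def expected_score(elo_diff: float) -> float:
    """Expected score (0..1) for a player rated elo_diff points above the opponent,
    under the standard logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


def elo_diff_from_score(score: float) -> float:
    """Inverse of expected_score: the Elo gap implied by a score fraction.
    Only defined for scores strictly between 0 and 1."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score - 1.0)


if __name__ == "__main__":
    # Ratings taken from the gauntlet table above.
    p3_elo, p4_elo = -8, -86
    gap = p3_elo - p4_elo  # 78 Elo in P3's favour

    print(f"P3 vs P4 gap: {gap} Elo")
    print(f"Expected P3 score against P4: {expected_score(gap):.3f}")

    # The release goal: P4 at +100 Elo over P3.
    print(f"Expected score at +100 Elo: {expected_score(100):.3f}")
```

Under this model, a 78 Elo gap corresponds to an expected score of about 61% for the stronger side, and the +100 Elo release goal corresponds to about 64%.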
Note: the versions of the chess engines used in this gauntlet are quite old by now; many of them likely have updates that could be significantly stronger. During the rewrite process, I’ve tried to keep everything “as is”, since the goal is to compare P4 to P3, not to other engines. Once the rewrite is complete and P4 grows in strength, new testing engines will be cycled in as the current set is cycled out.