The Week in Chess

Saturday, May 31, 2014

Komodo 7a x64 vs. Stockfish, Houdini, Gull, Strelka - Gauntlet 100 Rounds

TCEC Live Tournament Season 6 finally ended a couple of days ago, and the undisputed winner is Stockfish, with a 7-point advantage in 64 games against Komodo 7x. The rating list sites are now updating their lists to show the current standings, especially since the official Stockfish 5, which is almost equal in strength to the TCEC entry, was released a few hours ago. Some sites, like sedatcanbaz.com and this site, were showing Stockfish as number 1 even before TCEC Season 6 began. Other sites, like the popular CCRL, still have Houdini 4 Pro, a clear loser in TCEC, as number 1, ahead of the latest Komodo 7a, the runner-up.

While waiting for the ongoing Stockfish 5 gauntlet matches to finish in a few days, I decided to publish my private tests of Komodo 7a, done earlier against the four strongest other chess engines: Stockfish, Houdini, Gull and Strelka. The aim was to see whether the narrow ELO advantage of Komodo 7a over Houdini 4 Pro would still hold with more games played.

The test was run on the same AMD quad-core computer used in previous tests, under the same tournament conditions: 1 minute base + 1 second increment time control, in 100-round gauntlets run 4 times.

In the final result, the ELO ranking was maintained, but the rankings while the gauntlet tournaments were in progress were not consistent for Komodo, Houdini and Gull; Stockfish was number 1 and Strelka number 5 throughout the 4 batches. Only in the 4th batch did the ranking finally align with the latest Owl Rating List. The ELO rating gap between Komodo 7a and Houdini 4 was a mere 5 ELO points, which is so close statistically that it could easily change with more games.

Here is the estimated ELO strength of the contestants and their score statistics:

Rank  Engine                  Est. ELO  Raw Elo  Games  Score%  Points  Win   Loss  Draw  Chg
1     Stockfish 14051110 x64  3173.98   85.71    400    61.13   244.5   151   62    187   5.14
2     Komodo 7a x64           3114.41   6.20     1600   51.13   818.0   431   395   774   0.36
3     Houdini 4 Pro x64       3109.20   1.15     400    48.75   195.0   87    97    216   -0.02
4     Gull 3 x64              3083.82   -10.38   400    47.75   191.0   86    104   210   2.79
5     Strelka 6               3063.85   -82.67   400    37.88   151.5   71    168   161   -3.95
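
To put the 5-point gap between Komodo 7a and Houdini 4 Pro in perspective, here is a rough sketch of an error margin for their head-to-head result over the four batches (97 wins, 87 losses and 216 draws from Komodo's side). It assumes the standard logistic ELO formula and a normal approximation; it is not how the rating tool computed the estimates in the table, so the numbers will not match the Est. ELO column exactly. The function names are my own.

```python
import math

def elo_diff(score):
    """Logistic ELO difference implied by a score fraction (0 < score < 1)."""
    return 400.0 * math.log10(score / (1.0 - score))

def elo_interval(wins, losses, draws, z=1.96):
    """Return (elo, low, high) for a head-to-head result.

    Uses a normal approximation on the mean score per game;
    z=1.96 corresponds to a roughly 95% interval.
    """
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games
    # Variance of a single game result (1, 0.5 or 0 points) around the mean.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / games
    margin = z * math.sqrt(var / games)
    return elo_diff(score), elo_diff(score - margin), elo_diff(score + margin)

# Komodo 7a vs. Houdini 4 Pro, all four batches, from Komodo's side.
elo, low, high = elo_interval(97, 87, 216)
print(f"{elo:+.1f} ELO, interval [{low:+.1f}, {high:+.1f}]")
```

Under these assumptions the interval is several times wider than the gap itself and includes zero, which is another way of saying that 400 games cannot separate the two engines.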

Here are the 4 gauntlet batches that tell the performance story:

Komodo 7a x64 vs. Stockfish, Houdini, Gull, Strelka - 100RR 1M1S Batch 1
Rank  Engine                  Score      Ko (W-L-D)
1     Komodo 7a x64           208.0/400  -
2     Stockfish 14051110 x64  56.5/100   32-19-49
3     Houdini 4 Pro x64       51.0/100   20-18-62
4     Gull 3 x64              47.0/100   22-28-50
5     Strelka 6               37.5/100   18-43-39



Komodo 7a x64 vs. Stockfish, Houdini, Gull, Strelka - 100RR 1M1S Batch 2
Rank  Engine                  Score      Ko (W-L-D)
1     Komodo 7a x64           206.5/400  -
2     Stockfish 14051110 x64  62.0/100   37-13-50
3     Gull 3 x64              50.5/100   24-23-53
4     Houdini 4 Pro x64       42.0/100   14-30-56
5     Strelka 6               39.0/100   15-37-48


Komodo 7a x64 vs. Stockfish, Houdini, Gull, Strelka - 100RR 1M1S Batch 3
Rank  Engine                  Score      Ko (W-L-D)
1     Komodo 7a x64           203.5/400  -
2     Stockfish 14051110 x64  67.0/100   46-12-42
3     Gull 3 x64              48.5/100   22-25-53
4     Houdini 4 Pro x64       48.0/100   24-28-48
5     Strelka 6               33.0/100   13-47-40


Komodo 7a x64 vs. Stockfish, Houdini, Gull, Strelka - 100RR 1M1S Batch 4
Rank  Engine                  Score      Ko (W-L-D)
1     Komodo 7a x64           200.0/400  -
2     Stockfish 14051110 x64  59.0/100   36-18-46
3     Houdini 4 Pro x64       54.0/100   29-21-50
4     Gull 3 x64              45.0/100   18-28-54
5     Strelka 6               42.0/100   25-41-34
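
The combined table near the top is just the four batches added together. A minimal sketch of that bookkeeping, with the win-loss-draw figures copied from the batch tables above (the variable and function names are my own):

```python
# Opponent results against Komodo 7a per batch, as (wins, losses, draws),
# copied from the four batch tables.
BATCHES = [
    {"Stockfish": (32, 19, 49), "Houdini": (20, 18, 62), "Gull": (22, 28, 50), "Strelka": (18, 43, 39)},
    {"Stockfish": (37, 13, 50), "Houdini": (14, 30, 56), "Gull": (24, 23, 53), "Strelka": (15, 37, 48)},
    {"Stockfish": (46, 12, 42), "Houdini": (24, 28, 48), "Gull": (22, 25, 53), "Strelka": (13, 47, 40)},
    {"Stockfish": (36, 18, 46), "Houdini": (29, 21, 50), "Gull": (18, 28, 54), "Strelka": (25, 41, 34)},
]

def combine(batches):
    """Sum each opponent's (wins, losses, draws) over all batches."""
    totals = {}
    for batch in batches:
        for engine, (w, l, d) in batch.items():
            tw, tl, td = totals.get(engine, (0, 0, 0))
            totals[engine] = (tw + w, tl + l, td + d)
    return totals

def gauntlet_player(totals):
    """Komodo's own record: every opponent win is a Komodo loss, and vice versa."""
    wins = sum(l for _, l, _ in totals.values())
    losses = sum(w for w, _, _ in totals.values())
    draws = sum(d for _, _, d in totals.values())
    return wins, losses, draws

totals = combine(BATCHES)
print(totals["Stockfish"])      # (151, 62, 187)
print(gauntlet_player(totals))  # (431, 395, 774)
```

Both results agree with the Win, Loss and Draw columns of the combined table.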


400 games played / Tournament finished

Tournament start: 2014.05.27, 08:40:07
Latest update: 2014.05.30, 16:51:10
Level: Blitz 1/1
Hardware: AMD Phenom(tm) II X4 945 Processor with 1.8 GB Memory
Operating system: Windows 7 Ultimate Professional Service Pack 1 (Build 7601) 64 bit
Table created with: Arena 3.5

Download the gauntlet matches PGN games here.

Tuesday, May 27, 2014

Komodo 7a x64 - Gauntlet Matches 100 Rounds

Komodo 7a x64 is a UCI chess engine by Don Dailey, Larry Kaufman and Mark Lefler, released on May 21, 2014. It is currently a finalist at the TCEC Live Tournament against Stockfish.

Komodo 7a scored 76.47% with 1179 wins, 173 losses and 548 draws against a selection of the 19 strongest chess engines in the 100-round gauntlet matches.
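
For readers who want to check the percentage, the score counts a win as 1 point and a draw as half a point. A small sketch (the function name is my own):

```python
def score_percent(wins, losses, draws):
    """Return (points, score%) with a win worth 1 point and a draw 0.5."""
    games = wins + losses + draws
    points = wins + 0.5 * draws
    return points, 100.0 * points / games

points, pct = score_percent(1179, 173, 548)
print(points, round(pct, 2))  # 1453.0 76.47
```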

Komodo defeated Houdini 4, Gull 3, Strelka 6 and the rest of the weaker engines, but lost to Stockfish 14051110. The performance is good enough for second place in the 100-round gauntlet tournament and in the Owl Computer Chess Engines Rating List - Best Versions. However, Komodo 7a is only number 3 in the simulated round robin against the 20 Top Chess Engines Selection, where Houdini 4 ranks one place higher. The ELO rating gap between Komodo 7a and Houdini 4 is so narrow that they may swap ranks in the rating list when more games are added.

Here are the performance stats of Komodo 7a:
Rank  Engine                  ELO      Raw      Games  Score%  Points  Win   Loss  Draw  Chg
1     Stockfish 14051110 x64  3168.84  242.52   100    56.00   56.0    31    19    50    -0.50
2     Komodo 7a x64           3114.05  201.48   1900   76.47   1453.0  1179  173   548   3114.05
3     Houdini 4 Pro x64       3109.22  154.99   100    43.00   43.0    20    34    46    -3.77
4     Gull 3 x64              3081.03  149.80   100    42.50   42.5    18    33    49    3.69
5     Strelka 6               3067.80  114.70   100    38.50   38.5    22    45    33    -1.29
6     Fire 3.0 x64            2987.86  106.02   100    36.00   36.0    13    41    46    -0.46
7     Critter 1.6a x64        3014.72  100.12   100    36.50   36.5    20    47    33    0.06
8     Equinox 2.02 x64        2972.99  74.33    100    31.50   31.5    9     46    45    3.05
9     Rybka 4.1 x64           2961.42  29.97    100    26.00   26.0    8     56    36    0.79
10    Shredder 12 x64         2833.03  -41.93   100    17.50   17.5    3     68    29    -1.02
11    Hannibal 1.4b x64       2831.71  -68.19   100    15.50   15.5    4     73    23    -0.01
12    Spike 1.4               2809.96  -69.54   100    15.00   15.0    3     73    24    1.18
13    Naum 4.2 x64            2782.41  -76.69   100    14.50   14.5    3     74    23    -0.08
14    Protector 1.6.0 x64     2841.22  -77.90   100    14.00   14.0    2     74    24    0.36
15    Murka 3 x64             2716.87  -123.72  100    11.50   11.5    4     81    15    4.62
16    Junior 13.8.04 x64      2737.18  -132.16  100    11.00   11.0    4     82    14    -0.07
17    Senpai 1.0 x64          2782.92  -135.39  100    10.00   10.0    2     82    16    3.17
18    Deep Hiarcs 14          2818.46  -145.16  100    9.00    9.0     1     83    16    -1.33
19    DiscoCheck 5.2 x64      2711.95  -150.17  100    10.00   10.0    4     84    12    2.75
20    Sjeng 2010              2751.40  -153.09  100    9.00    9.0     2     84    14    0.75
Download the gauntlet matches PGN games here.

Owl Computer Chess Engines Rating List - 05/27/2014

The Owl Computer Chess Engines Rating List was released on 05/27/2014.

View the full rating list here.

Wednesday, May 21, 2014

Stockfish 1405 Devs vs. Houdini, Komodo, Gull, Strelka

Over the last few days I've been privately testing some of the latest Stockfish development versions, just to watch for possible regressions or, better yet, for a major ELO improvement that would make it time to update the Owl Rating List.

Three Stockfish versions were tested, released on May 13, 17 and 19. I chose the four strongest other chess engines, Houdini 4, Komodo TCEC, Gull 3 and Strelka, as gauntlet sparring opponents to exercise the power of Stockfish and to see whether the rankings would follow their natural order. Each Stockfish version faced the same opponents under the same tournament conditions: 100 rounds, 1 minute base + 1 second increment time control, running on the same hardware and software. The ELO median was arbitrarily set to 3100 when estimating the ELO ratings.

The results showed a slight ELO increase with each successive Stockfish development release, but not one big enough to warrant updating the Owl Rating List. The ranking order of the strongest chess engines remained the same, even though the test sample was relatively small.

Here is the summary of all the tests combined:

Rank Engine ELO Est Games Score% Points Win Loss Draw
1 Stockfish_14051921_x64 3166.40 400 65.88 263.5 168 41 191
2 Stockfish_14051712_x64 3156.91 400 63.63 254.5 164 55 181
3 Stockfish_14051322_x64 3147.77 400 62.75 251.0 153 51 196
4 Houdini 4 Pro x64 3106.45 300 42.50 127.5 41 86 173
5 Komodo TCEC x64 3065.99 300 37.50 112.5 43 118 139
6 Gull 3 x64 3050.69 300 33.83 101.5 28 125 147
7 Strelka 6 3005.80 300 29.83 89.5 35 156 109
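
As an independent rough check on the trend, each version's points out of 400 games can be converted into a performance figure against the common opponent pool. This sketch assumes the plain logistic ELO formula, which is not the method behind the ELO Est column (that one is anchored to the 3100 median), so only the ordering is expected to agree, not the exact gaps. The function name is my own.

```python
import math

def elo_from_score(points, games):
    """Logistic ELO difference implied by a score against a pool of opponents."""
    score = points / games
    return 400.0 * math.log10(score / (1.0 - score))

# Points out of 400 games for each development version, from the summary table.
results = {
    "Stockfish_14051322": 251.0,
    "Stockfish_14051712": 254.5,
    "Stockfish_14051921": 263.5,
}

performance = {name: elo_from_score(pts, 400) for name, pts in results.items()}
for name, elo in performance.items():
    print(f"{name}: {elo:+.1f} ELO versus the opponent pool")
```

The three figures rise with each release, in line with the summary table.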


Here are the individual gauntlet tournaments:

Stockfish 14051921 x64 vs. Houdini, Komodo, Gull, Strelka - Test Gauntlet Match 100R 1M1S
Rank  Engine                  Score      St (W-L-D)
1     Stockfish_14051921_x64  263.5/400  -
2     Houdini 4 Pro x64       40.5/100   13-32-55
3     Komodo TCEC x64         36.0/100   10-38-52
4     Gull 3 x64              32.0/100   9-45-46
5     Strelka 6               28.0/100   9-53-38



Stockfish 14051712 x64 vs. Houdini, Komodo, Gull, Strelka - Test Gauntlet Match 100R 1M1S
Rank  Engine                  Score      St (W-L-D)
1     Stockfish_14051712_x64  254.5/400  -
2     Houdini 4 Pro x64       43.0/100   15-29-56
3     Komodo TCEC x64         36.5/100   15-42-43
4     Gull 3 x64              35.0/100   10-40-50
5     Strelka 6               31.0/100   15-53-32



Stockfish 14051322 x64 vs. Houdini, Komodo, Gull, Strelka - Test Gauntlet Match 100R 1M1S
Rank  Engine                  Score      St (W-L-D)
1     Stockfish_14051322_x64  251.0/400  -
2     Houdini 4 Pro x64       44.0/100   13-25-62
3     Komodo TCEC x64         40.0/100   18-38-44
4     Gull 3 x64              34.5/100   9-40-51
5     Strelka 6               30.5/100   11-50-39


400 games played / Tournament finished

Tournament start: 2014.05.16, 11:32:34
Latest update: 2014.05.17, 03:14:08
Level: Blitz 1/1
Hardware: AMD Phenom(tm) II X4 945 Processor with 1.8 GB Memory
Operating system: Windows 7 Ultimate Professional Service Pack 1 (Build 7601) 64 bit
Table created with: Arena 3.5

Download the gauntlet match games PGN here.

Tuesday, May 13, 2014

Stockfish 14051110 x64 - Gauntlet Matches, 100 Rounds

Stockfish 14051110 x64 is a UCI chess engine by Marco Costalba, Joona Kiiski and Tord Romstad, released on May 11, 2014.

Stockfish scored 82.58% with 1321 wins, 83 losses and 496 draws against a selection of the 19 strongest chess engines in the 100-round gauntlet matches. It defeated every opponent, earning 3169 ELO rating points and retaining the number ONE rank in the Owl Rating List. Stockfish gained 21 ELO rating points over the previous version, which is enough to lead the next strongest engine, Houdini 4, by 56 ELO points. The guesstimate of an increase of at least 30 ELO points was off target, which could be attributed to the fact that it was based only on self-testing.


Here are the performance statistics of Stockfish 14051110:

Rank Engine ELO Raw Games Score% Points Win Loss Draw Chg
1 Stockfish 14051110 x64 3169 251 1900 82.58 1569.0 1321 83 496 3169
2 Houdini 4 Pro x64 3113 179 100 39.00 39.0 10 32 58 -2
3 Gull 3 x64 3077 157 100 36.00 36.0 9 37 54 3
4 Komodo TCEC x64 3082 144 100 34.50 34.5 11 42 47 -2
5 Fire 3.0 x64 2988 103 100 28.00 28.0 4 48 48 -2
6 Critter 1.6a x64 3015 102 100 29.50 29.5 11 52 37 -2
7 Strelka 6 3069 99 100 30.00 30.0 15 55 30 -2
8 Rybka 4.1 x64 2961 47 100 22.50 22.5 8 63 29 -2
9 Equinox 2.02 x64 2970 34 100 21.50 21.5 9 66 25 -3
10 Shredder 12 x64 2834 -42 100 11.50 11.5 0 77 23 -2
11 Deep Hiarcs 14 2820 -49 100 11.00 11.0 0 78 22 -2
12 Spike 1.4 2809 -54 100 11.00 11.0 1 79 20 -1
13 Protector 1.6.0 x64 2841 -55 100 10.50 10.5 0 79 21 -1
14 Hannibal 1.4b x64 2832 -69 100 9.50 9.5 0 81 19 -2
15 Naum 4.2 x64 2782 -76 100 9.00 9.0 0 82 18 -3
16 Senpai 1.0 x64 2780 -99 100 8.00 8.0 1 85 14 -2
17 Murka 3 x64 2712 -126 100 7.00 7.0 2 88 10 -3
18 Junior 13.8.04 x64 2737 -162 100 5.00 5.0 1 91 8 -3
19 DiscoCheck 5.2 x64 2709 -175 100 4.50 4.5 1 92 7 -4
20 Sjeng 2010 2751 -208 100 3.00 3.0 0 94 6 -1
Download the gauntlet matches PGN games here.
