And Then There Were Two

Komodo 9, written by Don Dailey, Larry Kaufman and Mark Lefler. Available (1) with Fritz GUI from Amazon ($80ish as of 5/28), (2) for download with Fritz GUI from ChessBase.com ($73.50 w/o VAT as of 5/28) and (3) directly from the Komodo website without GUI for $59.98; also available as part of a 1 year subscription package for $99.97.

Stockfish 6, written by the Stockfish Collective. Open-source and available at the Stockfish website.

—–

Now that Houdini seems to have gone gentle into that good night, there are two engines vying for the title of strongest chess engine in the world. Those two engines – Stockfish and Komodo – have each seen new releases in recent months. Stockfish 6 was released at the end of January, while Komodo 9 became available at the end of April from komodochess.com and the end of May from ChessBase.

Last year I wrote a review of Komodo 8 and Stockfish 5 that was republished at ChessBase.com, and much of what I wrote there applies here as well. Fear not, frazzled reader: you don’t need to go back and read that review, as most of the key points will be reiterated here.

First things first: any top engine (Komodo, Stockfish, Houdini, Rybka, Fritz, Hiarcs, Junior, Chiron, Critter, Equinox, Gull, Fire, Crafty, among many others) is plenty strong to beat any human player alive. This is not because these engines are all equally strong. While they don’t always play the absolute best moves, none of the aforementioned engines ever makes big mistakes. Against fallible humans, that’s a recipe for domination. It’s nearly useless – not to mention soul-crushing! – to play full games against the top engines, although I do recommend using weaker engines (Clueless 1.4, Monarch, Piranha) as sparring partners for playing out positions or endgames.

Even if all the major engines can beat us, they’re not all created equal. Three major testing outfits – CCRL, CEGT, and IPON – engage in ongoing and extensive testing of all the best engines, and they do so by having the engines play thousands of games against one another at various time controls. In my previous review I noted that Komodo, Stockfish and Houdini were the top three engines on the lists, and in that order. This remains the case after the release of Komodo 9 and Stockfish 6:

CCRL (TC 40 moves/40 min, 4-cpu computers):
1. Komodo 9, 3325 (Komodo 8 was rated 3301)
2. Stockfish 6, 3310 (Stockfish 5 was rated 3285)
3. Houdini 4, 3269

CEGT
40/4: 1. Komodo 9, 2. Stockfish 6, 3. Houdini 4
G/5’+3”: 1. Komodo 9, 2. Stockfish 6, 3. Houdini 4
40/20: 1. Komodo 9, 2. Stockfish 6, 3. Houdini 4 (NB: list includes multiple versions of each engine)
40/120: 1. Stockfish 6, 2. Komodo 8 (does not yet include version 9), 3. Houdini 4 (NB: list includes multiple versions of each engine)

IPON
1. Komodo 9, 3190 (Komodo 8 was 3142)
2. Stockfish 6, 3174 (Stockfish 4 was 3142)
3. Houdini 4, 3118

The results are fairly clear. Komodo 9 is ever so slightly stronger than Stockfish 6 when it comes to engine-engine play, and this advantage seems to grow when longer time controls are used.
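To put those rating gaps in perspective, here is a rough sketch that converts them into expected scores using the standard Elo logistic formula. The testing groups compute their ratings with their own tools and opponent pools, so treat this as an approximation rather than a description of their methods; the function names are mine.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score (0..1) for A against B under the standard Elo logistic model."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


# Ratings copied from the CCRL and IPON lists quoted above.
LISTS = {
    "CCRL": {"Komodo 9": 3325, "Stockfish 6": 3310, "Houdini 4": 3269},
    "IPON": {"Komodo 9": 3190, "Stockfish 6": 3174, "Houdini 4": 3118},
}


def komodo_edge(list_name: str, opponent: str) -> float:
    """Komodo 9's expected score against `opponent` on the named list."""
    ratings = LISTS[list_name]
    return expected_score(ratings["Komodo 9"], ratings[opponent])


if __name__ == "__main__":
    for list_name in LISTS:
        for opponent in ("Stockfish 6", "Houdini 4"):
            edge = komodo_edge(list_name, opponent)
            print(f"{list_name}: Komodo 9 vs {opponent}: {edge:.1%}")
```

A gap of 15 or 16 points works out to an expected score of roughly 52% for Komodo 9 against Stockfish 6, which is why "ever so slightly" is the right description.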

For my purposes, though, what’s important is an engine’s analytical strength. This strength is indicated by engine-engine matches, in part, but it is also assessed through test suites and – perhaps most importantly – by experience. Some engines might be more trustworthy in specific types of positions than others or exhibit other misunderstandings. Erik Kislik, for instance, reports in his April 2015 Chess Life article on the TCEC Finals – some of which appeared in his earlier Chessdom piece on TCEC Season 6 – that only Komodo properly understood the imbalance of three minor pieces against a queen. There are undoubtedly other quirks known to strong players who use engines on a daily basis.

In my previous review I ran Komodo, Stockfish and Houdini (among others) through two test suites on my old Q8300. Since then I’ve upgraded my hardware, and now I’m using an i7-4790 with 12 GB of RAM and an SSD for the important five- and six-man Syzygy tablebases included with ChessBase’s Endgame Turbo 4. (Note: if you have an old-fashioned hard drive, only use the five-man tablebases in your search; using the six-man set will slow the engine analysis down dramatically.) Because I have faster hardware I thought that a more difficult test suite would be in order, and – lucky me! – just such a suite was recently made available in the TalkChess forums. I gave Komodo 9 and Stockfish 6 one minute per problem to solve the 112 problems in the suite, and the results were as follows:

Komodo 9 solved 37 out of 110 problems (33.6%) with an average time/depth of 20.04 seconds and 24.24 ply. Stockfish 6 solved 30 out of 110 (27.3%) with an average time/depth of 20.90 seconds and 29.70 ply. (Note that while there are 112 problems in the suite, two of them were rejected by both engines because they had incomplete data.) The entire test suite along with embedded results can be found at:

http://www.viewchess.com/cbreader/2015/6/6/Game1753083657.html

I have also been using both Komodo 9 and Stockfish 6 in my analytical work and study. So that you might also get a feeling for how each evaluates typical positions, I recorded a video of the two at work. Both engines ran simultaneously (2 CPUs, 2 GB of RAM) as I looked at a few games of interest, most of which came from Alexander Baburin’s outstanding e-magazine Chess Today. The video is 14 minutes long. You can replay the games at this link:

http://www.viewchess.com/cbreader/2015/6/6/Game1752975735.html

Komodo 9 and Stockfish 6 in comparative analysis

Even a brief glance at the above video will make clear just how good top engines are becoming in their ability to correctly assess positions, but it also shows (in Gusev-Averbakh) that they are far from perfect. They rarely agree fully in positions that are not clear wins or draws, and this is due to the differences in evaluation and search between the two. Broadly speaking, we can say that evaluation is the set of criteria or heuristics each engine uses to ‘understand’ a position, while search is the way that the engine ‘prunes’ the tree of analysis. While many engines might share similar traits in their evaluation or search, no two are identical, and this produces the differences in play and analysis between them.
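For readers who want to see the division of labor between evaluation and search, here is a toy sketch of plain alpha-beta search over a hand-made tree. It is not code from either engine: the tree, the `evaluate` stand-in and the leaf counter are all made up for illustration, and real engines layer many riskier pruning heuristics on top of this basic idea.

```python
from math import inf
from typing import List, Union

# A toy game tree: an int is a leaf position (already scored from the
# maximizing side's point of view); a list holds the positions reachable
# in one move.
Tree = Union[int, List["Tree"]]


def evaluate(leaf: int) -> int:
    """Stand-in for a static evaluation; here the leaf simply is its score."""
    return leaf


def alphabeta(node: Tree, alpha: float, beta: float,
              maximizing: bool, counter: List[int]) -> float:
    """Minimax with alpha-beta pruning; counter[0] tallies evaluated leaves."""
    if isinstance(node, int):
        counter[0] += 1
        return evaluate(node)
    if maximizing:
        best = -inf
        for child in node:
            best = max(best, alphabeta(child, alpha, beta, False, counter))
            alpha = max(alpha, best)
            if alpha >= beta:  # the opponent would never allow this line
                break
        return best
    best = inf
    for child in node:
        best = min(best, alphabeta(child, alpha, beta, True, counter))
        beta = min(beta, best)
        if alpha >= beta:  # we already have something better elsewhere
            break
    return best


TOY_TREE: Tree = [[3, 5], [2, 9], [0, 1]]

if __name__ == "__main__":
    visited = [0]
    score = alphabeta(TOY_TREE, -inf, inf, True, visited)
    print(f"best score {score}, leaves evaluated {visited[0]} of 6")
```

On this tree the search returns 3 after evaluating only four of the six leaves. Plain alpha-beta never changes the answer; it only skips lines that cannot matter. The more aggressive pruning that buys extra depth gives up that guarantee, which is where missed resources can creep in.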

Stockfish 6 is a rather deep searcher. It achieves these depths through aggressive pruning of the tree of analysis. While there are real advantages to this strategy, not the least of which is quick analytical sight and tactical ingenuity, there are some drawbacks. Stockfish can miss some resources hidden very deep in the position. I find it to be a particularly strong endgame analyst, in part because it now reads Syzygy tablebases and refers to them in its search. Stockfish is an open-source program, meaning that it is free to download and that anyone can contribute a patch, but all changes to evaluation or search are tested on a distributed network of computers (“Fishtest”) to determine their value.

Komodo 9 is slightly more aggressive in its pruning than is Komodo 8, and it is slightly faster in its search as well. (Both changes seem to have been made, to some degree, with the goal of more closely matching Stockfish’s speed – an interesting commercial decision.) While Komodo’s evaluation is, in part, automatically tuned through automated testing, it is also hand-tuned (to what degree I cannot say) by GM Larry Kaufman.

The result is an engine that feels – I know this sounds funny, but it’s true – smart. It seems slightly more attuned to positional nuances than its competitors, and as all the top engines are tactical monsters, even a slight positional superiority can be important.  I have noticed that Komodo is particularly good at evaluating positions where material imbalances exist, although I cannot say exactly why this is the case!

As more users possess multi-core systems, the question of scaling – how well an engine is able to make use of those multiple cores – becomes increasingly important. Because it requires some CPU cycles to hand out different tasks to the processors in use, and because some analysis will inevitably be duplicated on multiple CPUs, there is not a linear relation between number of CPUs and analytical speed.

Komodo 8 was reputedly much better than Stockfish 5 in its implementation of parallel search, but recent tests published on the TalkChess forum suggest that the gap is narrowing. While Stockfish 6 sees an effective speedup of 3.6x as it goes from 1 to 8 cores, Komodo 9’s speedup is about 4.5x. And the gap is further narrowed if we consider the developmental versions of Stockfish, where the speedup is now around 4x.
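One way to read those figures is as parallel efficiency, meaning the speedup divided by the number of cores. The short sketch below does only that arithmetic on the numbers quoted above; the names are mine, and it says nothing about how the TalkChess tests were actually run.

```python
# Effective speedups going from 1 to 8 cores, as quoted above.
SPEEDUPS_1_TO_8 = {
    "Stockfish 6": 3.6,
    "Komodo 9": 4.5,
    "Stockfish (development)": 4.0,
}
CORES = 8


def parallel_efficiency(speedup: float, cores: int) -> float:
    """Fraction of the ideal (linear) speedup that is actually achieved."""
    return speedup / cores


if __name__ == "__main__":
    for engine, speedup in SPEEDUPS_1_TO_8.items():
        eff = parallel_efficiency(speedup, CORES)
        print(f"{engine}: {speedup}x on {CORES} cores = {eff:.0%} efficiency")
```

By this measure Stockfish 6 gets 45% of the ideal eightfold gain, Komodo 9 about 56%, and the developmental Stockfish builds 50%.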

Hardcore engine enthusiasts have, as the above suggests, become accustomed to downloading developmental versions of Stockfish. In an effort to serve some of the same market share, the authors of Komodo have created a subscription service that provides developmental versions of Komodo to users. This subscription, which costs $99.97, entitles users to all official versions of Komodo released in the following year along with developmental versions on a schedule to be determined. Only those who order Komodo directly from the authors are currently able to choose this subscription option.

The inevitable question remains: which engine should you choose? My answer is the same now as it was in my previous review. You should choose both – and perhaps more.

Both Komodo and Stockfish are insanely strong engines. There remain some positions, however, where one engine will get ‘stuck’ or otherwise prove unable to discern realistic (i.e. human) looking moves for both sides. In that case it is useful to query another engine to get a second (or perhaps even third) opinion. I find myself using Komodo 9 more than Stockfish 6 in my day-to-day work, but your mileage may well vary. Serious analysts, no matter their preference, will want to have both Komodo 9 and Stockfish 6 as part of their ‘teams.’

4 thoughts on “And Then There Were Two”

  1. Chuck

    Actually, I found your statement that you found Komodo to be “an engine that feels smart” to be quite descriptive and helpful in evaluating purchases among the maze of chess engines. Thanks for another informative and very applicable review.

  2. mccreadyandchess

    You should do a lot more with your website seeing as you are not on the payroll of a publishing company. A more honest approach is always refreshing. I did have a bash at writing reviews myself but I am a bit too cynical for that lark.

    Nice site, easy to read.
