Note: This review has been updated as of 9/24 to reflect my testing and experience with the newly released Komodo 8.
Houdini 4, written by Robert Houdart. Standard (up to six CPU cores, $79.95 list) and Pro (up to 32 CPU cores, $99.95 list) versions with Fritz GUIs available. Also available directly from the Houdini website for approximately $52 (Standard) or $78 (Pro) as of 9/11/14.
Komodo 7a, written by Don Dailey, Larry Kaufman and Mark Lefler. Available directly from the Komodo website for $39.95.
Stockfish 5, written by the Stockfish Collective. Open-source and available at the Stockfish website.
Increasingly I’m convinced that a serious chess player must make use of chess technology to fully harness his or her abilities. This, as I have previously discussed, involves three elements: the GUI, the data, and the engine. ChessBase 12 is the gold standard for chess GUIs, and I will be reviewing a new book about proper use of that GUI in the near future. Here, however, I want to take up the thorny issue of choosing a chess engine. Which engine is ‘best’ for the practical player to use in his or her studies?
I put ‘best’ in scare-quotes because there are two ways to look at this question. (1) There is little question at this point that the best chess engines of the past five years can beat 99.9% of human players on modern hardware. So one way that engines are tested now is in a series of engine vs engine battles. While many people run private matches, there are three main public rating lists: IPON, CCRL and CEGT.
Here there is something of a consensus. Houdini, Stockfish and Komodo are the three top engines at the moment, with very little differentiating between them, and with the particular order of the engines varying due to time control and other criteria.
Update: The three lists mentioned above have tested Komodo 8.
- It is in first place on the IPON list, leading Stockfish 5 by 6 Elo points and Houdini 4 by 17.
- Komodo 8 appears on two of the CCRL lists. In games played at a rate of 40 moves in 4 minutes (40/4), Stockfish 5 leads Komodo 8 by 7 Elo points and Houdini 4 by 30 Elo points. In games played at the slower rate of 40 moves in 40 minutes (40/40), Komodo 8 has a 22 Elo point lead on Stockfish 5 and a 39 point lead on Houdini.
- Among the many CEGT lists, we find: (a) Stockfish 5 is first on the 40/4 list, followed by Komodo 8 and Houdini 4; (b) Houdini 4 leads the 5’+3″ list, followed by Stockfish 5 and Komodo 8; (c) Komodo 8 leads the 40/20 list followed by Stockfish 5 and Houdini 4; but (d) the 40/120 list has not yet been updated to include Komodo 8.
- Note: Larry Kaufman compiles the results from these lists and one other in a thread at Talkchess. He argues (a) that Komodo does better at longer time controls, and that (b) Komodo 8 is roughly equal in strength to the Stockfish development releases, which are slightly stronger than the officially-released Stockfish 5. </update>
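To put gaps of this size in perspective, here is a minimal sketch using the standard Elo expected-score formula. The function name and the selection of gaps are my own choices for illustration, and the rating lists may compute their figures with their own tools, so treat the output as a rough guide rather than anything the list maintainers publish.

```python
def expected_score(elo_diff: float) -> float:
    """Expected score (win = 1, draw = 0.5) for the higher-rated side
    under the standard Elo logistic model."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


# Rating gaps quoted in the update above (labels are illustrative).
quoted_gaps = {
    "IPON: Komodo 8 over Stockfish 5": 6,
    "IPON: Komodo 8 over Houdini 4": 17,
    "CCRL 40/40: Komodo 8 over Stockfish 5": 22,
    "CCRL 40/40: Komodo 8 over Houdini 4": 39,
}

for label, gap in quoted_gaps.items():
    print(f"{label}: {expected_score(gap):.1%} expected score")
```

Even the largest gap here works out to an expected score only a few points above 50%, which is why the order of the three engines shifts so easily from list to list.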
From my perspective, however, (2) analytical strength is more important. If all the engines are strong enough to beat me, I think that the quality of their analysis – the ‘humanness’, for lack of a better word – is critical. It used to be the case that humans could trick engines with locked pawn chains, for example, or that engines would fail to understand long-term compensation for exchange sacrifices. Such failings have largely been overcome as the engines and hardware have improved; nevertheless, there remain certain openings and types of positions that are more problematic for our metal friends. Michael Ayton offers one such position in the ChessPub forums; if you want to have a laugh, check out the best lines of play on offer by the engines reviewed here:
FEN: r1b2rk1/pp1nqpbp/3p1np1/2pPp3/2P1P3/2N1BN2/PP2BPPP/R2Q1RK1 w - c6 0 10
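For readers who don't have a GUI handy, here is a minimal Python sketch that expands the piece-placement field of that FEN into a text diagram. The function name is hypothetical and only the first FEN field is interpreted. The string is typed with a plain ASCII hyphen in the castling field, which is what FEN parsers expect.

```python
def fen_to_rows(fen: str) -> list[str]:
    """Expand the piece-placement field of a FEN string into eight
    8-character rows (rank 8 first), using '.' for empty squares."""
    placement = fen.split()[0]
    rows = []
    for rank in placement.split("/"):
        row = ""
        for ch in rank:
            # A digit means that many consecutive empty squares.
            row += "." * int(ch) if ch.isdigit() else ch
        rows.append(row)
    return rows


FEN = "r1b2rk1/pp1nqpbp/3p1np1/2pPp3/2P1P3/2N1BN2/PP2BPPP/R2Q1RK1 w - c6 0 10"

# Uppercase letters are White's pieces, lowercase are Black's.
for rank_number, row in zip(range(8, 0, -1), fen_to_rows(FEN)):
    print(rank_number, " ".join(row))
print("  a b c d e f g h")
```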
Among the multiple engines available, there are three that stand above the fray. These are Houdini by Robert Houdart, Komodo by the late Don Dailey, Larry Kaufman and Mark Lefler, and Stockfish. Houdini and Komodo are commercial engines, while Stockfish is open-source and maintained by dozens of contributors.
How can we understand the differences between the engines? Let’s consider two key components of chess analysis: search and evaluation. Search is the way that the engine ‘prunes’ the tree of analysis; because the number of possible positions grows exponentially with each ply (a single move by White or Black), modern engines trim that tree dramatically to obtain greater search depth. Evaluation is the set of criteria used by the engine to decipher or evaluate each position encountered during the search.
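To make the idea of pruning concrete, here is a minimal sketch of classical alpha-beta search on a toy tree. This is the textbook baseline, not the search used by Houdini, Komodo or Stockfish: the tree, the leaf scores and the leaf counter are all invented for illustration, and real engines layer far more aggressive (and riskier) pruning heuristics on top of it.

```python
def alphabeta(node, alpha, beta, maximizing, counter):
    """Textbook alpha-beta on a toy tree.

    A leaf is an int (a stand-in for the evaluation function's score);
    an inner node is a list of child nodes.
    """
    if isinstance(node, int):
        counter["leaves"] += 1  # count how many leaves get evaluated
        return node
    if maximizing:
        best = float("-inf")
        for child in node:
            best = max(best, alphabeta(child, alpha, beta, False, counter))
            alpha = max(alpha, best)
            if alpha >= beta:
                break  # the opponent would never allow this line
        return best
    best = float("inf")
    for child in node:
        best = min(best, alphabeta(child, alpha, beta, True, counter))
        beta = min(beta, best)
        if alpha >= beta:
            break  # we already have a better option elsewhere
    return best


# Hypothetical two-ply tree: three candidate moves, two replies each.
toy_tree = [[3, 5], [2, 9], [0, 1]]
counter = {"leaves": 0}
value = alphabeta(toy_tree, float("-inf"), float("inf"), True, counter)
print(value, counter["leaves"])  # prints: 3 4
```

The search returns the same value a full minimax would, but it evaluates only four of the six leaves. Once the first reply to the second and third candidate moves is seen to be worse than what the first move guarantees, the remaining replies are skipped.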
In a very general sense, what differentiates Houdini, Komodo and Stockfish are their search and evaluation functions. How they are different on a technical / programming level, I cannot say: Houdini and Komodo are closed-source and I can’t decipher code in any event. What I can do, however, is cite what some experts in the field have said, and then see if it coheres with my experience of the three engines.
Larry Kaufman, who works on Komodo, said in an interview on the Quality Chess blog that:
Komodo is best at evaluating middlegame positions accurately once the tactics are resolved. Stockfish seems to be best in the endgame and in seeing very deep tactics. Houdini is the best at blitz and at seeing tactics quickly. Rybka is just obsolete; I like to think of Komodo as its spiritual descendant, since I worked on the evaluation for both, although the rest of the engines are not similar. Fritz is just too far below these top engines to be useful.
…Komodo’s assessment of positions is its strong point relative to the other top two, Houdini best for tactics, Stockfish for endgames and whenever great depth is required. Both Houdini and Stockfish overvalue the queen, Komodo has the best sense for relative piece values I think. Komodo is also best at playing the opening when out of book very early.
Stockfish is, as Kaufman suggests, very aggressive in the way that it prunes the tree of analysis, searching very deeply but narrowing as the ply go forward. It is important to remember that each engine reports search depth and evaluation differently, so that (as Erik Kislik writes in a fascinating article on the recent TCEC superfinal) the way that Stockfish ‘razors’ the search means that its reported depth can’t be directly compared to Houdini or Komodo. Still, it does seem to search more deeply, if narrowly, than do its competitors. This has advantages in the endgame and in some tactical positions.
Houdini is a tactical juggernaut. It tends to do best on the various tactical test sets that some engine experts have put together, and it is fairly quick to see those tactics, making it useful for a quick analysis of most positions. Its numerical evaluations also differ from other engines in that they are calibrated to specific predicted outcomes.
A +1.00 pawn advantage gives a 80% chance of winning the game against an equal opponent at blitz time control. At +2.00 the engine will win 95% of the time, and at +3.00 about 99% of the time. If the advantage is +0.50, expect to win nearly 50% of the time. (from the Houdini website)
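As a rough illustration of what that calibration implies for in-between scores, here is a small sketch that linearly interpolates between the four figures in the quote. The interpolation is my own assumption, not Houdini's actual mapping, and I have taken "nearly 50%" as exactly 0.50. The function and constant names are hypothetical.

```python
from bisect import bisect_right

# Only the data points from the quote; "nearly 50%" is taken as 0.50 (assumption).
QUOTED_POINTS = [(0.50, 0.50), (1.00, 0.80), (2.00, 0.95), (3.00, 0.99)]


def rough_win_chance(eval_pawns: float) -> float:
    """Linearly interpolate a win chance between the quoted points.

    Straight-line interpolation is an illustrative assumption; the quote
    says nothing about scores below +0.50 or above +3.00, so those raise.
    """
    xs = [x for x, _ in QUOTED_POINTS]
    if not xs[0] <= eval_pawns <= xs[-1]:
        raise ValueError("outside the range covered by the quoted figures")
    i = min(bisect_right(xs, eval_pawns), len(xs) - 1)
    (x0, y0), (x1, y1) = QUOTED_POINTS[i - 1], QUOTED_POINTS[i]
    return y0 + (y1 - y0) * (eval_pawns - x0) / (x1 - x0)


for score in (0.75, 1.50, 2.50):
    print(f"+{score:.2f}: roughly {rough_win_chance(score):.0%}")
```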
Kaufman argues that his engine, Komodo, is the most positionally accurate of the three, and I don’t disagree. Kaufman is involved in the tuning of Komodo’s evaluation function; as he is a grandmaster, it does not seem outrageous to believe that his engine’s positional play might benefit from his chess expertise. The engine feels slightly ‘slower’ than Stockfish and Houdini (an anecdotal impression, not one based on NPS, or nodes per second, or on ply count), but Komodo seems to benefit more from longer analysis time than they do.
I’ve been using Komodo 8 in the Fritz GUI from ChessBase for a few days now. The GUI is the same as the Houdini 4 and the Deep Fritz 14 GUIs; in fact, when you install Komodo 8, I think it just adds some configuration files to your ChessProgram14 folder to allow for a Komodo ‘skin’ to appear. The Komodo 8 engine is slightly faster than 7a judging solely by NPS. While coding changes mean that the two can’t be directly compared, Mark Lefler has said that 8 is approximately 9% faster than 7a. The ChessBase package comes with a 1.5 million game database, an opening book, and a six-month Premium membership at Playchess.com; all are standard for Fritz GUI releases such as Deep Fritz 14 or Houdini 4.
From my perspective, I tend to use all three engines as I study chess or check analysis for review purposes, but two more than the third. When I look at my games, which aren’t all that complex, I generally use Houdini as my default kibitzer. It seems to be the fastest at seeing basic tactical problems, and its quickness is a plus on some of my antiquated computers. I also tend to bring Komodo into the mix, especially if I want to spend some time trying to figure out one position. Stockfish serves more as a second (or third) option, but I will use it more heavily in endgame positions – unless we get into tablebase territory, as Stockfish does not (generally) use them.
As I was working on this review, I thought that I might try to ‘objectively’ test the engines on positions that were more positional or prophylactic in nature, or perhaps in some difficult endgame positions. I took 11 positions from books on hand, including a number from Aagaard’s GM Preparation series, and created a small test suite. Each engine (including Deep Fritz 14 for comparison’s sake) had 4 minutes to solve each problem on my old quad-core Q8300, and each engine had 512 MB of RAM and access to Syzygy (5-man) or Nalimov (selected 6-man) tablebases as they preferred. You can see the results at the following link:
or as summarized below:
Deep Fritz 14, curiously enough, solved more problems than did Houdini 4, Komodo 7a/8 or Stockfish 5. None could solve the famous Shirov …Bh3 ending. None could solve the Polugaevsky endgame, which illustrates a horizon-related weakness still endemic among even the best engines. Only Komodo 7a, Komodo 8 and Deep Fritz 14 solved position #2, which I thought was the most purely positional test among the bunch. This test is only anecdotal, and perhaps the engines would have gotten more answers right on faster hardware; nevertheless, I was a little surprised.
Test #2: Jon Dart (author of Arasan) has created a series of test suites to torture his engine and others. I took the first 50 problems from the Arasan Testsuite 17 and ran Houdini 4, the two Komodos, Stockfish 5, Deep Rybka 4.1 and Deep Fritz 14 through their paces. (I would have added Crafty 23.08, installed with Komodo 8, but it kept crashing the GUI when I tried to include it in the test.) Here the engines only received 60 seconds to solve the problem – the same standard Dart uses in his tests of Arasan, albeit with a much faster computer. You can see the results at the following link:
or as summarized below:
Stockfish 5 and Houdini 4 each solved 38/50 problems in the one minute time limit. Komodo 8 solved 30 problems, improving by one over Komodo 7a’s 29 solved problems, and doing so with a faster average solving time. Deep Rybka and Deep Fritz each solved 28 problems correctly. Given the shorter ‘time control’ and the relatively tactical nature (IMHO) of the test set, these results seem representative of the various engines and their characteristics.
So now we have to answer the real question: which engine is best? Which one should you use? Let’s begin by admitting the obvious: for most analytical tasks you throw at an engine, any one of the three would suffice. Most of the other major ‘second-tier’ engines, including Crafty (free to download), Deep Fritz (commercial), Hiarcs (commercial) and Junior (commercial), are also sufficient to analyse the games of amateurs and point out our tactical oversights. If you’re just looking for an engine to blunder-check your games, you have plenty of options.
If, however, you’re using engines for heavy analytical work or on very difficult positions, I think you need to consider buying both Houdini and Komodo and also downloading the open-source Stockfish. Each engine, as discussed above, has relative strengths and weaknesses. The best strategy is to see what each of the engines has to say in its analysis, and then try to draw your own conclusions. Were I forced to decide between Houdini 4 and Komodo 8, I’d probably – at this moment, anyway! – choose Komodo 8, simply because it seems stronger positionally, and its slight comparative tactical disadvantage doesn’t outweigh that positional strength. Both Houdini and Komodo are well worth their purchase price for the serious player and student. Downloading Stockfish should be mandatory!