© 2012 by Paul Hsieh

The following are the results of the CritLoops benchmark. It compares the speed of multiple programming languages and compilers; however, it can also be used to benchmark different platforms. It is open source, covered by the three-clause BSD license.


The goal is to compare programming languages, compilers and systems for performance in a fair, objective and open way. It should be noted that this test differs from other efforts in that it concentrates on:


The CritLoops benchmark uses weighted geometric means of subtests within categories. In this way, if a language is unable to implement any given subtest, it can still produce representative numbers so long as it implements at least one subtest in each category. This is an issue for Python, which does not make a translation-time distinction between floating-point and integer values in the same way that the other languages do, and thus cannot implement separate integer and floating-point heapsort subtests, for example.

The test allows for two modes of interpreting the results. One is Standard Performance, which reflects the performance of the most obvious renditions of each algorithm. The other is Optimized Performance, which reflects the performance that is possible with additional effort that may lead to less intuitive source code.


The following are sample results for an Athlon XP2000 @ 1.784GHz with 100MHz DDR RAM. It should be noted that newer versions of most of these compilers/interpreters exist, and therefore the results are not necessarily reflective of the state of the art.

Standard Performance
Language (Compiler vendor) Relative Rate of Performance

Optimized Performance
Language (Compiler vendor) Relative Rate of Performance

Interpretation of Results

With the important proviso that these are not the most up to date compilers, we can nevertheless get a rough picture about what is going on with these results.
  1. As expected, C and C++ score the highest results, but there is some variation even amongst different compilers, and between the C and C++ languages. The difference between C and C++ in the standard results comes mainly from the fact that C is limited to '\0'-terminated strings, which are significantly slower than C++'s std::string. However, in the optimized results C is able to recover the lost performance by using a length-delimited string implementation. MSVS.NET C++ does particularly well in the standard mode because of its vastly improved STL std::string versus that of MSVC 6.0, which the Intel compiler uses.

  2. While C# and Java are slower, they shed only about 50% of their potential performance (versus C++), which is the equivalent of 18 months of hardware improvements in Moore's Law terms. One real curiosity, however, is that the performance of C# can be improved by really barbaric optimization methods. The improvements seen in the optimized results come from simulating a two-dimensional array with a one-dimensional array, and premultiplying induction variables. The first may be due to an inescapable weakness of the C# language (all multidimensional arrays are "ragged" arrays, rather than being rectilinear) but the second is almost assuredly due to a weaker than necessary compiler. Java did not appear to be affected by similar changes.

  3. Python turned in a surprisingly bad score. Other benchmarks led me to believe that Python would end up at around 10 times slower than C/C++; however, 100 times slower is truly stunning. This is the equivalent of 10 years of hardware improvements in Moore's Law terms. This really puts into perspective the claims of IronPython to be able to improve Python's performance by up to 3 times -- the performance problems with Python are clearly more fundamental. (That is not to detract from the other good features of this language.)


You can download the source and some sample Win32 executables here.