Imagine my surprise when I got this email from a friend of mine this morning:
Date: Fri, 3 Jun 2011 03:20:48 -0700
Subject: Java Tunings
"Note that Jeremy deliberately refused to optimize the code further,
many of the C++ optimizations would apply to the Java version as
Gee, thanks for making me look good, Robert. (Actually, I had approved that language, but I didn't realize anyone would actually read the paper. :) )
Anyway, I was not just being obdurate here; there was a thought process involved. I should probably document it.
The benchmark did not follow the rules of Java benchmarking. At some point, I should write a post about the rules of Java benchmarking (maybe I have? I don't know). Basically, it timed the full execution of the program, including startup time, JIT compilation time, and GC time. This isn't what you are supposed to do in Java or Scala, so I was very uncomfortable with the experimental setup. In 2011, if you care about startup time, then you probably shouldn't be using Java. The program ran very briefly, so the effect was measurable (once I got it to run fast enough).ETA: I am informed that later versions of this benchmark took this into account. Apologies for the confusion.
- It's kind of silly to compare C++ with Java. The paper is about Scala, of course, but the immediate internal reaction at Google was a discussion of how much worse Java was than C++. As it happens, they really do target two different niches. Programmers are more productive in Java (if they know both languages well). It's easier to test Java code. It scales to large code sizes better. The tradeoff is that you are only going to get gcc -O2 level performance, and you have to be careful about garbage collection. In many, many cases, this tradeoff isn't a big deal. A single-threaded performance arms race between the two doesn't make sense. This is the primary reason I stopped trying to make improvements.
- The benchmark did not use Java collections effectively. It was generating millions of unnecessary objects per second, and the runtime was swamped by GC overhead. In addition to the collections-related fixes I made (as described in the paper), the benchmark also used a HashMap<Type, Integer> when it should have just stored a primitive int in Type; it also used a LinkedList where it should have used an ArrayDeque. In fact, after being pressed on the matter by other Googlers (and seeing the response the C++ version got), I did make these changes to the benchmark, which brought the numbers down by another factor of 2-3x. The mail I sent Robert about it got lost in the shuffle (he's a busy guy), but he later said he would update the web site.
It is this last point that made the difference between Java and Scala. They use the same runtime, so the compiled performance is going to be roughly the same. However, the choice of data structure matters! From the perspective of Scala vs. Java performance, the takeaway isn't that "Java is slow", it is that "you can write bad code in any language" (with apologies to Robert, who is one of the smartest guys around, but isn't a Java programmer).
It is also "you can write bad code in any language" that inspired me to make any improvements. Robert was not a C++ newbie, but he was a Java newbie, so I thought that the comparisons that this was inspiring between C++ and Java performance basically made no sense. I didn't make the changes to demonstrate that Java could do better than C++, as I know that's not true. I made them to show that the playing field wasn't quite as uneven as people tend to think, and that algorithmic changes tend to make more difference than language changes.
These observations don't actually contradict the point that Robert was trying to make. In fact, they underline it. He was trying to say that Scala is very newbie-friendly when you want performance, as it balances the two well. That was clearly true in this case: as a newbie in both Java and Scala, Robert chose more performance-friendly data structures in Scala than in Java. In fact, in large measure, Scala chose the data structures for him, which is probably even better.
The other languages strike terrible balances in this area. Python and Go aren't on the performance radar at all (Python isn't compiled, and Go is young, yet). C++ isn't newbie friendly at all, ever.
Robert, of course, is a very odd choice for newbie, as he is an experienced compiler writer. Havlak is also an odd choice for newbie program, as it is a relatively sophisticated compiler benchmark. Perhaps the moral turns out to be that newbies who are also experiened compiler writers should look carefully at Scala?