Measuring the Performance of Anything In Scala
Why benchmarking?
Coding consciously and doing code reviews are of great importance when it comes to writing clean and efficient code. Sometimes, however, I feel like I need stronger reasons to go one way or another, especially when I don’t really know what is happening under the JVM hood.
A benchmark helps describe the current state of your implementation. It provides awareness, helps you decide whether an improvement is worth the investment, and finally helps you measure the effect of your change (if any).
Unfortunately, getting such numbers is not so easy, but fortunately there are good tools to do it correctly. Keep reading if you’re interested.
What is the problem with benchmarking?
With benchmarking we can determine the performance of an algorithm. However, getting meaningful data is tricky.
What can be so complicated about it? Just launch the algorithm many times and measure its execution time, and voilà!
Nope, benchmarking is not so trivial, especially on top of the JVM. The Scala compiler alone runs more than 15 phases, some of which rewrite or optimize your code, and the JVM can also apply very clever optimizations at run time (JIT compilation, inlining, dead-code elimination). The code you measure may therefore not be the code you wrote, which leads to misleading conclusions.
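To make the pitfall concrete, here is a minimal sketch of the naive "run it many times and time it" approach. The names NaiveTiming, averageNanos and sumTo are hypothetical and only serve this illustration. It has no warm-up phase and throws the results away, so the JIT is free to optimize the measured work in ways that distort the number.

```scala
object NaiveTiming {
  // Hypothetical helper: evaluates `body` `iterations` times and returns
  // the average wall-clock time per evaluation, in nanoseconds.
  def averageNanos(iterations: Int)(body: => Any): Double = {
    require(iterations > 0, "iterations must be positive")
    val start = System.nanoTime()
    var i = 0
    while (i < iterations) {
      body // the result is discarded, so the JIT may remove part of the work
      i += 1
    }
    (System.nanoTime() - start).toDouble / iterations
  }

  // Toy algorithm under test: sum of the integers 1..n.
  def sumTo(n: Int): Long = {
    var acc = 0L
    var i = 1
    while (i <= n) { acc += i; i += 1 }
    acc
  }

  def main(args: Array[String]): Unit = {
    // No warm-up: early iterations run interpreted, later ones JIT-compiled,
    // and the average mixes both.
    val perCall = averageNanos(100000)(sumTo(1000))
    println(f"naive average: $perCall%.1f ns/op (do not trust this number)")
  }
}
```

Running it several times will likely print noticeably different values, which is a hint that the measurement depends on more than the algorithm itself.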
For instance, try to explain why the comparative benchmark set org.mauritania.minibenchmark.catalog.IdentityTricky below (the suspiciously even one) yields such unexpected results for these very different algorithms. Did you find the reason?
And the solution?
The one I recommend is JMH, the Java Microbenchmark Harness, which can be used from Scala thanks to the sbt-jmh plugin.
ScalaMeter is another option worth considering, although I haven’t personally used it at the time of writing this post.
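As a rough idea of what a JMH benchmark looks like in Scala, here is a minimal sketch. The class SumBenchmark and its methods are hypothetical and are not taken from the project mentioned in this post. It assumes the sbt-jmh plugin is declared in project/plugins.sbt and enabled with enablePlugins(JmhPlugin), so that the JMH annotations are on the classpath.

```scala
import java.util.concurrent.TimeUnit
import org.openjdk.jmh.annotations._
import org.openjdk.jmh.infra.Blackhole

// Hypothetical benchmark comparing two ways of summing an array.
@State(Scope.Thread)
@BenchmarkMode(Array(Mode.AverageTime))
@OutputTimeUnit(TimeUnit.NANOSECONDS)
class SumBenchmark {

  // JMH runs every benchmark once per listed value.
  @Param(Array("100", "10000"))
  var size: Int = 0

  var data: Array[Int] = Array.empty[Int]

  // Build the input outside of the measured code.
  @Setup(Level.Trial)
  def prepare(): Unit = {
    data = Array.tabulate(size)(i => i)
  }

  // Returning the result lets JMH consume it, so the JIT
  // cannot discard the computation as dead code.
  @Benchmark
  def whileLoop(): Long = {
    var acc = 0L
    var i = 0
    while (i < data.length) { acc += data(i); i += 1 }
    acc
  }

  // Alternative: hand the result to a Blackhole explicitly.
  @Benchmark
  def foldLeft(bh: Blackhole): Unit =
    bh.consume(data.foldLeft(0L)(_ + _))
}
```

With that setup, something like sbt "Jmh/run SumBenchmark" (jmh:run on older sbt versions) should launch it. JMH then takes care of warm-up iterations, forking and statistics, which is exactly what the naive approach lacks.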
How to get started right now?
I’ve set up a project, github/scala-benchmark, which renders visual reports that GitHub can display via GitHub Pages. Help yourself and fork it if you like the idea.
References
Also, if you want to know more, I really recommend this read about JMH.