JIT Optimizations – Method Compilations

JIT (Just-In-Time) compilation is certainly one of the most interesting features of the JVM. It makes sure that our code runs with machine-level optimizations. Under the hood, the JIT performs a large number of optimizations that are essential for running latency-sensitive applications.

Impact of JIT

Let’s take this code as an example to study how JIT affects the performance of our application.

This piece of code loosely follows the Volcano design paradigm, in which every operator does some task, and the operators are bound together, exchange data through a common operator interface, and collectively do a bigger task. In this case, the operators are bound together to add a particular number to all the elements in the array.

public static ArrayList<Integer> experimentVirtual(ArrayList<Integer> arrayList) {
    // The operators are chained together: source -> add -> buffer.
    BufferedOperator bufferedOperator = new BufferedOperator();
    AddOperator addOperator = new AddOperator(bufferedOperator, 10);
    SourceOperator sourceOperator = new SourceOperator(addOperator, true);
    return bufferedOperator.arrayList;
}

See this GitHub link for more code details.
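
To make the operator idea concrete, here is a minimal, self-contained sketch of a Volcano-style pipeline. It is not the code from the repository: the `Operator` interface, the class names `ListSource` and `AddConstant`, and the convention that `next()` returns `null` at the end of the input are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch of a Volcano-style pipeline; names and signatures are
// illustrative and are NOT taken from the article's repository.
class VolcanoSketch {

    /** Common interface: each call to next() pulls one value, or null when exhausted. */
    interface Operator {
        Integer next();
    }

    /** Leaf operator that reads values from an in-memory list (assumed to hold no nulls). */
    static final class ListSource implements Operator {
        private final Iterator<Integer> it;
        ListSource(List<Integer> input) { this.it = input.iterator(); }
        @Override public Integer next() { return it.hasNext() ? it.next() : null; }
    }

    /** Adds a constant to every value pulled from its child operator. */
    static final class AddConstant implements Operator {
        private final Operator child;
        private final int delta;
        AddConstant(Operator child, int delta) { this.child = child; this.delta = delta; }
        @Override public Integer next() {
            Integer v = child.next(); // virtual call through the common interface
            return v == null ? null : v + delta;
        }
    }

    /** Drains an operator tree into a list. */
    static List<Integer> collect(Operator root) {
        List<Integer> out = new ArrayList<>();
        for (Integer v = root.next(); v != null; v = root.next()) {
            out.add(v);
        }
        return out;
    }

    public static void main(String[] args) {
        Operator plan = new AddConstant(new ListSource(Arrays.asList(1, 2, 3)), 10);
        System.out.println(collect(plan)); // prints [11, 12, 13]
    }
}
```

Every value crosses at least one interface call per operator, which is why this style of code gives the JIT a lot to work with.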

We ran this code with and without JIT optimizations. There was a huge difference in throughput between the two runs.

  • With JIT disabled, we got a throughput of ~3 operations per second
  • With JIT enabled, we got a throughput of ~290 operations per second

So JIT made the code roughly 100x faster. Understanding the internal workings of JIT, and then asking “what can we do to make the life of JIT easier?”, is therefore key if you want to improve the performance of your application.
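
If you want to see a similar effect on your own machine, a rough sketch like the one below can help. It is not the benchmark used above: the class `WarmupSketch` and its workload are made up for illustration. The post does not say how JIT was disabled for the first run; the standard HotSpot flag `-Xint` (interpreter-only mode) is assumed here as one way to do it.

```java
// Minimal warm-up sketch (not the article's benchmark): times the same
// workload over several rounds so the effect of JIT compilation is visible.
class WarmupSketch {

    /** Sums i*i for i in [0, n); a deliberately simple hot loop. */
    static long sumOfSquares(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            total += (long) i * i;
        }
        return total;
    }

    public static void main(String[] args) {
        final int n = 5_000_000;
        for (int round = 1; round <= 10; round++) {
            long start = System.nanoTime();
            long result = sumOfSquares(n);
            long micros = (System.nanoTime() - start) / 1_000;
            // Printing the result keeps the loop from being discarded as dead code.
            System.out.println("round " + round + ": " + micros + " us (result " + result + ")");
        }
    }
}
```

Running it once normally and once with `java -Xint WarmupSketch` should show the later rounds getting faster only in the first case. Hand-rolled timing like this is only indicative; for real measurements a benchmark harness is the better choice.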

In this series, we will talk about these optimizations in detail, and we will also learn how to debug our application using JIT logs. This particular blog post deals with one of the most important features of JIT: code compilation.

Code Compilation

This is one of the most important functionalities of JIT. JIT is responsible for compiling Java bytecode to native machine instructions at runtime to boost application performance. It figures out the hot code paths via profiling and then compiles the methods on those paths into native code.

How does method compilation happen

Currently, JIT supports these 5 levels of compilation:

  • level 0 - interpreter
  • level 1 - C1 with full optimization (no profiling)
  • level 2 - C1 with invocation and backedge counters
  • level 3 - C1 with full profiling (level 2 + MDO)
  • level 4 - C2

A method has to go through some of these compilation levels to reach its final optimized version. The lifecycle of a method is as follows:

  • All methods start executing in interpreted mode. During this phase, the JVM determines whether a method is hot, mostly with the help of method invocation and backedge counters. If a method crosses a certain threshold of invocations and/or backedges, it becomes eligible for compilation at the different levels. See this.
  • Once a method is declared hot, it is compiled at level 3 by the C1 compiler, aka the client compiler. This compiler does the following things:
    • It quickly applies the obvious optimizations that improve application performance. It works in a short time because it is latency sensitive: it wants the application to be running compiled code, with the obvious optimizations, as quickly as possible.
    • It instruments the methods with enough profiling that higher-level compilers can use this information for more contextual optimizations, which would otherwise have been hard.
  • After a method is compiled by the C1 compiler, it starts executing and gathering metrics, and based on these metrics the JVM decides whether it needs to be compiled again by the C2 compiler.
    • The C2 compiler focuses on the best possible optimizations for the method. Compilation takes longer, which might affect latencies initially, but it eventually results in higher application throughput.
    • The C2 compiler uses the metrics gathered for those methods to do further optimizations, which are mostly contextual, e.g. virtual function inlining. We will explain this later with the help of examples.
  • Apart from this usual flow of method compilation, i.e. level 0 -> level 3 -> level 4, there are other flows in which methods follow an entirely different compile path. To read about those, have a look at this link.
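
The usual level 0 -> level 3 -> level 4 flow can be pictured with a toy model like the one below. This is only a sketch of the idea, not how HotSpot is implemented: the class names are hypothetical, the two thresholds are illustrative assumptions, and the real policy also looks at backedge counters, compiler queue lengths, and more.

```java
// Toy model of the common 0 -> 3 -> 4 path. The thresholds are illustrative
// assumptions; the real HotSpot policy is considerably more involved.
class TieredSketch {

    static final int TIER3_INVOCATIONS = 200;   // assumed threshold for leaving the interpreter
    static final int TIER4_INVOCATIONS = 5_000; // assumed threshold for C2 compilation

    /** Tracks the invocation count and current level of one hypothetical method. */
    static final class MethodState {
        int invocations = 0;
        int level = 0; // every method starts in the interpreter

        /** Records one call and promotes the method when a threshold is crossed. */
        void invoke() {
            invocations++;
            if (level == 0 && invocations >= TIER3_INVOCATIONS) {
                level = 3; // C1 with full profiling
            } else if (level == 3 && invocations >= TIER4_INVOCATIONS) {
                level = 4; // C2, using the profile gathered at level 3
            }
        }
    }

    /** Returns the level the model reaches after the given number of calls. */
    static int levelAfter(int calls) {
        MethodState m = new MethodState();
        for (int i = 0; i < calls; i++) {
            m.invoke();
        }
        return m.level;
    }

    public static void main(String[] args) {
        System.out.println(levelAfter(100));    // prints 0
        System.out.println(levelAfter(200));    // prints 3
        System.out.println(levelAfter(10_000)); // prints 4
    }
}
```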

Benefits of Method Compilation

Method compilation is one of the most important sources of performance optimization in modern compilers. C/C++ was fast compared to legacy Java for the simple reason that C/C++ is compiled to native code ahead of time, whereas Java bytecode was interpreted. Let’s first understand why Java is interpreted, even though we know that interpreted languages are inherently slow compared to compiled languages.

Java was built on a compile-once, run-everywhere (on any architecture) model. This essentially means that the source code is compiled once, and this deployable compiled version can be run anywhere, on any platform. This removes the problem of writing code for each and every architecture and then making sure it runs smoothly on all of them. With Java, we just had to write the code once, compile it into a deployable, and use this deployable across all platforms. This greatly improved the development phase of applications at the time.

But with this architecture (write once and deploy everywhere) came a serious problem: non-performant applications. Because these deployables were interpreted into native instructions at runtime, the applications were damn slow. JIT came into existence to solve this problem of runtime interpretation.

With JIT we got the superpower to compile methods to native instructions during the runtime of the application, which hugely improves performance. In some benchmarks, the performance of Java with JIT seems to surpass that of C/C++ code. This is mainly because JIT has runtime information that it can use for other contextual improvements in the code.

Now let’s understand how method compilation can affect the performance of an application. We have a performance benchmark that we run twice: once with compilation of some of the methods disabled, and once with nothing disabled.

  • With compilation of some of the methods of the application disabled, we achieved a throughput of ~30 ops/second
  • With compilation enabled for all the methods of the application, we achieved a throughput of ~600 ops/second

So we can see that compilation of methods has a huge performance impact on the application. We therefore need a basic idea of how the methods in our application are compiled in order to spot potential bottlenecks and improvements.

To know which methods are getting compiled and which are not, you need to add extra JVM flags while starting up your application.

java -XX:+UnlockDiagnosticVMOptions \
     -XX:+PrintCompilation \
     -jar benchmarks.jar

This prints logs in this format:

ts  denotes timestamp
cid denotes compile_id
l   denotes compile_level

ts   cid  l        methodAffected
498  46   3    com.test.experiments.CodeOptimizedBenchmark::<clinit> (46 bytes)
498  46   3    com.test.experiments.CodeOptimizedBenchmark::<clinit> (46 bytes)
531  47   3    com.test.experiments.operators.AddOperator::get (52 bytes)
531  48   3    com.test.experiments.operators.BufferedOperator::get (42 bytes)
538  49   4    com.test.experiments.operators.AddOperator::get (52 bytes)
538  50   4    com.test.experiments.operators.BufferedOperator::get (42 bytes)
540  48   3    com.test.experiments.operators.BufferedOperator::get (42 bytes) made not entrant
541  47   3    com.test.experiments.operators.AddOperator::get (52 bytes) made not entrant
613  51%  3    com.test.experiments.operators.SourceOperator::get @ 2 (79 bytes)
614  52   3    com.test.experiments.operators.SourceOperator::get (79 bytes)
665  49   4    com.test.experiments.operators.AddOperator::get (52 bytes) made not entrant
666  50   4    com.test.experiments.operators.BufferedOperator::get (42 bytes) made not entrant
666  54   3    com.test.experiments.operators.AddOperator::get (52 bytes)
666  53   3    com.test.experiments.operators.BufferedOperator::get (42 bytes)
668  55%  4    com.test.experiments.operators.SourceOperator::get @ 2 (79 bytes)
674  56   4    com.test.experiments.operators.BufferedOperator::get (42 bytes)
674  51%  3    com.test.experiments.operators.SourceOperator::get @ -2 (79 bytes) made not entrant
674  57   4    com.test.experiments.operators.AddOperator::get (52 bytes)
675  53   3    com.test.experiments.operators.BufferedOperator::get (42 bytes) made not entrant
676  54   3    com.test.experiments.operators.AddOperator::get (52 bytes) made not entrant
679  58   4    com.test.experiments.operators.SourceOperator::get (79 bytes)
685  52   3    com.test.experiments.operators.SourceOperator::get (79 bytes) made not entrant

(We are showing just a subset of the compilation logs; there are many other methods, from the Java runtime and other libraries, for which compilation happens. For more details about these logs, see this link.)
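
When the log gets long, it can help to parse it instead of reading it by eye. The sketch below handles only the simplified layout shown above (ts, cid with an optional %, level, method, an optional "@ bci", the size in bytes, and an optional "made not entrant" suffix). The class `CompilationLogSketch` and its field names are hypothetical, and raw JVM output may contain extra flag columns that this pattern does not cover.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for the simplified log layout shown in this post;
// it is not a complete parser for raw JVM compilation output.
class CompilationLogSketch {

    /** One parsed log line; field names mirror the ts / cid / l legend. */
    static final class Entry {
        final long ts;
        final int cid;
        final boolean osr;            // true when the compile_id carries a '%'
        final int level;
        final String method;
        final boolean madeNotEntrant; // true when the line ends with "made not entrant"

        Entry(long ts, int cid, boolean osr, int level, String method, boolean madeNotEntrant) {
            this.ts = ts;
            this.cid = cid;
            this.osr = osr;
            this.level = level;
            this.method = method;
            this.madeNotEntrant = madeNotEntrant;
        }
    }

    // Groups: 1=ts, 2=cid, 3='%' or empty, 4=level, 5=method, 6=bci, 7=bytes, 8=suffix
    private static final Pattern LINE = Pattern.compile(
        "^\\s*(\\d+)\\s+(\\d+)(%?)\\s+(\\d)\\s+(\\S+)(?:\\s+@\\s+(-?\\d+))?"
        + "\\s+\\((\\d+) bytes\\)(\\s+made not entrant)?\\s*$");

    /** Returns the parsed entry, or null if the line does not match the layout. */
    static Entry parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        return new Entry(
            Long.parseLong(m.group(1)),
            Integer.parseInt(m.group(2)),
            !m.group(3).isEmpty(),
            Integer.parseInt(m.group(4)),
            m.group(5),
            m.group(8) != null);
    }

    public static void main(String[] args) {
        Entry e = parse("613  51%  3    com.test.experiments.operators.SourceOperator::get @ 2 (79 bytes)");
        // prints com.test.experiments.operators.SourceOperator::get osr=true level=3
        System.out.println(e.method + " osr=" + e.osr + " level=" + e.level);
    }
}
```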

So in the logs, we can clearly see the different methods getting compiled at different times. Some aspects of these logs are as follows:

  • A compiled method goes through different phases. When a method is compiled, it is assigned a compile_id, and when that compiled version is later deoptimized, or in other words made not entrant, the corresponding log line carries the same compile_id.
  • The compile_id attribute sometimes contains %. This symbol indicates that the compilation was done via OSR (on-stack replacement). This happens when a method contains a long-running loop: instead of waiting for the next invocation of the method, the JVM replaces the running code with its compiled version during one of the next iterations of the loop. The number after @ is the bytecode index of the loop at which the compiled code is entered.
  • As already mentioned, methods get compiled and deoptimized often, and this deoptimization might happen for a variety of reasons. It is denoted by made not entrant next to the method name in the compilation logs. See this link for more details on the various reasons for deoptimization.
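
A typical candidate for OSR looks like the sketch below: a method that is invoked only once but whose loop runs for a very long time, so it is the loop's backedge counter rather than the invocation counter that makes it hot. The class `OsrSketch` is made up for illustration and is not part of the benchmark above.

```java
// Sketch of code that is a typical candidate for on-stack replacement (OSR):
// a method invoked only once whose loop runs for many iterations.
class OsrSketch {

    /** Called once; the loop, not the number of invocations, makes this code hot. */
    static long longRunningLoop(int iterations) {
        long acc = 0;
        for (int i = 0; i < iterations; i++) {
            acc += i % 7;
        }
        return acc;
    }

    public static void main(String[] args) {
        // A single invocation with a large iteration count.
        System.out.println(longRunningLoop(100_000_000));
    }
}
```

If you run this with `-XX:+PrintCompilation`, you would typically expect a line for `OsrSketch::longRunningLoop` whose compile_id is marked with %, although the exact output depends on the JVM version and settings.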

So now we understand the performance impact JIT brings to the table and how compiling functions or methods to native machine instructions can benefit application performance.

In the next section, we will talk about the JIT method inlining optimization and its impact on application performance.

