I think the Adopt OpenJDK j9 jvm jit messes with Kotlin somehow

I was talking to @fjeremic in his github issue he shared some decompiled bytecode. He seems to have found the issue, have a look.

The OpenJ9 JIT compiler takes as input the Java bytecode of the method to be compiled and it translates the bytecode into what we call “trees”. If you have a look at the second comment you can see I’ve added some options to the java command:

bin/java -Xjit:'limit={com/travisfw/voxelsphere/lwjgl3/Vector3.taxiDistance()F}(count=0,lastOptIndex=5,lastOptSubIndex=1,traceFull,log=log.trace),disableAsynccompilation,enablefastscorchingrecompilation' -Xnoaot -Xmx4G -Xms4G -jar test.jar

The limit option takes a regex as input and it only JIT compiles methods that match the regex. Since in this case we gave it an exact signature, only one method will ever be JIT compiled. This isolates the problem. To find this method I did A/B testing on a list of compiled methods using a binary search to find which method we have to compile in order to exhibit the bug. This is how I originally arrived at that signature.

Moving on after the curly braces specifying the method regex come the parenthesis in which we can apply options to that particular compile. In this case we added a few of them. count=0 says to start compiling this method on the 0th (first) invocation of the method in the JVM interpreter. This just means we will compile it right away. lastOptIndex=5,lastOptSubIndex=1 limits optimizations. There are several hundred optimizations in the OpenJ9 JIT compiler which are executed in sequential order. This option says to stop performing optimizations after the 5th one, and within the 5th optimization stop performing transformations after the 1st one. traceFull,log=log.trace says to generate a JIT trace file on disk by name of log.trace and to trace everything (full).

disableAsynccompilation disables asynchronous compilation, meaning when a compile is requested the JVM will block until the compile is finished, and then it will execute the compied body. This prevents the method being interpreted while the JIT compile is happening in a background thread. If this were to happen we risk the program completing and exiting before we have finished compiling the method. Finally, this particular problem happen at a high optimization level, we have 5: cold, warm, hot, very-hot, scorching. enablefastscorchingrecompilation lowers the thresholds needed to reach a scorching compilation which throws every single optimization we have (even the quite CPU expensive ones) at the method.

All in all this command will generate a log.trace file with very useful information for OpenJ9 JIT developers. The file is textual and you can open it up in any editor. It has multiple sections. One of the first things you will see is the input bytecodes though. It looks like this:

=======>com/travisfw/voxelsphere/lwjgl3/Vector3.taxiDistance()F
<ilgen
        method="com/travisfw/voxelsphere/lwjgl3/Vector3.taxiDistance()F">
   <request> {OrdinaryMethod(<NULL>)} </request>

        +------------- Byte Code Index
        |  +-------------------- OpCode
        |  |                        +------------- First Field
        |  |                        |     +------------- Branch Target
        |  |                        |     |      +------- Const Pool Index
        |  |                        |     |      |    +------------- Constant
        |  |                        |     |      |    |
        V  V                        V     V      V    V

        0, JBaload0getfield
        1, JBgetfield                           13
        4, JBfstore1
        5, JBiconst0
        6, JBistore2
        7, JBfload1
        8, JBinvokestatic          24
       11, JBaload0getfield
       12, JBgetfield                           15
       15, JBfstore1
       16, JBiconst0
       17, JBistore2
       18, JBfload1
       19, JBinvokestatic          24
       22, JBfadd
       23, JBaload0getfield
       24, JBgetfield                           17
       27, JBfstore1
       28, JBiconst0
       29, JBistore2
       30, JBfload1
       31, JBinvokestatic          24
       34, JBfadd
       35, JBreturn1

Then we translate the bytecode into “tree” intermediate language (IL) representation:

n1n       BBStart <block_2>                                                                   [  0x11957580] bci=[-1,0,7] rc=0 vc=0 vn=- li=- udi=- nc=0
n5n       fstore  <auto slot 1>[#368  Auto] [flags 0x5 0x0 ]                                  [  0x119576c0] bci=[-1,4,7] rc=0 vc=0 vn=- li=- udi=- nc=1
n4n         floadi  com/travisfw/voxelsphere/lwjgl3/Vector3.x F[#367  final Shadow +8] [flags 0xa0605 0x0 ]  [  0x11957670] bci=[-1,1,7] rc=1 vc=0 vn=- li=- udi=- nc=1
n3n           aload  this<'this' parm Lcom/travisfw/voxelsphere/lwjgl3/Vector3;>[#366  Parm] [flags 0x40000107 0x0 ] (X!=0 sharedMemory )  [  0x11957620] bci=[-1,0,7] rc=1 vc=0 vn=n7n       istore  <auto slot 2>[#369  Auto] [flags 0x3 0x0 ]                                  [  0x11957760] bci=[-1,6,7] rc=0 vc=0 vn=- li=- udi=- nc=1
n6n         iconst 0                                                                          [  0x11957710] bci=[-1,5,7] rc=1 vc=0 vn=- li=- udi=- nc=0
n10n      ResolveCHK [#307]                                                                   [  0x11957850] bci=[-1,8,7] rc=0 vc=0 vn=- li=- udi=- nc=1
n9n         fcall  java/lang/Math.abs(F)F[#370  unresolved notAccessed static Method] [flags 0x400 0x0 ]  [  0x11957800] bci=[-1,8,7] rc=2 vc=0 vn=- li=- udi=- nc=1
n8n           fload  <auto slot 1>[#368  Auto] [flags 0x5 0x0 ]                               [  0x119577b0] bci=[-1,7,7] rc=1 vc=0 vn=- li=- udi=- nc=0
n13n      fstore  <auto slot 1>[#368  Auto] [flags 0x5 0x0 ]                                  [  0x11957940] bci=[-1,15,7] rc=0 vc=0 vn=- li=- udi=- nc=1
n12n        floadi  com/travisfw/voxelsphere/lwjgl3/Vector3.y F[#371  final Shadow +12] [flags 0xa0605 0x0 ]  [  0x119578f0] bci=[-1,12,7] rc=1 vc=0 vn=- li=- udi=- nc=1
n11n          aload  this<'this' parm Lcom/travisfw/voxelsphere/lwjgl3/Vector3;>[#366  Parm] [flags 0x40000107 0x0 ] (X!=0 sharedMemory )  [  0x119578a0] bci=[-1,11,7] rc=1 vc=0 vn
n15n      istore  <auto slot 2>[#369  Auto] [flags 0x3 0x0 ]                                  [  0x119579e0] bci=[-1,17,7] rc=0 vc=0 vn=- li=- udi=- nc=1
n14n        iconst 0                                                                          [  0x11957990] bci=[-1,16,7] rc=1 vc=0 vn=- li=- udi=- nc=0
n18n      ResolveCHK [#307]                                                                   [  0x11957ad0] bci=[-1,19,7] rc=0 vc=0 vn=- li=- udi=- nc=1
n17n        fcall  java/lang/Math.abs(F)F[#370  unresolved notAccessed static Method] [flags 0x400 0x0 ]  [  0x11957a80] bci=[-1,19,7] rc=2 vc=0 vn=- li=- udi=- nc=1
n16n          fload  <auto slot 1>[#368  Auto] [flags 0x5 0x0 ]                               [  0x11957a30] bci=[-1,18,7] rc=1 vc=0 vn=- li=- udi=- nc=0
n23n      treetop                                                                             [  0x11957c60] bci=[-1,27,7] rc=0 vc=0 vn=- li=- udi=- nc=1
n19n        fadd                                                                              [  0x11957b20] bci=[-1,22,7] rc=2 vc=0 vn=- li=- udi=- nc=2
n9n           ==>fcall
n17n          ==>fcall
n22n      fstore  <auto slot 1>[#368  Auto] [flags 0x5 0x0 ]                                  [  0x11957c10] bci=[-1,27,7] rc=0 vc=0 vn=- li=- udi=- nc=1
n21n        floadi  com/travisfw/voxelsphere/lwjgl3/Vector3.z F[#372  final Shadow +16] [flags 0xa0605 0x0 ]  [  0x11957bc0] bci=[-1,24,7] rc=1 vc=0 vn=- li=- udi=- nc=1
n20n          aload  this<'this' parm Lcom/travisfw/voxelsphere/lwjgl3/Vector3;>[#366  Parm] [flags 0x40000107 0x0 ] (X!=0 sharedMemory )  [  0x11957b70] bci=[-1,23,7] rc=1 vc=0 vn
n25n      istore  <auto slot 2>[#369  Auto] [flags 0x3 0x0 ]                                  [  0x11957d00] bci=[-1,29,7] rc=0 vc=0 vn=- li=- udi=- nc=1
n24n        iconst 0                                                                          [  0x11957cb0] bci=[-1,28,7] rc=1 vc=0 vn=- li=- udi=- nc=0
n28n      ResolveCHK [#307]                                                                   [  0x11957df0] bci=[-1,31,7] rc=0 vc=0 vn=- li=- udi=- nc=1
n27n        fcall  java/lang/Math.abs(F)F[#370  unresolved notAccessed static Method] [flags 0x400 0x0 ]  [  0x11957da0] bci=[-1,31,7] rc=2 vc=0 vn=- li=- udi=- nc=1
n26n          fload  <auto slot 1>[#368  Auto] [flags 0x5 0x0 ]                               [  0x11957d50] bci=[-1,30,7] rc=1 vc=0 vn=- li=- udi=- nc=0
n30n      freturn                                                                             [  0x11957e90] bci=[-1,35,7] rc=0 vc=0 vn=- li=- udi=- nc=1
n29n        fadd                                                                              [  0x11957e40] bci=[-1,34,7] rc=1 vc=0 vn=- li=- udi=- nc=2
n19n          ==>fadd
n27n          ==>fcall
n2n       BBEnd </block_2>                                                                    [  0x119575d0] bci=[-1,35,7] rc=0 vc=0 vn=- li=- udi=- nc=0

These trees then go through the series of optimizations and then get fed into the backend code generator which performs instructions selection, register assignment, and binary encoding, among several other passes.

We use this tree representation to analyze and manipulate the program in the JIT compiler and to carry out optimizations. The trace files we use to see how the method morphs as it transitions from optimization to optimization. This is what was used to identify the problem described in the OpenJ9 issue.

Hope that sheds some light on the debugging tools and techniques we use over at the OpenJ9 project and how JIT compiles within the JVM are debugged when there are problems. Feel free to ask any further questions if you’re curious!

6 Likes

Really nice, thanks a lot for such a detailed explanation.