I’ve been doing a little 3D programming with libGDX, and I spent a bunch of time trying to figure out whether a parallel operation on the graphics card might somehow have been messing up my logic, but it never made sense. I finally realized that the JIT was eliding some of my math! I switched to the Java.net JVM and it works as coded. The latest hs and j9 version 15 JVMs from Adopt OpenJDK via SDKMAN seem to have the same problem with the JIT. Has anyone else run into issues with the JIT eliding their Kotlin code during extensive loops?
Would you be able to share a small demo that shows the problem?
Not sure. I’ll try to pare it down to something I can post. It may take a day or so.
05F94-VoxelSphere-jit-bug.targz (161.2 KB)
(Edit: uploaded to GitHub for convenience, too.)
Okay, I stripped it down. You can open the attached code in IntelliJ IDEA and run it with gradlew lwjgl3:run
If your JVM is any Adopt OpenJDK j9 version, it should show a rotating sphere composed of little cubes, with two big odd-shaped holes in it. This is the bug.
Switch to the Oracle Java.net JVM, or the SAP JVM, and you’ll see eight smaller holes in the sphere. This is the correct behavior.
Sometimes, part of the sphere has the correct geometry, and as the loops constructing the geometry iterate through X values (the outer loop moves in the X dimension), it seems that eventually just-in-time compilation elides some of the math, producing different geometry, and the part of the sphere with higher X values looks incorrect.
I can reproduce it. I can’t run openjdk15-j9 as it segfaults on my machine so I ran with openjdk9-j9.
I’m not 100% convinced it is the JIT, because the bug is still there when I disable it.
I tried with “-Djava.compiler=NONE -Xnojit -Xnoaot”
I also tried to compile with OpenJDK14 and run with J9, but it really is something linked to J9 at runtime and doesn’t seem related to the JIT or AOT…
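One way to double-check that flags like the ones above actually took effect is to ask the running JVM what it thinks. This is a minimal sketch using the standard java.lang.management API; the helper name describeJit is made up for illustration. getCompilationMXBean() is documented to return null when the JVM has no compilation system, but whether J9 started with -Xnojit reports null here is an assumption that has not been verified.

```kotlin
import java.lang.management.ManagementFactory

// Hypothetical helper: summarizes the VM, its JIT compiler (if any),
// and the flags the JVM was started with.
fun describeJit(): String {
    val vm = System.getProperty("java.vm.name")
    // Returns null when the JVM has no compilation system.
    val jit = ManagementFactory.getCompilationMXBean()?.name ?: "none"
    // Startup arguments as the JVM saw them, e.g. -Xnojit or -Xnoaot.
    val flags = ManagementFactory.getRuntimeMXBean().inputArguments
    return "vm=$vm jit=$jit flags=$flags"
}

fun main() {
    println(describeJit())
}
```

Printing this at the start of the reproducer would show which VM and which flags each run really used.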
Okay, that’s interesting, but something clearly changes between loop iterations. Is there some kind of optimization other than JIT that happens at runtime?
What I believe is happening is that your conditions that do “continue” in your loop of Modl.kt are getting triggered differently.
I reduced the problem space and logged the coordinates (x, y, z); some runs produce entries that differ from other runs.
I’m using the cube [26,32,31] as a control. When it works, taxiFromCenter is 7.0,
but when it fails, it is 1 (so the first condition doesn’t kick in and you end up with those things you didn’t want).
cornerPosition and gridCenter are the same in GOOD_STATE and FAILED_STATE,
so the error is coming from here: centeredPos.taxiDistance()
You’ll also notice that in GOOD_STATE there is a little cluster in the center. In FAILED_STATE, the logic that creates that little cluster creates a full bar along the X axis.
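For reference, the expected value for the control cube can be worked out with integer arithmetic only, which involves neither Float nor kotlin.math.abs. This is a small sketch: the grid center of (32, 32, 32) is taken from the reproducer in this thread, and the helper name taxiInt is made up.

```kotlin
// Hypothetical integer-only reference for the taxicab distance of (x, y, z)
// from the grid center. It uses a hand-written absolute value on purpose,
// so it never calls kotlin.math.abs.
fun taxiInt(x: Int, y: Int, z: Int, center: Int = 32): Int {
    fun absInt(v: Int) = if (v < 0) -v else v
    return absInt(x - center) + absInt(y - center) + absInt(z - center)
}

fun main() {
    // Control cube from the post: |26-32| + |32-32| + |31-32| = 6 + 0 + 1
    println(taxiInt(26, 32, 31)) // prints 7
}
```

As a purely arithmetic observation, the failing value of 1 is what the sum would be if the x term contributed nothing. That does not by itself say what the JVM is doing.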
package com.travisfw.voxelsphere.lwjgl3

import kotlin.math.abs

data class Vector3(val x: Float, val y: Float, val z: Float) {
    operator fun minus(v2: Vector3): Vector3 = Vector3(x - v2.x, y - v2.y, z - v2.z)
    fun taxiDistance(): Float = abs(x) + abs(y) + abs(z)
}

fun main(args: Array<String>) {
    val gridCenter = Vector3(32.0f, 32.0f, 32.0f)
    for (x in 0..320)
        for (y in 0..320)
            for (z in 0..320) {
                val cornerPosition = Vector3(x.toFloat(), y.toFloat(), z.toFloat())
                val centeredPos = cornerPosition - gridCenter
                val taxiFromCenter = centeredPos.taxiDistance()
                val taxiFromCenter2 = Vector3(centeredPos.x, centeredPos.y, centeredPos.z).taxiDistance()
                if (taxiFromCenter != taxiFromCenter2) throw Exception("taxiFromcenter are not equal $taxiFromCenter $taxiFromCenter2")
            }
    println("If you reached here, it worked (and you are likely using a non-J9 openjdk)")
}
I tried with 2D vectors and we have the same problem (but it is even rarer).
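The 2D code itself was not posted, so the following is only a guess at what such a variant might look like, adapted from the Vector3 reproducer. Vector2, firstMismatch2D and the passes parameter are all made-up names. Since the failure is described as rare in 2D, the sketch repeats the grid a few times. It has not been verified to trigger the bug on any JVM.

```kotlin
import kotlin.math.abs

// Assumed 2D analogue of the Vector3 reproducer; not the poster's actual code.
data class Vector2(val x: Float, val y: Float) {
    operator fun minus(v2: Vector2): Vector2 = Vector2(x - v2.x, y - v2.y)
    fun taxiDistance(): Float = abs(x) + abs(y)
}

// Returns a description of the first mismatch, or null if none was seen.
fun firstMismatch2D(passes: Int = 10): String? {
    val gridCenter = Vector2(32.0f, 32.0f)
    repeat(passes) {
        for (x in 0..320)
            for (y in 0..320) {
                val centeredPos = Vector2(x.toFloat(), y.toFloat()) - gridCenter
                val a = centeredPos.taxiDistance()
                // Same components, fresh object: the two sums should always agree.
                val b = Vector2(centeredPos.x, centeredPos.y).taxiDistance()
                if (a != b) return "x=$x y=$y: $a != $b"
            }
    }
    return null
}

fun main() {
    println(firstMismatch2D() ?: "no mismatch seen")
}
```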
I finally got OpenJDK15-j9 to run, and the bug is still there. (This also showed me that J9 is many times slower than OpenJDK by itself with that tiny example).
I tried with my own abs function:
fun abs(value: Float): Float = if(value<0) -value else value
and there is no error…
import java.lang.Math as nativeMath
fun abs(value: Float) = nativeMath.abs(value)
Shows no error as well, which is weird because that’s exactly what kotlin.math does…
But add the inline keyword in front of the nativeMath version, and the bug comes back.
Adding inline in front of the pure Kotlin version doesn’t bring the bug back either.
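To keep the four experiments straight, here they are side by side, with a small loop that compares each against the pure Kotlin version. The function names and the countMismatches harness are made up for this sketch, and the "reported" comments only restate the observations above. Whether this simplified loop triggers the J9 problem is not verified; the original failure showed up inside the Vector3 reproducer.

```kotlin
import java.lang.Math as nativeMath

// Variant A: pure Kotlin (reported: no error)
fun absPure(value: Float): Float = if (value < 0) -value else value

// Variant B: pure Kotlin, inline (reported: no error)
@Suppress("NOTHING_TO_INLINE")
inline fun absPureInline(value: Float): Float = if (value < 0) -value else value

// Variant C: plain wrapper around java.lang.Math.abs (reported: no error)
fun absWrapped(value: Float): Float = nativeMath.abs(value)

// Variant D: inline wrapper around java.lang.Math.abs (reported: bug comes back)
@Suppress("NOTHING_TO_INLINE")
inline fun absWrappedInline(value: Float): Float = nativeMath.abs(value)

// Counts inputs where any variant disagrees with variant A.
// On a correctly behaving JVM this returns 0.
fun countMismatches(range: IntRange = -320..320, passes: Int = 100): Int {
    var mismatches = 0
    repeat(passes) {
        for (i in range) {
            val v = i.toFloat()
            val expected = absPure(v)
            if (absPureInline(v) != expected) mismatches++
            if (absWrapped(v) != expected) mismatches++
            if (absWrappedInline(v) != expected) mismatches++
        }
    }
    return mismatches
}

fun main() {
    println("mismatches: ${countMismatches()}")
}
```

The inline variants are called directly rather than passed as function references, because a reference would call the non-inlined copy of the function and defeat the point of the experiment.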
Looking at OpenJ9 source code, they have a few optimizations for abs and to determine which methods to use. But that’s way beyond my understanding of the JVM.
Unlikely, as we are really far from underflow/overflow or large exponents. But that may isolate J9 code paths. @travisfw, can you give it a try?
It’ll also be interesting to see whether this fails with plain Java.
In any case, you should create an issue on the AdoptOpenJDK GitHub (I think they use GitHub, not sure).
I did not dig far into it, but what is the rounding mode, and did you try Double vs. Float? Float only has about 6 to 7 significant decimal digits. Maybe this narrows down the cause? https://kotlinlang.org/docs/reference/basic-types.html
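A quick way to test the precision idea is to check whether the values in the reproducer are exactly representable as Float. A Float has a 24-bit significand, so every integer up to 2^24 = 16,777,216 is stored exactly. The sketch below checks the range used in the reproducer (coordinates 0..320, center 32); the function name floatIsExactInRange is made up.

```kotlin
// Returns true if every possible taxicab sum in the reproducer's range
// survives an Int -> Float -> Int round trip unchanged.
fun floatIsExactInRange(maxCoord: Int = 320, center: Int = 32): Boolean {
    // Each axis is at most max(center, maxCoord - center) away from the center.
    val maxTerm = maxOf(center, maxCoord - center)
    val maxSum = 3 * maxTerm
    for (n in 0..maxSum) {
        if (n.toFloat().toInt() != n) return false
    }
    return true
}

fun main() {
    println(floatIsExactInRange())          // prints true
    // First integer that Float cannot represent: 2^24 + 1 rounds to 2^24.
    println(16_777_217.toFloat().toInt())   // prints 16777216
}
```

Since every coordinate, difference and sum here is a small whole number, rounding mode and Float vs. Double should not change the results of this particular computation.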
This is an occurrence of a bug in the OpenJ9 JIT compiler. I’ve opened up a bug report here and I’ll have a fix shortly:
Apologies for the churn here, and thanks for reporting this to the Adopt community!
Really nice find! How did you get access to the bytecode after JIT?
Sorry, I don’t know how to see the bytecode after JIT. I just remembered that the half-even rounding mode is more stable for calculations with graphics coordinates, and that the more fractional digits, the better. I was programming some exercises in C in the 90s, where you could see the difference between float and double. I hope this idea helps with analysing the problem, but I’m not saying it’s really related. Just an idea.