Embeddable Kotlin Compiler -- long term memory leak

OS: Ubuntu 19.04
JVM: openjdk8
Kotlin: 1.3.70 (also observed on earlier Kotlin versions)
Test running environment: IDEA 2019.3
I’ve observed an odd, long-running memory creep in a production application that uses dynamic Kotlin “script” to define rules on the fly. Frequent Kotlin script compilation is one of the first suspects I looked at.

I’ve spent some time looking for similar memory issues. In most cases the prevailing recommendation was to let the script engine go out of scope, so that its history and compilation artifacts become eligible for GC.

The following code, however, demonstrates what seems to me to be an acute problem.

package rules

import org.jetbrains.kotlin.cli.common.repl.KotlinJsr223JvmScriptEngineFactoryBase
import org.jetbrains.kotlin.cli.common.repl.ScriptArgsWithTypes
import org.jetbrains.kotlin.script.jsr223.KotlinJsr223JvmLocalScriptEngine
import org.jetbrains.kotlin.script.jsr223.KotlinStandardJsr223ScriptTemplate
import org.junit.Test
import java.io.File
import java.net.URLClassLoader
import javax.script.Bindings
import javax.script.ScriptContext
import javax.script.ScriptEngine

class RulePerformance {
    private val generalScript = """
        val x = 1
        println("Hello ${'$'}x")
    """.trimIndent()

    @Test
    fun `run rules repeatedly`() {

        while(true) {
            // A new engine is created on every iteration (the ENGINE val below is not used).
            ENGINE_FACTORY.scriptEngine.eval(
                    generalScript
            )

            Thread.sleep(1500)
        }
    }

    companion object {
        private val ENGINE_FACTORY by lazy { ClasspathAwareScriptEngineFactory() }
        private val ENGINE by lazy { ENGINE_FACTORY.scriptEngine }
    }


    class ClasspathAwareScriptEngineFactory : KotlinJsr223JvmScriptEngineFactoryBase() {
        override fun getScriptEngine(): ScriptEngine {
            val searchClassLoader = Thread.currentThread().contextClassLoader as URLClassLoader

            val classpath = searchClassLoader.urLs.map { File(it.file) }

            return KotlinJsr223JvmLocalScriptEngine(
                    this,
                    classpath,
                    KotlinStandardJsr223ScriptTemplate::class.qualifiedName!!,
                    { ctx, types ->
                        ScriptArgsWithTypes(arrayOf(ctx.getBindings(ScriptContext.ENGINE_SCOPE)), types ?: emptyArray())
                    },
                    arrayOf(Bindings::class))
        }
    }
}

Nothing very interesting: a simple Kotlin script with no binding variables, evaluated inside an endless loop that creates a new script engine on each iteration.

Interesting Point #1
This is what shows up in “non heap – Memory Pool Code Cache”

This mirrors what I’ve seen in production: a slow but inexorable creep upward in something called the code cache. Explicit calls to GC don’t help, although given that this is non-heap memory, that is not surprising.
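For anyone who wants to track this pool without a profiler, here is a minimal sketch that reads the same numbers through the standard `java.lang.management` API. The names `PoolUsage` and `codeCachePools` are made up for illustration, and the filter on pool names is an assumption: JDK 8 reports a single "Code Cache" pool, while newer JDKs may report several "CodeHeap ..." pools.

```kotlin
import java.lang.management.ManagementFactory
import java.lang.management.MemoryType

// Snapshot of one JVM memory pool; maxBytes is -1 when the JVM reports no limit.
data class PoolUsage(val name: String, val usedBytes: Long, val maxBytes: Long)

// Returns the non-heap pools whose name mentions "code".
// Assumption: this name filter matches the code cache pools of the JVM in use.
fun codeCachePools(): List<PoolUsage> =
    ManagementFactory.getMemoryPoolMXBeans()
        .filter { it.type == MemoryType.NON_HEAP }
        .filter { it.name.contains("code", ignoreCase = true) }
        .mapNotNull { pool ->
            // getUsage() can return null if the pool is no longer valid.
            pool.usage?.let { PoolUsage(pool.name, it.used, it.max) }
        }

// Renders the snapshot as one line per pool, suitable for periodic logging.
fun formatPools(pools: List<PoolUsage>): String =
    pools.joinToString("\n") { "${it.name}: used=${it.usedBytes} max=${it.maxBytes}" }
```

Logging `formatPools(codeCachePools())` once per loop iteration in the test above should make the growth visible as plain numbers.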

Interesting Point #2
In production, this memory creep becomes a problem after 10 days or so. Locally, however, the program crashed after a mere 1.5 hours, and in a slightly unusual way:

ERROR: Exception while analyzing expression at (2,1) in /Line_1.kts
org.jetbrains.kotlin.utils.KotlinExceptionWithAttachments: Exception while analyzing expression at (2,1) in /Line_1.kts
	at org.jetbrains.kotlin.types.expressions.ExpressionTypingVisitorDispatcher.logOrThrowException(ExpressionTypingVisitorDispatcher.java:234)
	at org.jetbrains.kotlin.types.expressions.ExpressionTypingVisitorDispatcher.lambda$getTypeInfo$0(ExpressionTypingVisitorDispatcher.java:212)
	at org.jetbrains.kotlin.util.PerformanceCounter.time(PerformanceCounter.kt:91)
	at org.jetbrains.kotlin.types.expressions.ExpressionTypingVisitorDispatcher.getTypeInfo(ExpressionTypingVisitorDispatcher.java:162)
	at org.jetbrains.kotlin.types.expressions.ExpressionTypingVisitorDispatcher.getTypeInfo(ExpressionTypingVisitorDispatcher.java:133)
	at org.jetbrains.kotlin.types.expressions.ExpressionTypingVisitorForStatements.visitExpression(ExpressionTypingVisitorForStatements.java:373)
	at org.jetbrains.kotlin.types.expressions.ExpressionTypingVisitorForStatements.visitExpression(ExpressionTypingVisitorForStatements.java:62)
	at org.jetbrains.kotlin.psi.KtVisitor.visitReferenceExpression(KtVisitor.java:198)
	at org.jetbrains.kotlin.psi.KtVisitor.visitCallExpression(KtVisitor.java:278)
	at org.jetbrains.kotlin.psi.KtCallExpression.accept(KtCallExpression.java:35)
	at org.jetbrains.kotlin.types.expressions.ExpressionTypingVisitorDispatcher.lambda$getTypeInfo$0(ExpressionTypingVisitorDispatcher.java:173)
	at org.jetbrains.kotlin.util.PerformanceCounter.time(PerformanceCounter.kt:91)
	at org.jetbrains.kotlin.types.expressions.ExpressionTypingVisitorDispatcher.getTypeInfo(ExpressionTypingVisitorDispatcher.java:162)
	at org.jetbrains.kotlin.types.expressions.ExpressionTypingVisitorDispatcher.getTypeInfo(ExpressionTypingVisitorDispatcher.java:146)
	at org.jetbrains.kotlin.types.expressions.ExpressionTypingServices.getTypeInfo(ExpressionTypingServices.java:118)
	at org.jetbrains.kotlin.types.expressions.ExpressionTypingServices.getTypeInfo(ExpressionTypingServices.java:93)
	at org.jetbrains.kotlin.resolve.BodyResolver.resolveAnonymousInitializer(BodyResolver.java:665)
	at org.jetbrains.kotlin.resolve.BodyResolver.resolveAnonymousInitializers(BodyResolver.java:651)
	at org.jetbrains.kotlin.resolve.BodyResolver.resolveBehaviorDeclarationBodies(BodyResolver.java:120)
	at org.jetbrains.kotlin.resolve.BodyResolver.resolveBodies(BodyResolver.java:243)
	at org.jetbrains.kotlin.resolve.LazyTopDownAnalyzer.analyzeDeclarations(LazyTopDownAnalyzer.kt:225)
	at org.jetbrains.kotlin.resolve.LazyTopDownAnalyzer.analyzeDeclarations$default(LazyTopDownAnalyzer.kt:60)
	at org.jetbrains.kotlin.scripting.repl.ReplCodeAnalyzer.doAnalyze(ReplCodeAnalyzer.kt:109)
	at org.jetbrains.kotlin.scripting.repl.ReplCodeAnalyzer.analyzeReplLine(ReplCodeAnalyzer.kt:93)
	at org.jetbrains.kotlin.scripting.repl.GenericReplCompiler.compile(GenericReplCompiler.kt:74)
	at org.jetbrains.kotlin.cli.common.repl.GenericReplCompilingEvaluatorBase.compileAndEval(GenericReplCompilingEvaluator.kt:38)
	at org.jetbrains.kotlin.cli.common.repl.ReplAtomicEvalAction$DefaultImpls.compileAndEval$default(ReplApi.kt:175)
	at org.jetbrains.kotlin.cli.common.repl.KotlinJsr223JvmScriptEngineBase.compileAndEval(KotlinJsr223JvmScriptEngineBase.kt:61)
	at org.jetbrains.kotlin.cli.common.repl.KotlinJsr223JvmScriptEngineBase.eval(KotlinJsr223JvmScriptEngineBase.kt:31)
	at javax.script.AbstractScriptEngine.eval(AbstractScriptEngine.java:264)
	at rules.RulePerformance.run rules repeatedly(RulePerformance.kt:16)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
	at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
	at org.junit.runner.JUnitCore.run(JUnitCore.java:160)
	at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
	at com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:33)
	at com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:230)
	at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:58)
Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
	at org.jetbrains.kotlin.metadata.ProtoBuf$Type$1.parsePartialFrom(ProtoBuf.java:4977)
	at org.jetbrains.kotlin.metadata.ProtoBuf$Type$1.parsePartialFrom(ProtoBuf.java:4972)
	at org.jetbrains.kotlin.protobuf.CodedInputStream.readMessage(CodedInputStream.java:495)
	at org.jetbrains.kotlin.metadata.ProtoBuf$ValueParameter.<init>(ProtoBuf.java:18110)
	at org.jetbrains.kotlin.metadata.ProtoBuf$ValueParameter.<init>(ProtoBuf.java:18047)
	at org.jetbrains.kotlin.metadata.ProtoBuf$ValueParameter$1.parsePartialFrom(ProtoBuf.java:18165)
	at org.jetbrains.kotlin.metadata.ProtoBuf$ValueParameter$1.parsePartialFrom(ProtoBuf.java:18160)
	at org.jetbrains.kotlin.protobuf.CodedInputStream.readMessage(CodedInputStream.java:495)
	at org.jetbrains.kotlin.metadata.ProtoBuf$Function.<init>(ProtoBuf.java:14410)
	at org.jetbrains.kotlin.metadata.ProtoBuf$Function.<init>(ProtoBuf.java:14313)
	at org.jetbrains.kotlin.metadata.ProtoBuf$Function$1.parsePartialFrom(ProtoBuf.java:14508)
	at org.jetbrains.kotlin.metadata.ProtoBuf$Function$1.parsePartialFrom(ProtoBuf.java:14503)
	at org.jetbrains.kotlin.protobuf.CodedInputStream.readMessage(CodedInputStream.java:495)
	at org.jetbrains.kotlin.metadata.ProtoBuf$Package.<init>(ProtoBuf.java:11611)
	at org.jetbrains.kotlin.metadata.ProtoBuf$Package.<init>(ProtoBuf.java:11558)
	at org.jetbrains.kotlin.metadata.ProtoBuf$Package$1.parsePartialFrom(ProtoBuf.java:11689)
	at org.jetbrains.kotlin.metadata.ProtoBuf$Package$1.parsePartialFrom(ProtoBuf.java:11684)
	at org.jetbrains.kotlin.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:192)
	at org.jetbrains.kotlin.protobuf.AbstractParser.parseFrom(AbstractParser.java:209)
	at org.jetbrains.kotlin.protobuf.AbstractParser.parseFrom(AbstractParser.java:49)
	at org.jetbrains.kotlin.metadata.ProtoBuf$Package.parseFrom(ProtoBuf.java:11972)
	at org.jetbrains.kotlin.metadata.jvm.deserialization.JvmProtoBufUtil.readPackageDataFrom(JvmProtoBufUtil.kt:40)
	at org.jetbrains.kotlin.metadata.jvm.deserialization.JvmProtoBufUtil.readPackageDataFrom(JvmProtoBufUtil.kt:35)
	at org.jetbrains.kotlin.load.kotlin.DeserializedDescriptorResolver.createKotlinPackagePartScope(DeserializedDescriptorResolver.kt:63)
	at org.jetbrains.kotlin.load.java.lazy.descriptors.JvmPackageScope$kotlinScopes$2.invoke(JvmPackageScope.kt:45)
	at org.jetbrains.kotlin.load.java.lazy.descriptors.JvmPackageScope$kotlinScopes$2.invoke(JvmPackageScope.kt:36)
	at org.jetbrains.kotlin.storage.LockBasedStorageManager$LockBasedLazyValue.invoke(LockBasedStorageManager.java:346)
	at org.jetbrains.kotlin.storage.LockBasedStorageManager$LockBasedNotNullLazyValue.invoke(LockBasedStorageManager.java:402)
	at org.jetbrains.kotlin.storage.StorageKt.getValue(storage.kt:42)
	at org.jetbrains.kotlin.load.java.lazy.descriptors.JvmPackageScope.getKotlinScopes(JvmPackageScope.kt)
	at org.jetbrains.kotlin.load.java.lazy.descriptors.JvmPackageScope.getClassifierNames(JvmPackageScope.kt:81)
	at org.jetbrains.kotlin.resolve.scopes.MemberScopeKt.flatMapClassifierNamesOrNull(MemberScope.kt:62)

This bothers me, and makes me worry about the viability of building any application around the regular invocation of the embeddable compiler. Any help would be gratefully received.

I believe that what you see here is the actual JVM code cache. Its handling is very implementation-specific and not necessarily under user control (although there are some options, e.g. https://docs.oracle.com/javase/8/embedded/develop-apps-platforms/codecache.htm#A1100181).
The scripting engine compiles every script into bytecode, and once that is loaded into the JVM, it is cached in this cache.
So you can try to fine-tune the cache behavior, if your particular JDK provides sufficient control.
Alternatively, you can try to avoid unnecessary compilation, if enough of your scripts are the same. For the latter you’ll need to switch from JSR-223 to the new scripting API and use the script compilation caching it provides.
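To illustrate the idea of avoiding recompilation of identical scripts, here is a minimal sketch of an in-process cache keyed by a digest of the script source. It is not the caching interface of the new scripting API: `CompiledScriptCache` and its `compile` parameter are hypothetical names, and `compile` stands in for whatever call actually produces a compiled script.

```kotlin
import java.security.MessageDigest
import java.util.concurrent.ConcurrentHashMap

// Hypothetical helper, not part of any Kotlin scripting API: caches whatever
// `compile` returns, keyed by a SHA-256 digest of the script source.
class CompiledScriptCache<T : Any>(private val compile: (String) -> T) {
    private val cache = ConcurrentHashMap<String, T>()

    // Compiles `source` only if no entry exists for its digest yet.
    fun getOrCompile(source: String): T =
        cache.computeIfAbsent(sha256(source)) { compile(source) }

    val size: Int get() = cache.size

    private fun sha256(text: String): String =
        MessageDigest.getInstance("SHA-256")
            .digest(text.toByteArray(Charsets.UTF_8))
            .joinToString("") { "%02x".format(it) }
}
```

The cache is unbounded, which only suits a small, slowly changing set of scripts; a real application would probably want some form of eviction.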

Hi Ilya… thank you for your informative reply. Just a couple of follow-up questions:

  • Are instances of KotlinJsr223JvmLocalScriptEngine thread safe? I have found that if I only use one, I don’t get an out of memory error (though memory does still spike all over the place).

  • Is the new scripting API already available? I am using Kotlin 1.3.70, but am curious about any documentation out there.

EDIT:
It may also be worth noting that, assuming the repeated compilation of the script is indeed slowly filling up the non-heap code cache, my experimentation seems to suggest that repeatedly creating the ScriptEngine makes that leak occur even more quickly. I don’t entirely understand this, but using a single engine that is created outside the loop definitely has a positive impact on memory.
If the engine is not thread safe, then this is a definite architectural concern.

The JSR-223 script engine is stateful and not thread safe. (We should probably make it synchronized at all important points, so that it could be used from multiple threads, but that will not make it lighter, because of the statefulness.)
The state in question is the state of compilation and evaluation, which includes loaded dependencies, analyzed code, and much other data required to compile and evaluate the next script that you pass to the engine. The engine is in fact a REPL, so every subsequent compilation and evaluation is performed in the context of all previously compiled scripts.
So keeping one engine running will increase the amount of memory occupied by the state with every compilation.
On the other hand, initializing the state is expensive, so there is a tradeoff between compilation speed and memory consumption. You can probably balance between the extremes by destroying the old engine and creating a new one only occasionally.
With the new scripting API you have more control over the state, although the general problems remain the same. You can also use the compiled-script caching interface to avoid compilation when possible. This is not available in the JSR-223 interface.
The new API has been available for several releases already, but in an experimental state, and it will remain experimental for some time. It is quite stable, though: we are not going to break it often, if at all, but we’d like to keep this freedom for the moment, so that we can fix possible problems more easily.
Documentation is mostly missing for now. But there is a new repository with examples, so maybe it will be enough to get started: https://github.com/Kotlin/kotlin-script-examples. Some concepts are also described in this KEEP (https://github.com/Kotlin/KEEP/blob/master/proposals/scripting-support.md).
BTW, the KotlinJsr223JvmLocalScriptEngine is only an example, and somewhat obsolete. The official basic JSR-223 implementation is distributed in the kotlin-scripting-jsr223 artifact.
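The suggestion above of serializing access to one engine and replacing it only occasionally could be sketched roughly as follows. `RecyclingEngine`, `maxUses`, and `createEngine` are hypothetical names and not part of the Kotlin scripting libraries; the type parameter `E` stands for any stateful, non-thread-safe engine, such as a `javax.script.ScriptEngine`.

```kotlin
// Hypothetical wrapper: shares one engine between threads under a lock and
// replaces it after a fixed number of uses to bound the accumulated state.
class RecyclingEngine<E : Any>(
    private val maxUses: Int,
    private val createEngine: () -> E
) {
    init { require(maxUses > 0) { "maxUses must be positive" } }

    private val lock = Any()
    private var engine: E? = null
    private var uses = 0

    // Runs `block` against the current engine while holding the lock, so only
    // one thread touches the engine at a time. After `maxUses` calls the engine
    // reference is dropped, which lets its accumulated state be collected.
    fun <R> use(block: (E) -> R): R = synchronized(lock) {
        val current = engine ?: createEngine().also { engine = it }
        try {
            block(current)
        } finally {
            uses++
            if (uses >= maxUses) {
                engine = null
                uses = 0
            }
        }
    }
}
```

With the factory from the original post, usage might look like `RecyclingEngine(maxUses = 100) { ENGINE_FACTORY.scriptEngine }` followed by `engines.use { it.eval(script) }`. The value 100 is arbitrary; the right number depends on how much state each compilation adds and how expensive engine creation is.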

Wonderful reply.

I have moved away from the JSR-223 implementation, as I could sense that it was not the most future-proof way forward, and subsequently discovered the new (if experimental) scripting API. After significant experimentation, I finally discovered how to use it with the compiler switches I needed (Kotlin Script Host - Embedded script compilation), and am proceeding with this. Is it desirable or possible that I can submit an additional example to the example git repo? Some of what I’ve discovered may be of use, if it is considered sufficiently idiomatic.

I am intrigued by the caching API you mentioned. My scripts will not change often, and until now I’ve been using my own cache to preserve the outcome of compilation. Do you know if there is an example that covers the use of the script cache?

Cool! Sorry that I left you without an answer for so long that you managed to find all the links yourself.
Btw, this forum is not the fastest place to find answers; the notification service is not ideal. You can try the public Kotlin slack #scripting channel as an alternative.

Is it desirable or possible that I can submit an additional example to the example git repo?

Yes, please! I will very much appreciate this!

About caching - you can have a look at this part, for example - https://github.com/Kotlin/kotlin-script-examples/blob/master/jvm/simple-main-kts/simple-main-kts/src/main/kotlin/org/jetbrains/kotlin/script/examples/simpleMainKts/scriptDef.kt#L63

public Kotlin slack #scripting channel as an alternative.

I didn’t realise this even existed! I had tried the general channel without much success. Thanks again for your follow up.