Kotlin(-Native) as a Go alternative in 2021

Hi there,

I have recently read a thread on this forum from 2019 comparing Kotlin/Native to Go and I was wondering if anything changed since then.

At runtime, at least, Go seems to be superior to Kotlin/JVM (and Java in general) in both resource usage and performance. In practice this means a cut in infrastructure cost of up to 70%. In return, Kotlin/JVM has many language features and a huge ecosystem (thanks to its Java compatibility). But will this be enough to remain competitive, especially in backend development?

It looks like most of Go’s advantages stem from the fact that it’s a native language, so to me it seems only logical to compete with it by also going native. However, I haven’t seen much progress with Kotlin/Native in the last few years. Why do you think this is? Does JetBrains simply not care about this part of the market? Or do they have different plans? Am I missing something here?

I have also seen people say that Kotlin/Native (or native languages in general) just can’t be as fast as Kotlin/JVM. But how can this be when a language like Go already is?

After all, Kotlin remains one of my favorite languages out there and I’d really like to see it become more popular. But maybe I am mistaken and it is already on the path to doing so? I would really like to hear your opinions on this topic.

First of all, you need to remember that most statements about performance are a lie. At least without additional details. Your statement about 70% is definitely a lie save for a limited number of very specific cases (super-small console applications).

Go is in general less optimized than the JVM in throughput and definitely in resource usage. It wins in two areas: minimum memory footprint (the JVM pre-allocates about 40 MB for the VM and utilities and you can’t avoid that) and start time (the JVM starts in interpreter mode and requires a few seconds to run the JIT).

So the answer is the following. In general, your initial assumption is wrong. Any relatively large, relatively long-running application will be more efficient on the JVM. In cases where you need something very small or short-lived, yes, Kotlin-Native is a very good candidate. The performance will be similar to Go, though slower (in general) than Kotlin-JVM. But start time and memory footprint will be smaller. Kotlin-Native also supports coroutines, which makes one of Go’s killer features (goroutines) not so killer anymore. It lacks some critical libraries like an HTTP server, but they are working on it. I think that after the new memory model is released, it will be a very good tool for smaller applications as well.

Even if this were true, does it really matter? Large software companies spend way more on engineer salaries than on computing costs anyway.

As your whole post is based on this assumption, you should probably start by explaining where you got this information from and/or whether you tried to run some benchmarks yourself.

I have never used Go, but I believe the JVM is one of the fastest runtimes ever invented. In most cases its performance is comparable to highly optimized C++ code or surpasses it. It probably won’t be the best for cases like image processing (lack of pointers and direct memory access) and it usually consumes much more memory than C++, but in general it is pretty good performance-wise.

@broot @darksnake
The assumption is based on the benchmarks which I read before writing this post. I have linked some of them below. They show that Go has slightly better performance in synthetic benchmarks and clearly better performance in application benchmarks (e.g. an HTTP API). The memory usage of the JVM is a lot higher throughout basically all benchmarks, even when accounting for the 40-80 MB required for the JVM itself.

a) Go vs Java - Which programs are fastest?
b) Go VS Kotlin benchmarks, Which programming language or compiler is faster
c) Server-side I/O: Node vs. PHP vs. Java vs. Go | Toptal
d) Kotlin http4k (via GraalVM Native Image) and Golang | Lambros Petrou

The cost calculation was based on the peak of the benchmarks, thus up to 70%. My way of thinking was the following: If Golang can handle twice the amount of requests using 1/8 the memory, then you’ll need half the cores and 1/8 of the memory. But of course this is only true if the benchmarks are correct. Thus my question, how would you suggest to measure it? Do you know any more accurate benchmarks?
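
To make that reasoning concrete, here is a minimal sketch of the arithmetic. The function name and the 50/50 split between CPU and memory cost are assumptions for illustration only; the real split depends on the provider's pricing and is not taken from the benchmarks.

```kotlin
// Hypothetical helper: cost after a migration as a fraction of the old cost.
// cpuShare is the fraction of the old bill spent on CPU; the rest is memory.
fun relativeCost(cpuFactor: Double, memoryFactor: Double, cpuShare: Double): Double {
    require(cpuShare in 0.0..1.0) { "cpuShare must be between 0 and 1" }
    return cpuShare * cpuFactor + (1.0 - cpuShare) * memoryFactor
}

fun main() {
    // From the post: half the cores and one eighth of the memory.
    // Assumption: CPU and memory each make up half of the original bill.
    val cost = relativeCost(cpuFactor = 0.5, memoryFactor = 0.125, cpuShare = 0.5)
    println("relative cost: $cost, saving: ${(1.0 - cost) * 100} %")
}
```

Under these assumptions the new cost is 0.5 * 0.5 + 0.5 * 0.125 = 0.3125 of the old one, a saving of roughly 69%. The result is only as good as the two input factors, which come straight from the peak benchmark numbers.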

Regarding c) and d) they basically compare two different kinds of servers: blocking (Java/Kotlin) and non-blocking (Go). As non-blocking servers were invented mostly for their increased performance, it is no surprise Java/Kotlin lose in this battle.

I would be more interested in comparison with e.g. Ktor or Netty.

a) Hah, I’ve just written a twitter thread about CLBG fail in this particular case: https://twitter.com/noraltavir/status/1470491224650010631?s=20

b) Another microbenchmark inspired by CLBG or forked from it, with the same problems. JMH is not used, so all those results for the JVM are garbage even if one could compare implementations (which one usually cannot; see point a).

c) Have you actually read it? It compares blocking calls with non-blocking calls. And talks about Java even without CF. It does not have anything to do with language anyway.

d) Again, you should read it yourself. It clearly shows the same level of performance even with Substrate on multiple requests. The first call could indeed be slower on the JVM.

The only benchmark I know of that is at least somewhat fair is the TechEmpower Framework Benchmarks. It compares specific implementations, not languages. It is still a lie if you do not know how to read it, since some implementations are optimized specifically for those problems.

I agree it doesn’t make much sense to compare blocking and non-blocking implementations, so I’ve tried my best to create my own benchmark comparing Go (using gin) and Kotlin (using Ktor with Netty).

Both servers implement a /static route which serves a static string and a /file/<name> route which serves the file with the given name from the data directory.

I’ve used bombardier with default settings for the tests. Because the JVM needs some time to get “hot” I’ve done multiple tests on both servers, waiting a few seconds between each of them, and took the best results for both, respectively.
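
The same "warm up, repeat, keep the best run" idea can be sketched in-process. This is only an illustration with a hypothetical helper name; it is not how bombardier was driven, and for serious JVM microbenchmarks a harness such as JMH is the usual recommendation.

```kotlin
import kotlin.system.measureNanoTime

// Hypothetical helper: runs `block` unmeasured a few times so the JIT has a
// chance to compile it, then returns the fastest of `runs` measured runs (ns).
fun bestOfNanos(warmups: Int, runs: Int, block: () -> Unit): Long {
    require(runs > 0) { "runs must be positive" }
    repeat(warmups) { block() }
    return (1..runs).minOf { measureNanoTime(block) }
}

fun main() {
    var sink = 0L // accumulates a result so the loop is not trivially dead code
    val best = bestOfNanos(warmups = 5, runs = 10) {
        for (i in 1..1_000_000) sink += i % 7
    }
    println("best run: ${best / 1_000} us (sink=$sink)")
}
```

Taking the best run hides variance, so it is worth reporting the spread as well when comparing two runtimes.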

The /static route:

== Kotlin
Statistics        Avg      Stdev        Max
  Reqs/sec    118952.59    9837.47  137048.63
  Latency        1.05ms   748.43us    78.00ms
  Throughput:    19.28MB/s
== Go
Statistics        Avg      Stdev        Max
  Reqs/sec    155828.78   12136.14  184001.84
  Latency      799.99us    97.91us    26.00ms
  Throughput:    30.77MB/s

The /file/<name> route with the test.txt file:

== Kotlin
Statistics        Avg      Stdev        Max
  Reqs/sec     13226.78     849.23   16523.10
  Latency        9.44ms   291.74us    30.00ms
  Throughput:     2.32MB/s
== Go
Statistics        Avg      Stdev        Max
  Reqs/sec     22838.06    1152.13   24500.61
  Latency        5.47ms     1.93ms    95.12ms
  Throughput:     5.92MB/s

I’m not an expert on Ktor or gin (or benchmarking in general), so please tell me if I made any mistakes here. You can find the source code here.

Like I said in my previous post, I agree it doesn’t make sense to compare vastly different implementations. However, I wonder: if, like you said, there aren’t many good benchmarks, then how can you be so sure about the JVM’s performance compared to native languages? It seems rather counter-intuitive to me that a JIT-compiled program can be faster than an already compiled one. I mean, a JIT compiler can’t do much more than a normal compiler (except that it allows for multi-platform support), can it?

Regarding the TechEmpower benchmark: what do you mean by “know how to read it”? Can you explain what you’re reading from it? Because to me it doesn’t make the JVM look any better than before.

I am working with high-performance mathematics and comparing not with Go, but with C++ analogs. And in general, JVM gives the same level of performance for the same code if there are no mistakes and much faster if there are. Of course, highly optimized code is usually better because… well… it is optimized. Including hardware optimization.

Your benchmark looks interesting. Something the Ktor team should look at. But you need to remember that Ktor is far from the most performant framework for networking. The fastest ones would probably give the same result as Go. In this case, I wonder what the bottleneck is. Have you tried to profile it? My guess is that copyTo is not the best way.

And, you can see yourself that it is not 70% at all.

Also, you are using a small file. What will happen with a file size of 100 kB? The default buffer capacity in Ktor pulling is few kB and it is not effective for very small messages.

And that is what I call “how to read”. You can’t create a framework that is good both for large files and small files, for a lot of threads and for one thread, etc. If you are using the framework in non-optimal conditions, you will get non-optimal results. And that is why benchmarks should be always taken with a grain of salt.

Ok, you got me, I’m definitely unprepared for this discussion. I don’t have any evidence supporting my claims. This is just what I believe, what I saw in many benchmarks and also because Java is used in places where the performance is absolutely critical, like for example in High-Frequency Trading (HFT).

I once read a blog Mechanical Sympathy by Martin Thompson who was implementing a HFT platform. HFT is where you literally get cash by squeezing some nanoseconds from the execution time. There were some crazy optimizations, the team disassembled JITed code and analyzed it on a daily basis. Code was optimized for a specific CPU, it was taking into account how the CPU copies the data across its internal cache, etc. The team once reported a bug to Intel, because in some generation of their CPUs the cache didn’t work as expected.

If I remember correctly, the author of this blog said in one of the posts that people sometimes ask them why they use Java and not C++. He explained that it doesn’t really matter from the performance perspective and they just prefer Java. This isn’t proof at all, but it sounds convincing, at least to me.

Actually, it can do much more. Once again, I’m far from being an expert and I can only guess how exactly JVM optimizations work. But generally speaking, at runtime we have much more information about the application than at compile time. By analyzing the running application JVM could understand its behavior at much higher level than by analyzing its code statically.

One example that is simple to understand, realistic (so it most probably happens in practice) and makes a difference to performance is de-virtualizing function calls. Imagine we have an interface and two implementations that are swapped via configuration. My guess is that the JVM notices only one implementation is ever instantiated and replaces virtual calls with direct jumps. Correct me if I’m wrong, but I think this is not possible with “classic” compilers.
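
A minimal sketch of that scenario, with hypothetical names throughout (the `GREETER_MODE` variable and the classes are made up for illustration). It only shows the shape of the code; whether and when the JIT actually de-virtualizes the call is up to the JVM.

```kotlin
interface Greeter {
    fun greet(name: String): String
}

class PlainGreeter : Greeter {
    override fun greet(name: String) = "Hello, $name"
}

class LoudGreeter : Greeter {
    override fun greet(name: String) = "HELLO, ${name.uppercase()}!"
}

// The implementation is chosen once, from configuration.
fun greeterFor(mode: String): Greeter =
    if (mode == "loud") LoudGreeter() else PlainGreeter()

fun main() {
    // Hypothetical configuration source: an environment variable.
    val greeter = greeterFor(System.getenv("GREETER_MODE") ?: "plain")
    var total = 0
    // Statically this is an interface call, but during any single run this
    // call site only ever sees one concrete class, which is the situation
    // the post describes.
    repeat(1_000_000) { total += greeter.greet("world").length }
    println(total)
}
```

A static compiler sees both classes in the source and cannot know from the code alone which one the configuration will select; the running JVM can observe that only one of them ever reaches the call site.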

Also, I’m aware that contrary to what I said about JIT superiority, in most cases Java is actually slower than C++. I’m not sure what is the main cause of this, but my guess would be that it’s because Java is somewhat “heavier”. It disallows many unsafe operations that are possible in C++. There are boundary checks on array accesses, there is the GC, etc. But I believe (once again, I’m not 100% sure) in most cases the difference in speed is not that big and doesn’t really matter.

Is it just a problem of how we use the language(s) then? I mean, Go is booming in the “high performance” network sector (NTP servers, DNS servers, web servers). Meanwhile, JVM languages never took off here and don’t seem to be doing so any time soon. What do you think is the problem here?

I thought about profiling it but I’ve got no idea what’s the proper tooling for doing so and how to use it, especially with coroutines involved. Any suggestions? If not I guess I’ll open an issue at ktor, but I’m not sure if that’s something they care about.

The 70% includes both CPU and memory consumption. Go is miles ahead in the latter, even when ignoring the base memory required for the JVM to start. I was crediting about half (35%) of the savings to it.

I tried the same test using a ~30 KB JSON file now. Doing so didn’t make it better tho. However, I’ve tried a few different methods of sending the file and it turns out reading the whole file with plain IO and sending it is by far the fastest (tho still slower than Go). Unfortunately that’s also the least viable option for production use due to its high memory consumption.

== Kotlin (using copyTo)
Statistics        Avg      Stdev        Max
  Reqs/sec      7371.38    1055.40   12343.08
  Latency       16.96ms     1.03ms    49.00ms
  Throughput:   204.98MB/s
== Kotlin (using respondFile)
Statistics        Avg      Stdev        Max
  Reqs/sec      6822.90     530.71    9470.35
  Latency       18.31ms     0.97ms    45.95ms
  Throughput:   189.36MB/s
== Kotlin (using respondBytes & readBytes)
Statistics        Avg      Stdev        Max
  Reqs/sec     15052.26     628.46   16040.19
  Latency        8.30ms   266.92us    29.00ms
  Throughput:   417.24MB/s
== Kotlin (using respondOutputStream and inputStream().copy)
Statistics        Avg      Stdev        Max
  Reqs/sec      5748.21    2933.21   12750.57
  Latency       21.72ms    14.11ms   471.24ms
  Throughput:   159.88MB/s
== Go
Statistics        Avg      Stdev        Max
  Reqs/sec     20854.18     836.74   22150.55
  Latency        5.99ms    37.96ms      1.90s
  Throughput:   580.75MB/s
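
The trade-off between the "read everything" and the "stream through a buffer" variants can be sketched with plain `java.io`. This does not use Ktor's API at all; the function names and the 8 KB buffer size are assumptions chosen for illustration.

```kotlin
import java.io.ByteArrayOutputStream
import java.io.File
import java.io.OutputStream

// Loads the entire file into one array: a single write, but the memory
// needed per request grows with the file size.
fun sendWhole(file: File, out: OutputStream) {
    out.write(file.readBytes())
}

// Streams through a fixed-size buffer: bounded memory per request,
// at the price of more read/write calls.
fun sendStreamed(file: File, out: OutputStream, bufferSize: Int = 8 * 1024) {
    file.inputStream().use { input -> input.copyTo(out, bufferSize) }
}

fun main() {
    // A throwaway ~30 KB file standing in for the JSON payload.
    val file = File.createTempFile("payload", ".json").apply {
        writeBytes(ByteArray(30 * 1024) { (it % 128).toByte() })
        deleteOnExit()
    }
    val whole = ByteArrayOutputStream().also { sendWhole(file, it) }
    val streamed = ByteArrayOutputStream().also { sendStreamed(file, it) }
    println(whole.toByteArray().contentEquals(streamed.toByteArray()))
}
```

Both variants send the same bytes. With many concurrent requests or large files, the first one keeps a full copy of the file in memory per request, which is the memory concern mentioned above.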

This blog looks pretty interesting, guess I’ll read through it in detail some time. It seems like when the code is optimized, the JIT can do just as much as a normal compiler. But for some reason most common JVM libraries just don’t keep up with their common native competitors. Maybe it’s just harder to write efficient JVM code? At least I can’t think of another reason.

Go and Rust are “safe” too, so I don’t think that’s a reason. But they are lighter in the sense that you often have far fewer objects (or structures when talking about functional languages).

Ohh, actually there is even an article on this blog about “de-virtualize optimization” I mentioned: Mechanical Sympathy: Invoke Interface Optimisations It shows how the JVM adapts to the changing number of interface implementations and that it replaces already JITed code with a new one when needed. Also, the author says JVM does not only replace virtual calls with direct calls, but even inlines the function at the call site. Which is only possible if there are very few implementations at runtime.

So my understanding is that ultimately JIT has the potential to be more performant than AOT. But I can’t answer why it currently is not. Maybe we need some time, maybe Java has other performance problems (almost everything is on the heap, etc.) or maybe it requires more care from the developer not to degrade performance. I don’t know :)

edit:
More about this kind of optimization: Inline caching - Wikipedia

Hi darksnake,

I’m evaluating Kotlin and Go for some console applications. I’m not sure if they count as “super small”, though.

  1. Minimal JavaScript bundler, which would process a large bunch of files at once on each call.

  2. Command line frontend to ufw and firewalld, so that user can use the same command on Ubuntu and openSUSE.

What do you think about each of these use cases? Go or Kotlin?

Are we talking about Kotlin-JVM or Kotlin-Native? I would take Kotlin-JVM for the first one. It seems to be long-running and there are existing tools on the JVM, like Closure Compiler.

For the second one, I probably would not take Kotlin-JVM because it is not good for short-running command-line utilities. But no difference between Go and Kotlin-Native. Whatever you like best.

I think this is much more important than the performance of the language. Took me about two (or three) decades to understand this in depth.

In 90% of use cases performance simply does not matter. In 5% it depends on the programmer, the remaining 5% is about the language, compiler, native vs interpreted etc.

Probably it is not really in focus and I feel it actually shouldn’t be in focus.

I can buy a new server with 32 cores and 256GB memory for like one month salary of one software engineer. So for me delivering the software in time and in good quality is much more important than having performance advantage.

Also, if you think in cloud terms and calculate how much monthly seats usually cost, you will realise that a single customer with 20 seats pays the actual server costs in about a year.

Sorry, I should have been more detailed.

I was evaluating all three of Kotlin/JVM, Kotlin/Native (in its current beta state) and Go.

  1. For the minimal JavaScript bundler, it sure is long running but it would be called a lot because:

    It would not support watch, so something like onchange calls it every time any of the JavaScript files changes.

    It would not support incremental compilation, so on each call everything specified is reprocessed, even if no changes were made.

    However, it would be crucial to output the files within 5 seconds of each call. This is feasible because the user would only be specifying a small set of JavaScript files on each call. Something like a single page application with minimal runtime dependencies. The user would not be using React or other relatively big libraries.

    To solve the problem of the second instance being called while the first instance is still processing, onchange sends a signal for the already running instance(s) (if any) to stop and then starts up the new instance.

    I was planning to use Kotlin/JVM for this task, however the design of this application means it would have a lot of shutdowns and startups. Is Kotlin/JVM still suited or should anything else be used?

  2. Yes, this command line frontend to ufw and firewalld is to be called a lot and short-running in each call.

    However, it would also support a “Terminal UI” mode. In that case, the application is not going to be short-running.

    Should Go or Kotlin/Native still be used for this application, considering the requirements?