Tips for reducing memory usage

Hi all,

I’m trying to run a couple of Kotlin microservices on an Amazon Lightsail server, and I’m running into the problem that even one application uses up a lot of the limited RAM. The server has 1 GB of RAM, and when I run both applications, they usually chew up all the RAM within 24 hours.

I’m looking for any tips people have for:

  1. Modifying my Kotlin code to reduce memory usage
  2. Options to reduce the memory usage of the JVMs
  3. Any other advice

I’ve thought about either trying to find a JDK that specializes in low memory usage, or simply compiling native binaries (either with Kotlin Native, or with another tool that can compile Java code to native machine code). I’ve also wondered if it’s possible to run multiple jar files in one JVM, so that there’s only one JVM’s worth of memory overhead.

I’m also ideally planning a third Kotlin microservice to run on this server… but with the way things are looking right now, I might have to write it using something with a much lower memory footprint (maybe Python is good for low memory usage? idk)

I’ve currently got the max heap constrained to 128 MB for both services; the last time the server ran out of memory, they were each constrained to a 256 MB max heap. There are a few other things running on this server that use a bit of memory as well (a MariaDB server and a PHP server).

How good is Kotlin Native compared to using a JVM? When I was searching for threads earlier, I saw one that said Kotlin Native is slower compared to JVM; is that still the case?

All suggestions welcome. :slight_smile:

You might try taking a heap dump, and examining it to see what objects are taking up the space. You might find that some are obviously no longer needed, or are taking far more memory than you expect — which would indicate the sort of code changes that might help.

You can trigger a heap dump at any time using tools such as jmap, jcmd, or JVisualVM. You can also get one automatically at the point when the JVM runs out of memory by adding the -XX:+HeapDumpOnOutOfMemoryError command-line option, or programmatically using a HotSpotDiagnosticMXBean. You can then analyse the dump with a tool such as jhat (removed in JDK 9; VisualVM or Eclipse Memory Analyzer do the same job on newer JDKs).

Bear in mind that the heap dump file can be large, and take some time to write, so ensure you have enough disk space and patience. Analysing it can take some patience, too — but it’s the only way to be sure exactly what’s taking up your memory! I often find that many of the objects are stored deep in arrays and other structures; it helps if you can follow the chain of references back to something you recognise.
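For reference, here is a minimal sketch of the programmatic route. It assumes a HotSpot-based JVM (on other JVMs the bean lookup may fail), and the helper name `dumpHeap` and the file name are made up for illustration.

```kotlin
import com.sun.management.HotSpotDiagnosticMXBean
import java.io.File
import java.lang.management.ManagementFactory

// Hypothetical helper: writes a heap dump of the current JVM to [path] and returns the file.
// liveOnly = true dumps only objects that are still reachable.
fun dumpHeap(path: String, liveOnly: Boolean = true): File {
    val file = File(path)
    // dumpHeap refuses to overwrite an existing file, so remove any stale dump first.
    file.delete()
    val bean = ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean::class.java)
    bean.dumpHeap(file.absolutePath, liveOnly)
    return file
}

fun main() {
    // Recent JDKs expect the file name to end in ".hprof".
    val dump = dumpHeap("idle-service.hprof")
    println("Wrote ${dump.length() / (1024 * 1024)} MB to ${dump.absolutePath}")
}
```

In a real service this would more likely sit behind an admin endpoint or a signal handler than in `main`. The resulting file can be opened in whichever heap analyser you prefer.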

Right, I should have mentioned this in the first post: the applications aren’t doing anything right now. I’ve only just started deploying them, so they’re not being used; they’re just sitting idle. But they’re still chewing up ~200 MB of RAM each. Is this just “JVM uses a lot of memory”, or can I do things to reduce it?

For anyone following this thread, I found out pretty quickly that Kotlin/Native is NOT the awesome solution that I thought it would be. :stuck_out_tongue: Kotlin/Native doesn’t work with existing Java libraries, I guess since it’d have to decompile and recompile to native code, or compile the Java bytecode to native code… so since my projects are written using a bunch of Java libraries, Kotlin/Native is not an option for me unless I’m willing to rewrite all my code… which I’m not. If I was going to go to that extreme, I’d actually research which language has the least resource consumption and use it, instead.

So for now, I’m looking at GraalVM, and hoping it’ll be the silver bullet to reduce the resource usage.

Without a heap dump it is impossible to know what the heap contains, so please consider the previous suggestion.
What is the error you get? Understanding your problem better can help us.
How much of the container’s memory does the heap use? 25%? 50%? The JVM and Java code allocate non-heap memory, too.
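To see the heap versus non-heap split from inside a running service, the standard `java.lang.management` beans can be queried. This is only a rough sketch, and the helper names `toMb` and `describe` are invented here. The figures cover the memory pools the JVM reports (heap, metaspace, code cache and so on) and still leave out thread stacks and other native allocations, so the process size seen by the OS will be higher.

```kotlin
import java.lang.management.ManagementFactory
import java.lang.management.MemoryUsage

fun toMb(bytes: Long): Long = bytes / (1024 * 1024)

// Formats one MemoryUsage snapshot; max is -1 when the JVM reports no limit.
fun describe(label: String, usage: MemoryUsage): String {
    val max = if (usage.max < 0) "undefined" else "${toMb(usage.max)} MB"
    return "$label: used=${toMb(usage.used)} MB, committed=${toMb(usage.committed)} MB, max=$max"
}

fun main() {
    val memory = ManagementFactory.getMemoryMXBean()
    println(describe("heap", memory.heapMemoryUsage))
    println(describe("non-heap", memory.nonHeapMemoryUsage))

    // Per-pool breakdown; usage can be null for a pool that is no longer valid.
    for (pool in ManagementFactory.getMemoryPoolMXBeans()) {
        pool.usage?.let { println(describe("  ${pool.type} / ${pool.name}", it)) }
    }
}
```

Comparing the "committed" totals with the resident size reported by the OS gives a rough idea of how much memory sits outside anything a max-heap setting controls.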

I’m not getting an error. I’m simply finding that the memory usage of the Java applications is too high for the limited RAM available on the server I’m using.

I’m not running the application in any kind of Docker container, if that’s what you mean by container? After clamping the max heap size to 128m, each application is using roughly 20% of the available 1GB of RAM.

Also, I realise now that this probably isn’t an appropriate topic for this forum, so perhaps it should be closed. One of my applications is using Dropwizard, so it’s basically all Java code. The other one is using Ktor, but I’m suspecting that the cause of the high memory usage is really just the JVM, and there’s nothing Kotlin-specific I can do to improve it. :slight_smile:

What exactly do you mean by saying the memory usage is too high? That they consume more than the configured max? That they already consume the max even though they are not doing much yet? Please note that unused RAM is wasted RAM. Applications and operating systems tend to consume as much RAM as possible, but that doesn’t necessarily mean they can’t run with a smaller memory footprint.

If you set the max heap size to 512MB, in many cases you can expect the application to eventually consume 512MB. But if you reduce it to 256MB, it is possible the application will still run correctly while consuming 256MB. Likewise, if you implement new functionality and expect it to require additional memory, that doesn’t necessarily mean 512MB will be too low just because the application already consumed that much without the new feature.

If you went to 128MB and everything works correctly, then what exactly is your concern? Maybe I misunderstood something in your post.

Also, 100-200MB is pretty low for a JVM application. The JVM has its initial overhead, so we can’t get very low with it. Using a single JVM for multiple services could probably help, especially if you plan to add more services in the future. I have never used GraalVM, but I would definitely look in that direction.

When I was running two Java applications with a max heap of 256m each, they consumed all the RAM on the server (there are a few other apps running on the server, such as a MariaDB instance and a PHP server), at which point the server locked up at 100% CPU utilisation. This has happened a few times: the Java applications consume all available RAM, and then the server locks up with 100% CPU usage. I have to force-stop it via the AWS management console. I think this is because Ubuntu is trying to offload some memory to a swap file, but there’s no swap file configured, so the server locks up trying to use a non-existent one.

My concern right now is that with both apps idling with a max heap of 128m, the available RAM on the server is around 90MB, and I have a third application I plan to write and deploy to this server. Initially I was planning to write it in Java, but there’s no way I can get a third JVM running in 90MB of available RAM. So unless I can drastically reduce the memory usage of my currently idle JVM applications, my third application is going to have to be written in a different language that consumes a lot less RAM.

(I’m highlighting the fact that my apps are currently idling because I have no idea how much memory they’ll eat when they actually start serving requests)

I did wonder whether using a single JVM for multiple services would be possible… but is it? Each jar file has its own entry point… can a single JVM run multiple Java programs simultaneously?

(Again, I think this thread really has nothing to do with Kotlin, and is more of a general Java question, so mods, please close this thread if you think it’s inappropriate)

@Skater901, I believe implementing some metrics or logs would be beneficial in understanding the root cause of the problem. Is it possible that unusual network requests are triggering these CPU spikes? (*)

Based on my understanding, you haven’t encountered any memory errors, and you suspect CPU spikes due to swapping, although the server isn’t utilizing swap space.

While running multiple applications on the same JVM is feasible with certain limitations, it might not be advisable as it can be challenging to allocate resources effectively among services. You might want to explore Apache Tomcat for better management.

Here are my suggestions:

  1. Tuning a server while it’s idle is not productive unless an idle server is specifically required.

  2. Similarly, optimizing for a language’s performance during idle times may not necessarily reduce costs. If the workload is light, consider using AWS Lambda and pay per request. Although AWS Lambda might not be ideal, it can help cut down costs. However, if you anticipate a significant increase in load, remember that CPU resources are costly whereas RAM is relatively inexpensive. JVM utilizes RAM to enhance CPU throughput, which is by design.

  3. Docker (e.g., Docker Compose) could be employed to isolate resources between services, preventing a single service from monopolizing all resources and potentially leading to a total server failure. This approach can help limit RAM/CPU usage and facilitate easier scaling in the future.

  4. Lastly, you might want to explore using ShenandoahGC instead of G1 to see if it better suits your use case.

*) Regarding unusual network requests causing CPU spikes, we encountered a similar issue with WordPress+MySQL when an attacker inundated the server with numerous HTTP requests.
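Before swapping collectors as in suggestion 4, it may be worth checking which collector the JVM actually selected and how busy it is; on a small machine the JVM may not have picked G1 in the first place. A small sketch using the standard management beans, with `gcSummary` as a made-up helper name:

```kotlin
import java.lang.management.ManagementFactory

// One line per collector the running JVM exposes, with cumulative counts since startup.
fun gcSummary(): List<String> =
    ManagementFactory.getGarbageCollectorMXBeans().map { gc ->
        "${gc.name}: collections=${gc.collectionCount}, totalTimeMs=${gc.collectionTime}"
    }

fun main() {
    gcSummary().forEach(::println)
}
```

The bean names differ between collectors, which makes it easy to confirm that a GC flag took effect, and an idle service with a steadily growing collection count is a hint that something is still allocating.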

The memory usage shouldn’t increase over time for a well-developed application. You probably have a memory leak somewhere.

Yes, running multiple programs in a single JVM is possible through ClassLoader. However, the JVM overhead is around 100MB, and any decent OS should be able to share it between both instances (unless the processes are containerized or in VMs), so it probably wouldn’t make much of a difference.
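A rough sketch of what the ClassLoader approach could look like. Everything here is hypothetical: the `<jar>=<mainClass>` argument format, the `ServiceSpec` type and the assumption that each service is a self-contained fat jar with a standard `main(String[])` entry point. It also needs Java 9 or later for `getPlatformClassLoader()`.

```kotlin
import java.io.File
import java.net.URLClassLoader
import kotlin.concurrent.thread

// One service to launch: a jar plus the fully qualified name of its main class.
data class ServiceSpec(val jar: File, val mainClass: String)

// Parses an argument of the (made-up) form "path/to/app.jar=com.example.MainKt".
fun parseSpec(arg: String): ServiceSpec {
    val parts = arg.split("=", limit = 2)
    require(parts.size == 2) { "Expected <jar>=<mainClass>, got: $arg" }
    return ServiceSpec(File(parts[0]), parts[1])
}

// Starts the service's main() on its own thread, isolated in its own class loader.
fun launch(spec: ServiceSpec): Thread {
    // The platform loader as parent means the jar sees the JDK but not the launcher's classes.
    val loader = URLClassLoader(arrayOf(spec.jar.toURI().toURL()), ClassLoader.getPlatformClassLoader())
    return thread(name = spec.mainClass, contextClassLoader = loader) {
        val main = loader.loadClass(spec.mainClass).getMethod("main", Array<String>::class.java)
        main.invoke(null, emptyArray<String>())
    }
}

fun main(args: Array<String>) {
    if (args.isEmpty()) {
        println("usage: launcher <jar>=<mainClass> [<jar>=<mainClass> ...]")
        return
    }
    args.map(::parseSpec).map(::launch).forEach { it.join() }
}
```

The services then share one heap limit, one set of system properties and one `System.exit`, so a failure in one can take down the others. How much memory this saves in practice is not something the sketch demonstrates.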

Kotlin/Native being slower than the JVM will always be the case. The JVM has had roughly three decades’ worth of research into speed optimizations, and Kotlin/Native will never catch up. But that’s not that bad, since the main goal of Kotlin/Native is to run on devices where the JVM cannot run.

In general, here are some of the most impactful changes in projects I’ve encountered.

#1 Use a serialization library that can reuse memory, or avoid allocating any, while deserializing or serializing your messages… it will most likely massively improve your throughput as well. Things not to use: any random JSON library (don’t use JSON at all), protobuf… What to use? Chronicle Wire, SBE, or Cap’n Proto…

#2 Avoid frameworks unless you absolutely have to use them… most of what people use is bloat; Spring, ORMs and other magic are unnecessary if you really boil things down to their core.

#3 Use ZGC or Shenandoah; there is no reason to use any other GC at this point, everything else including G1 is obsolete…

#4 Take a look at the number of threads in your apps. If you have more than 2-3 × the number of CPU cores, you have a big problem… not only can threads use a lot of memory, they also add a lot of context-switching overhead.

#5 After doing all that, if you still haven’t achieved your goal, use some tools to track the memory allocation rate and where it comes from. There are quite a few that can help you with that, including lots of free options.
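For point #4, the thread count can be checked from inside the application without any extra tooling. A small sketch, where `ThreadReport` and `threadReport` are names invented here and the 2-3× figure is the rule of thumb from the list above rather than a hard limit:

```kotlin
import java.lang.management.ManagementFactory

data class ThreadReport(val live: Int, val peak: Int, val cores: Int) {
    val perCore: Double get() = live.toDouble() / cores
}

// Snapshot of live and peak thread counts next to the number of available cores.
fun threadReport(): ThreadReport {
    val threads = ManagementFactory.getThreadMXBean()
    return ThreadReport(threads.threadCount, threads.peakThreadCount, Runtime.getRuntime().availableProcessors())
}

fun main() {
    val r = threadReport()
    println("live=${r.live} peak=${r.peak} cores=${r.cores} perCore=${"%.1f".format(r.perCore)}")
    // Listing names usually shows which pool (HTTP server, DB pool, scheduler) owns the threads.
    Thread.getAllStackTraces().keys.sortedBy { it.name }.forEach {
        println("  ${it.name} (daemon=${it.isDaemon})")
    }
}
```

Each thread reserves its own stack outside the heap, so on a 1 GB machine a framework's default pool sizes can be worth a look.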

As you say, serialisation can be an issue — and it can be well worth investigating.

One of the biggest wins I managed was in a complex app that effectively did its own simple serialisation of a deep, wide object tree, using a method in each class along the lines of:

// String templates call toString() on each sub-object and build a new String on every call.
override fun toString() =
    "$subobject1|$subobject2|$subobject3"

(Only much more complex, of course. Actually, this was in Java, but the effect’s identical.)

As you can probably see, that created a lot of temporary objects: each call creates a StringBuilder instance, which is then converted to a String. And the same applies for each of the sub-objects, and all their sub-objects, and so on… The final string could be many KB long, all parts of which had been copied umpteen times by then. And the app sometimes needed to serialise tens of thousands of these per second for sustained periods, on top of everything else it was doing.

You can probably see how I eliminated all of those temporary objects, by rewriting those methods so that instead of returning a String, they append their stuff to a StringBuilder that’s passed in:

fun toString(sb: StringBuilder) {
    subobject1.toString(sb)
    sb.append('|')
    subobject2.toString(sb)
    sb.append('|')
    subobject3.toString(sb)
}

Yes, the code is a lot uglier and more long-winded — but does no object allocations at all. (And in the top-level loop controlling all this, I was able to reuse the same StringBuilder each time.) This fairly simple change had a major impact, greatly improving the app’s handling of peak loads: IIRC, instead of doing a full garbage collection several times each minute, it only needed one every hour or so.

(You can do similar with methods that return a List, too, of course.)
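A self-contained sketch of the same append-into-a-shared-builder pattern, for anyone who wants to try it. The names `PipeWritable`, `Leaf` and `Node` are invented for illustration and are not from the original app.

```kotlin
// Anything that can write its serialised form into a caller-supplied builder.
interface PipeWritable {
    fun writeTo(sb: StringBuilder)
}

class Leaf(private val value: Int) : PipeWritable {
    override fun writeTo(sb: StringBuilder) {
        sb.append(value) // append(Int) writes the digits directly, with no intermediate String
    }
}

class Node(private val children: List<PipeWritable>) : PipeWritable {
    override fun writeTo(sb: StringBuilder) {
        // Index loop so that the traversal itself does not need an iterator object.
        for (i in children.indices) {
            if (i > 0) sb.append('|')
            children[i].writeTo(sb)
        }
    }
}

fun main() {
    val tree = Node(listOf(Leaf(1), Node(listOf(Leaf(2), Leaf(3))), Leaf(4)))
    val sb = StringBuilder(1024) // allocated once, reused for every message
    repeat(3) {
        sb.setLength(0)          // reset the length without giving up the backing array
        tree.writeTo(sb)
        println(sb)              // printing still creates a String; a real sink could consume sb directly
    }
}
```

The builder's backing array can still grow if the initial capacity is too small, but once it has reached its working size, repeated serialisation reuses it.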

Short-lived temporary objects can be some of the hardest to track down. (Many garbage collectors are tuned to handle them efficiently, so it’s often not worth the bother — but this was a fairly extreme case. Also, it was many years ago, so maybe it wouldn’t have made such an impact with more recent GCs.) I remember poring over heap dumps, but I think I spotted this case just by looking at the code and considering what object allocations it might be doing.

That’s still serialization, except you are serialising into strings instead of binary… If that’s your goal, why not use a ThreadLocal StringBuilder and keep the toString() signature intact? It would be much cleaner and work with any logging you’re doing…

In general, a lot of applications spend most of their CPU/memory on logging. In my field we do not do that: a well-behaved application almost never logs anything, but all actions are events that get stored persistently… so we have something much better than logging. But yes, logging will be #1, to be honest, for normal apps.

@vach The standard toString() function returns a String, so even if you reuse a StringBuilder, every time it’s called it will have to return a newly-created String object (unless you do some caching, which wouldn’t have been suitable here). And the same goes for all of its sub-objects, and all of their sub-objects, and so on. In our case that would still have been a lot of new objects, and a lot of string copying. (Also a lot of ThreadLocals holding StringBuilders, though those would have been reused.)

Whereas the new version doesn’t create any new objects when it’s called, no matter how wide and deep the tree is.

ThreadLocal StringBuilders and similar objects can certainly be very useful in other situations, though. (ThreadLocal is one of those really useful little classes that should be better known!)
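For anyone who hasn’t used it, a minimal sketch of a ThreadLocal scratch builder. The names `scratch` and `joinWithPipes` are made up, and the final `toString()` still allocates the result, as discussed above.

```kotlin
// One reusable builder per thread, created lazily the first time each thread asks for it.
val scratch: ThreadLocal<StringBuilder> = ThreadLocal.withInitial { StringBuilder(256) }

fun joinWithPipes(parts: List<Any?>): String {
    val sb = scratch.get()
    sb.setLength(0) // discard whatever the previous call on this thread left behind
    for (i in parts.indices) {
        if (i > 0) sb.append('|')
        sb.append(parts[i])
    }
    return sb.toString() // the only String created per call
}

fun main() {
    println(joinWithPipes(listOf("a", 1, null)))
    println(joinWithPipes(listOf("x")))
}
```

Note that this is not re-entrant: if appending a part ended up calling `joinWithPipes` again on the same thread, the nested call would reset the builder the outer call is still using. That is one reason it does not directly solve the nested-object case.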

(Although the method is still called toString(), it is of course an overload, not overriding or related to the standard Any.toString(). I kept the name because it made sense, but you could of course pick a more meaningful name if there was one. Either way, there’s nothing stopping you also overriding Any.toString() with a human-readable string suitable for logging and debugging.)

Talking of logging, this is of course why many logging frameworks provide a way for you to define a log message in a more memory-friendly way, such as by giving a template and then the values to interpolate in it separately, or by giving a lambda — that way, it may not need to allocate any objects if the message isn’t actually used, and may be able to optimise creating it when it is. As you say, logging can be a major source of temporary objects in many apps, so every little helps!
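As an illustration of the lambda style, here is a sketch using java.util.logging, whose `Logger.fine` accepts a `Supplier<String>`. Other logging frameworks have their own equivalents. The counter and function names are invented for the demonstration.

```kotlin
import java.util.logging.Level
import java.util.logging.Logger

var renderCount = 0

// Stands in for an expensive message: builds a long string and counts how often it runs.
fun expensiveDescription(): String {
    renderCount++
    return (1..1_000).joinToString("|")
}

fun logState(logger: Logger) {
    // The lambda is only invoked if FINE is enabled for this logger.
    logger.fine { "state=${expensiveDescription()}" }
}

fun main() {
    val logger = Logger.getLogger("demo")
    logger.useParentHandlers = false // keep the demo quiet; we only care about the counter

    logger.level = Level.INFO
    logState(logger)
    println("renders at INFO: $renderCount")

    logger.level = Level.FINE
    logState(logger)
    println("renders at FINE: $renderCount")
}
```

A lambda that captures local variables may itself cost a small allocation per call, so for the hottest paths an explicit `isLoggable` check is the more conservative option.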

For what it’s worth, I’ve basically solved this (for now) by compiling native images using GraalVM. :slight_smile: It’s drastically cut down on memory usage. The apps are still idle for now, so I might run into memory issues when I start using them in anger, but I’m definitely pleased with what GraalVM has been able to achieve.

lol, how could I forget toString returns a damn String lol
How nice would it be if it actually returned something like CharSequence, but eh, mistakes were made.

(How would returning CharSequence help here? It’s simply a common interface that String, StringBuilder, etc. all implement — and it’s read-only. But even if it returned a mutable object such as a StringBuilder, how would that let you handle nested objects without allocating a new string object for each one?)

I use a lot of Chronicle utils, like buffers, Wire etc… and those implement CharSequence, which means I can log them without creating a String… In this case it would mean you can return “yourself” and be treated like a String, without creating any new objects. It’s perfect that CharSequence is read-only; it means all that logging could be done without forcing you to create an actual String.

The problem with String is that it’s a final class, and it’s a specific implementation of CharSequence… in 99% of cases what you really want is CharSequence… because you want to output things, or read things in text format… you don’t really specifically want String. Maybe they had a good reason to return String (I can’t think of a good one right off the bat), but in general it’s a bad idea to return a specific impl when you really could return the interface; you force everyone into that impl.

Imagine you return LinkedList instead of List or Collection on a super frequently used interface… good idea? Hardly.

Regarding nested objects, it’s simple… If your object’s purpose is to contain other objects, then you can find a way to use their own CharSequence to implement yours… it has a length() method and charAt(i), which is all you need…

At the very least you have options, while with String you are forced into creating objects…
Again, if you know a good reason to force everyone to do it, please let me know. I’ll think about it; maybe I’ll figure it out later.
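One way the nested-CharSequence idea could be sketched is a view that presents several parts joined by '|' without copying them. `PipeJoined` is a hypothetical class written for this illustration; it is not part of Chronicle or any other library.

```kotlin
// A read-only view over several CharSequences joined by '|', computed on demand.
class PipeJoined(private val parts: List<CharSequence>) : CharSequence {

    override val length: Int
        get() {
            if (parts.isEmpty()) return 0
            var total = parts.size - 1 // one separator between each pair of parts
            for (i in parts.indices) total += parts[i].length
            return total
        }

    override fun get(index: Int): Char {
        var remaining = index
        for (i in parts.indices) {
            val part = parts[i]
            if (remaining < part.length) return part[remaining]
            remaining -= part.length
            if (i < parts.size - 1) {
                if (remaining == 0) return '|'
                remaining--
            }
        }
        throw IndexOutOfBoundsException("index $index, length $length")
    }

    // Copies only when a caller explicitly asks for a slice or a String.
    override fun subSequence(startIndex: Int, endIndex: Int): CharSequence =
        StringBuilder(endIndex - startIndex).append(this, startIndex, endIndex)

    override fun toString(): String = StringBuilder(length).append(this).toString()
}

fun main() {
    val inner = PipeJoined(listOf("b", "c"))
    val outer = PipeJoined(listOf("a", inner, "d")) // parts can themselves be views
    val sink = StringBuilder()
    sink.append(outer) // reads through get(), no intermediate String per sub-object
    println(sink)
}
```

The trade-off is CPU for allocation: every `get` walks the parts, so reading the whole sequence costs more than copying from a flat String, and each view is itself a small object unless it is reused.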

Do you mean 99% of cases for toString(), or programming in general? I would say in 99% of cases we actually need a String, not a CharSequence. For toString() I would consider using CharSequence, although I’m still not sure this is the best choice. Anyway, I suspect toString() existed before CharSequence, in which case the change would be backward incompatible.
