Convert a CharArray or COpaquePointer to String without copying data

#1

I have a Java application that allocates a lot of char arrays, byte arrays, and String objects. A lot of these allocations could be avoided if I could create a char array from an arbitrary pointer address. Java, as it stands, does not allow you to do this, even with the Unsafe class. Normally operations like this are ill-advised, but these objects come from a read-only memory map, so the application will simply crash if I try to write to them.

I’ve been doing some research on Kotlin and see that it has operations for converting pointers to Strings, but it looks like these methods copy the underlying memory. I need a way to convert pointers to Strings without copying the underlying memory. Is this possible?

#2

I think you are confusing two different Kotlin compilation targets. You started by saying that you have a Java application. From that I assume that you are about to switch from Java to Kotlin, but you will probably still target the JVM (this is the default). This means that you can still use all your Java code.

The link you posted, however, is for Kotlin/Native. This is a different compilation target with some different capabilities but also limitations. You might be able to achieve what you described with Kotlin/Native; I’m not sure, as I don’t know enough about it. You won’t, however, be able to reuse your old Java code or use any Java libraries whatsoever. So unless you plan to rewrite your entire program in Kotlin (from scratch), I don’t think this will be possible.

#3

I am in the process of redesigning the application, so targeting Kotlin/Native is a possibility. I need to weigh the pros and cons, though.

If I choose to target native, will it have options for casting pointers to Strings without copying the data?

#4

If I understand this correctly, yes.

CPointer<ByteVar>.toKString()

should convert a pointer to a string without copying it.

Before you decide to switch to Kotlin/Native, though, you should consider this carefully. It’s still an experimental target for Kotlin. Although I’m pretty sure JetBrains will keep on developing it, there are far fewer libraries for it and you won’t be able to use any Java libraries.
Also, AFAIK debugging it is not as easy as you might think. IntelliJ does not support native debugging, and CLion (the alternative for it) doesn’t have a free version, and I’m not sure how good its support is. I personally don’t have much experience with Kotlin/Native and my last experiments with it were a year ago, so a lot could have changed with the tooling support.

#5

To do this in Java/JNI you will have to look at NIO and hardware buffers (memory-mapped I/O). That should support memory sharing. This, however, does not make the data available as a String. You could perhaps create an implementation of CharSequence backed by the memory-mapped buffer.

If you don’t actually have pointers, but just a lot of “shared” string backing data, you may just use CharSequence implementations to provide a string-like API without copying data.
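
A minimal sketch of what such a CharSequence could look like, assuming the mapped data is single-byte text (ISO-8859-1/ASCII). The class name BufferChars is made up for illustration; a MappedByteBuffer obtained from FileChannel.map is a ByteBuffer, so it could be passed in directly. Multi-byte encodings such as UTF-8 would need real decoding and are not handled here.

```java
import java.nio.ByteBuffer;

/** Hypothetical read-only view over single-byte (ISO-8859-1) text stored in a ByteBuffer. */
final class BufferChars implements CharSequence {
    private final ByteBuffer buf;  // e.g. a MappedByteBuffer; never written to
    private final int offset;      // absolute index of the first byte of this view
    private final int length;      // number of bytes (= chars, by assumption)

    BufferChars(ByteBuffer buf, int offset, int length) {
        if (offset < 0 || length < 0 || offset > buf.limit() - length) {
            throw new IndexOutOfBoundsException("offset=" + offset + ", length=" + length);
        }
        this.buf = buf;
        this.offset = offset;
        this.length = length;
    }

    @Override
    public int length() {
        return length;
    }

    @Override
    public char charAt(int index) {
        if (index < 0 || index >= length) {
            throw new IndexOutOfBoundsException("index=" + index);
        }
        // Absolute get: reads straight from the buffer, no intermediate array.
        return (char) (buf.get(offset + index) & 0xFF);
    }

    @Override
    public CharSequence subSequence(int start, int end) {
        if (start < 0 || end > length || start > end) {
            throw new IndexOutOfBoundsException("start=" + start + ", end=" + end);
        }
        // Another view over the same buffer; still no copy.
        return new BufferChars(buf, offset + start, end - start);
    }

    @Override
    public String toString() {
        // The only place where the bytes are copied into a real String.
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append(charAt(i));
        }
        return sb.toString();
    }
}
```

Something like `new BufferChars(mapped, 6, 5)` would then expose five bytes of the mapping as text, and only an explicit toString() call allocates a String.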

#6

CPointer<ByteVar>.toKString() converts from UTF-8 to UTF-16, so the result is a copy.
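
The same thing can be observed on the JVM. Here is a rough Java analogue (not Kotlin/Native code; the class and method names are made up for illustration): decoding UTF-8 bytes yields a String with its own storage, so overwriting the source bytes afterwards does not affect it.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class DecodeCopies {
    /** Decodes UTF-8 bytes, then overwrites the source to show the String is independent. */
    static String decodeThenClobber(byte[] utf8) {
        String decoded = new String(utf8, StandardCharsets.UTF_8); // transcodes into new storage
        Arrays.fill(utf8, (byte) '?');                             // mutate the source bytes
        return decoded;                                            // unaffected by the fill
    }

    public static void main(String[] args) {
        byte[] utf8 = "h\u00e9llo".getBytes(StandardCharsets.UTF_8);
        System.out.println(utf8.length);  // 6: the accented letter takes two bytes in UTF-8
        String s = decodeThenClobber(utf8);
        System.out.println(s.length());   // 5: length is counted in UTF-16 code units
    }
}
```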

#7

I’ve already been down that road and it’s essentially a dead end. Should anyone else go down the same path, here’s what you need to know:

  • Even if you remove all the NIO Buffer-to-String conversions, you still have to deal with the loss of built-in language functionality for Strings (like String switch statements).
  • Even if you accept the fact that you no longer have built-in String switch statements, you still have to find all the .equals occurrences, replace them with .isEqualTo calls, and retrain your other developers not to use .equals anymore.
  • Even if you get past that hurdle, you still have to convert those CharSequences to Strings if you use the built-in String append (the + operator). You have to find all those occurrences and replace them with StringBuffers.
  • Even if you accept your now-clunky-looking code, you STILL have to convert them to Strings because you’re most likely using a bunch of libraries in your project, and the Java ecosystem generally accepts Strings instead of CharSequences (and for good reason).
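
For anyone who wants to see these pitfalls concretely, here is a small sketch. The class and the sameText helper are hypothetical (sameText stands in for the kind of .isEqualTo method mentioned above), and a StringBuilder stands in for a buffer-backed CharSequence.

```java
final class CharSequencePitfalls {
    /** Content comparison that works for any pair of CharSequences (hypothetical helper). */
    static boolean sameText(CharSequence a, CharSequence b) {
        if (a.length() != b.length()) {
            return false;
        }
        for (int i = 0; i < a.length(); i++) {
            if (a.charAt(i) != b.charAt(i)) {
                return false;
            }
        }
        return true;
    }

    /** switch accepts String but not CharSequence, so the sequence has to be copied first. */
    static int dispatch(CharSequence command) {
        switch (command.toString()) {  // toString() allocates a String
            case "start": return 1;
            case "stop":  return 2;
            default:      return 0;
        }
    }

    public static void main(String[] args) {
        CharSequence cs = new StringBuilder("start");  // stand-in for a buffer-backed sequence
        System.out.println("start".equals(cs));        // false: String.equals only matches Strings
        System.out.println("start".contentEquals(cs)); // true
        System.out.println(sameText(cs, "start"));     // true
        System.out.println(dispatch(cs));              // 1
        String joined = "cmd=" + cs;                   // + calls toString() on cs, copying it
        System.out.println(joined);                    // cmd=start
    }
}
```

String.contentEquals covers the case where one side is already a String; a helper like sameText is only needed when both sides are arbitrary CharSequences.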