Since I can’t use the Java String functions,
I can’t find any function that deals with String code points in Kotlin common.
Any suggestion?
AFAIK there is currently no library that does that. There is an open issue for this here, but it has not been added to the stdlib because it would drastically increase the size of the library on JS.
Some other relevant issues are:
- https://youtrack.jetbrains.com/issue/KT-23251 Extend Unicode support in Kotlin common
- https://youtrack.jetbrains.com/issue/KT-24908 CodePoint inline class
- https://youtrack.jetbrains.com/issue/KT-30509 String codepoint iteration convenience
- https://youtrack.jetbrains.com/issue/KT-40289 Make codePointAt and Character.getType available for Native
- https://youtrack.jetbrains.com/issue/KT-38643 KMP Swift: Expose String.unicodeScalars
Those are the ones I know of; I might have missed a few.
Right now the best approach is to write your own small wrapper library that calls into platform-specific Unicode implementations. On the JVM side you can just use the Java API, and I’m sure there are libraries out there for JS and/or native (using C).
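A rough sketch of what such a wrapper might look like, shown for the JVM side only. The names `UnicodeSupport` and `JvmUnicodeSupport` are made up for illustration and do not come from any library. In a real multiplatform project the interface would more likely be an `expect` declaration with one `actual` per target.

```kotlin
// Hypothetical wrapper interface; the names are illustrative only.
interface UnicodeSupport {
    /** Returns the code point that starts at [index] in [text]. */
    fun codePointAt(text: String, index: Int): Int

    /** Returns the number of UTF-16 code units (1 or 2) used by [codePoint]. */
    fun charCount(codePoint: Int): Int
}

// JVM implementation that delegates to java.lang.Character.
object JvmUnicodeSupport : UnicodeSupport {
    override fun codePointAt(text: String, index: Int): Int =
        Character.codePointAt(text, index)

    override fun charCount(codePoint: Int): Int =
        Character.charCount(codePoint)
}

// Helper written only against the interface, so it does not depend on the platform.
fun UnicodeSupport.codePoints(text: String): List<Int> {
    val result = mutableListOf<Int>()
    var i = 0
    while (i < text.length) {
        val cp = codePointAt(text, i)
        result += cp
        i += charCount(cp)
    }
    return result
}
```

Code that only talks to `UnicodeSupport` stays the same when a JS or native implementation is added later.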
As @Wasabi375 mentioned, there is no standard library for it. I implemented it myself for the JVM, native and JS in my project. You’re welcome to use that code.
The entry points are defined here: array/charsets.kt (master branch of lokedhs/array on GitHub)
You can find the implementations in the respective modules.
Any updates on related features?
For anyone who wants a code point to String conversion, here is my snippet:
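fun codePointToString(codePoint: Int): String {
    require(codePoint in 0..0x10FFFF) { "Invalid code point: $codePoint" }
    // Anything below 0x10000 fits in a single UTF-16 code unit.
    // A lone surrogate (0xD800..0xDFFF) is passed through unchanged.
    if (codePoint < 0x10000) {
        return codePoint.toChar().toString()
    }
    // Supplementary code point: split the 20-bit offset into a surrogate pair.
    val offset = codePoint - 0x10000
    val chars = CharArray(2)
    chars[0] = ((offset shr 10) + 0xD800).toChar()    // high surrogate: upper 10 bits
    chars[1] = ((offset and 0x3FF) + 0xDC00).toChar() // low surrogate: lower 10 bits
    return chars.concatToString()
}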
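For the opposite direction (String to code points), something along these lines should work in common code, since `isHighSurrogate()` and `isLowSurrogate()` are part of the common stdlib. This is only a sketch: `codePointAtCommon` and `codePointList` are names I made up, and an unpaired surrogate is returned as-is instead of being rejected.

```kotlin
// Sketch only: codePointAtCommon and codePointList are hypothetical names.
// Decodes the code point that starts at [index] using common stdlib calls.
fun String.codePointAtCommon(index: Int): Int {
    val high = this[index]
    if (high.isHighSurrogate() && index + 1 < length) {
        val low = this[index + 1]
        if (low.isLowSurrogate()) {
            // Combine the two 10-bit halves and add back the 0x10000 offset.
            return 0x10000 + ((high.code - 0xD800) shl 10) + (low.code - 0xDC00)
        }
    }
    // BMP character, or an unpaired surrogate returned as-is.
    return high.code
}

// Collects all code points of the string in order.
fun String.codePointList(): List<Int> {
    val result = mutableListOf<Int>()
    var i = 0
    while (i < length) {
        val cp = codePointAtCommon(i)
        result.add(cp)
        // Supplementary code points occupy two chars, everything else one.
        i += if (cp >= 0x10000) 2 else 1
    }
    return result
}
```

Note that `Char.code` needs Kotlin 1.5 or newer; on older versions `toInt()` would be the equivalent.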