Two related but different questions:
- Is there a standard way to represent a Unicode character in Kotlin? The type `Char` does not represent a Unicode character, but rather a UTF-16 code unit, which can be either a Unicode character from the BMP (which is only a subset of all Unicode characters) or one half of a UTF-16 surrogate pair (which is not a Unicode character at all).
- Is there a way to query the Unicode properties (such as category) of arbitrary Unicode characters? The extension property `Char.category` from `kotlin.text`, for example, only works for `Char`s, so it does not apply to characters outside the BMP.
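
To make the problem concrete, here is a minimal sketch, assuming the JVM target. It shows that indexing a string yields surrogate halves for a character outside the BMP, and that the only workaround I am aware of goes through `java.lang.Character` with an `Int` code point. The helper name `categoryOfCodePoint` is made up for illustration and is not part of the standard library. Whether an equivalent exists for non-JVM targets is part of what I am asking.

```kotlin
// Hypothetical helper (not a stdlib function): maps an Int code point to a
// CharCategory via java.lang.Character.getType(int). JVM only.
fun categoryOfCodePoint(codePoint: Int): CharCategory =
    CharCategory.valueOf(Character.getType(codePoint))

fun main() {
    // U+1D538 MATHEMATICAL DOUBLE-STRUCK CAPITAL A, written as its surrogate pair.
    val s = "\uD835\uDD38"

    println(s.length)                       // 2: two UTF-16 code units
    println(s.codePointCount(0, s.length))  // 1: one Unicode code point
    println(s[0].category)                  // SURROGATE: the Char is only half of the pair

    val codePoint = s.codePointAt(0)        // 0x1D538 as an Int
    println(categoryOfCodePoint(codePoint)) // UPPERCASE_LETTER
}
```

This works, but it represents the character as a bare `Int` instead of a dedicated type, which is exactly the gap the two questions above are about.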