Convert a String into right charset or detect String's charset

vhodek · December 15, 2020, 10:07am

I would definitely recommend always using UTF-8 or another Unicode encoding that can also represent characters missing from other character sets, because sooner or later you run into “mostly Latin but with a short Arabic sentence” or something similar.
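
To illustrate the point, here is a minimal sketch using only the JDK. The class and method names (`Utf8RoundTrip`, `roundTrip`, `canEncode`) are made up for this example. It checks whether a mixed Latin/Arabic string survives encoding and decoding in UTF-8 versus ISO-8859-1.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class Utf8RoundTrip {
    // Hypothetical helper: encode the text with the charset, then decode it back.
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    // Hypothetical helper: true if every character of the text can be encoded in the charset.
    static boolean canEncode(String text, Charset charset) {
        return charset.newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        // "Hello" followed by a five-letter Arabic word, written with Unicode escapes.
        String mixed = "Hello \u0645\u0631\u062D\u0628\u0627";

        System.out.println(canEncode(mixed, StandardCharsets.UTF_8));       // true
        System.out.println(canEncode(mixed, StandardCharsets.ISO_8859_1));  // false

        // UTF-8 gives back the original text.
        System.out.println(roundTrip(mixed, StandardCharsets.UTF_8).equals(mixed)); // true

        // getBytes silently replaces unmappable characters, so the Arabic part is lost.
        System.out.println(roundTrip(mixed, StandardCharsets.ISO_8859_1)); // Hello ?????
    }
}
```

Note that `String.getBytes` does not throw on unmappable characters; it substitutes the charset's replacement byte, which is why this kind of data loss tends to go unnoticed.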

However, the world is not perfect, and I have been in a similar situation myself:

Apache Tika can do a lot of things: CharsetDetector (Apache Tika 1.3 API)
I would recommend, based on my own experience: Google Code Archive - Long-term storage for Google Code Project Hosting.
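
If pulling in a library is not an option, a much cruder fallback can be sketched with the JDK alone. This is not how the detectors above work (they use statistical heuristics); it only tries a list of candidate charsets in order and keeps the first one that decodes the bytes without error. The names `StrictDecodeGuess` and `firstValidCharset` are assumptions made up for this example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

class StrictDecodeGuess {
    // Hypothetical helper: return the first candidate that decodes the bytes strictly.
    static Optional<Charset> firstValidCharset(byte[] data, List<Charset> candidates) {
        for (Charset candidate : candidates) {
            try {
                candidate.newDecoder()
                        // Fail instead of silently inserting replacement characters.
                        .onMalformedInput(CodingErrorAction.REPORT)
                        .onUnmappableCharacter(CodingErrorAction.REPORT)
                        .decode(ByteBuffer.wrap(data));
                return Optional.of(candidate);
            } catch (CharacterCodingException e) {
                // The bytes are not valid in this charset; try the next candidate.
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Order matters: ISO-8859-1 accepts any byte sequence, so it must come last.
        List<Charset> candidates =
                Arrays.asList(StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);

        byte[] utf8Bytes = "caf\u00E9".getBytes(StandardCharsets.UTF_8);
        byte[] latin1Bytes = {0x63, 0x61, 0x66, (byte) 0xE9}; // "café" in ISO-8859-1

        System.out.println(firstValidCharset(utf8Bytes, candidates));   // Optional[UTF-8]
        System.out.println(firstValidCharset(latin1Bytes, candidates)); // Optional[ISO-8859-1]
    }
}
```

In practice this only separates "valid UTF-8" from "something else". Single-byte charsets accept almost any input, so telling legacy encodings apart still needs a real detector.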
