Fatal Exception: java.lang.OutOfMemoryError
Failed to allocate a 33554448 byte allocation with 6291456 free bytes and 18MB until OOM, target footprint 37003824, growth limit 50331648
The JSON can't be parsed in a stream:
kotlinx.serialization.json.internal.JsonDecodingException: Unexpected JSON token at offset 8046: Expected quotation mark '"', but had 'e' instead
As far as I can see, decodeFromStream is a built-in method of Json. So how do I use it properly?
I don't see how using an InputStream would be any better here. How big is the JSON? If it is really big, e.g. hundreds of megabytes, then you may need to process one item at a time.
My app's minSDK is 21, so it may run on a lot of low-end Android devices. The error above came from an autotest on a device with these parameters:
RAM free: 1.21 GB
Disk free: 630.43 MB
And my biggest JSON is about 30 MB for now. Is it possible to process it partially, without splitting it into smaller files? What about other formats, like YAML etc.?
Yes, it is possible, but it is usually less convenient and requires more work than simply mapping to an object. You need to look for a "streaming parser"; for example, Jackson supports this: https://www.baeldung.com/jackson-streaming-api . Maybe it would be possible to stream subsequent JSON objects one at a time while mapping them to a class, but I don't know, I never tried that. Also, it would probably make sense not to store the JSON string itself in memory, but to parse it while reading from a file or the network.
Before trying a streaming parser, you could also try parsing into a native JsonArray/JsonObject or something similar, not into your own class. Maybe it is lighter on memory, e.g. it might only keep pointers into the JSON string.
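To make the suggestion concrete, here is a minimal sketch of parsing into the generic tree model (JsonElement/JsonArray/JsonObject) instead of a user-defined class. It assumes kotlinx-serialization-json is on the classpath; the sample JSON mirrors the shape shown later in this thread:

```kotlin
import kotlinx.serialization.json.Json
import kotlinx.serialization.json.jsonArray
import kotlinx.serialization.json.jsonObject
import kotlinx.serialization.json.jsonPrimitive

fun main() {
    // Parse into the generic JsonElement tree instead of mapping to your own class.
    val root = Json.parseToJsonElement(
        """[{"id":"a1","name":"first"},{"id":"a2","name":"second"}]"""
    )
    // Walk the tree without ever declaring a @Serializable data class.
    for (element in root.jsonArray) {
        println(element.jsonObject["name"]!!.jsonPrimitive.content)
    }
}
```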
Does that mean Kotlin's serialization lacks such functionality? What does decodeFromStream do then? My file is a compressed JSON, so I need to decompress it first; that's why it has to be stored in memory anyway. Yes, it's possible to save it into a temp file and free some memory before deserializing, but is it worth it?
Since you're producing a list, I'm assuming that the JSON you're parsing is an array. If so, I believe Json.decodeBufferedSourceToSequence is what you're looking for (you'd have to use a minimal amount of okio to create a buffered source, but that's trivial). From the docs:
Transforms the given source into lazily deserialized sequence of elements of type T using UTF-8 encoding and deserializer. Unlike decodeFromBufferedSource, source is allowed to have more than one element, separated as format declares.
Elements must all be of type T. Elements are parsed lazily when resulting Sequence is evaluated. Resulting sequence is tied to the stream and can be evaluated only once.
I suppose they meant that you should look into the okio documentation and figure it out yourself. Come on, it takes 5 minutes of reading to find out how to do it.
And okio works with both network and files, so it should probably fit here.
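A minimal sketch of what this could look like, assuming the kotlinx-serialization-json-okio artifact and okio are on the classpath; Item is a made-up element class standing in for whatever your JSON array actually contains:

```kotlin
import java.io.File
import kotlinx.serialization.ExperimentalSerializationApi
import kotlinx.serialization.Serializable
import kotlinx.serialization.json.Json
import kotlinx.serialization.json.okio.decodeBufferedSourceToSequence
import okio.buffer
import okio.source

// Hypothetical element type for illustration.
@Serializable
data class Item(val id: String, val name: String)

@OptIn(ExperimentalSerializationApi::class)
fun countItems(file: File): Int =
    // source().buffer() wraps the file in an okio BufferedSource; okio is plain IO,
    // nothing to do with HTTP. Elements are deserialized lazily as count() consumes them.
    file.source().buffer().use { source ->
        Json.decodeBufferedSourceToSequence<Item>(source).count()
    }
```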
How is it compressed? You can decompress zip and gzip formats while streaming (using ZipInputStream and GZIPInputStream), without storing either the compressed or the decompressed data fully in memory.
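For the gzip case, a sketch of streaming decompression with only JDK classes. The data moves through a small fixed-size buffer, so neither the compressed nor the decompressed payload is ever fully in memory:

```kotlin
import java.io.File
import java.io.FileInputStream
import java.io.FileOutputStream
import java.util.zip.GZIPInputStream

// Decompress a gzip file to another file in fixed-size chunks.
fun gunzipTo(input: File, output: File) {
    GZIPInputStream(FileInputStream(input)).use { gz ->
        FileOutputStream(output).use { out ->
            val buffer = ByteArray(8 * 1024)
            while (true) {
                // read() inflates at most one buffer's worth of data at a time.
                val n = gz.read(buffer)
                if (n < 0) break
                out.write(buffer, 0, n)
            }
        }
    }
}
```

The same shape works for raw deflate by swapping in InflaterInputStream.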
Yes, I looked into the manual but was not sure whether it was about HTTP requests, because of the okio lib. kotlinx.serialization has decodeToSequence; is it useless here?
I use Inflater/Deflater. The problem is not with the decompression but with the deserialization of the final JSON string. I tried to deserialize it buffered, like in the first post, but the problem is that the JSON format becomes broken when you cut off a random number of chars as a buffer.
Currently I changed the input from String to ByteArray; maybe it'll consume a little less memory.
Ahh, right, since kotlinx.serialization has a similar decodeToSequence function, I guess this is the way to go; we don't need okio.
Regarding the compression: you can't just cut the data into random pieces and hope it will work. You need to decompress by streaming. I believe Inflater can't do that; you need to use InflaterInputStream or other similar utils (as mentioned by @gidds), depending on your compression format.
I hadn't noticed before that decodeToSequence existed. That's the way to go then, since it clearly uses the same mechanism as decodeBufferedSourceToSequence. Of course, if you ever want to go multiplatform, okio would be the way to go. Okio is just a general IO library, so it has nothing to do with HTTP or networking.
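For reference, a sketch of the okio-free variant, assuming kotlinx-serialization-json 1.4+ on the JVM; Item is again a made-up element class:

```kotlin
import java.io.ByteArrayInputStream
import kotlinx.serialization.ExperimentalSerializationApi
import kotlinx.serialization.Serializable
import kotlinx.serialization.json.Json
import kotlinx.serialization.json.decodeToSequence

// Hypothetical element type for illustration.
@Serializable
data class Item(val id: String, val name: String)

@OptIn(ExperimentalSerializationApi::class)
fun main() {
    val input = """[{"id":"a1","name":"first"},{"id":"a2","name":"second"}]"""
    // decodeToSequence reads straight from the InputStream and, with the default
    // DecodeSequenceMode.AUTO_DETECT, iterates a top-level JSON array lazily.
    val items: Sequence<Item> = Json.decodeToSequence(ByteArrayInputStream(input.toByteArray()))
    items.forEach { println(it.name) }
}
```

In real code the ByteArrayInputStream would be replaced by a FileInputStream (possibly wrapped in a decompressing stream).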
I already do that. But it is still not possible to decode a random chunk. I thought maybe decodeToSequence / decodeBufferedSourceToSequence could do it?
fun decompress(bytes: ByteArray): ByteArrayOutputStream {
    val inflater = Inflater().apply { setInput(bytes) }
    val output = ByteArrayOutputStream()
    val buffer = ByteArray(1024)
    while (!inflater.finished()) {
        val count = inflater.inflate(buffer)
        output.write(buffer, 0, count)
    }
    inflater.end()
    return output
}
Sorry, I think I'm incompatible with you. I tell you that you can't pick random chunks and can't use Inflater, and that instead you should stream using InflaterInputStream. You say you already do this, and then you show sample code where you… use Inflater, and you say something about picking random chunks.
Also, if I read the above code correctly, it is not really processing chunk by chunk; it decompresses the whole file into memory before deserializing it.
It provides a sequence of items. You can process them one by one, and it will read the data straight from the file, decompressing on the fly while you consume items. It should keep a very low memory profile, assuming you don't accumulate the items in memory, and assuming there is a big number of small items rather than a small number of huge items.
For this to work, the file has to be just JSON compressed with the deflate algorithm; it won't work with zip, gzip, etc. I mention this because storing deflated files isn't a usual way to store data.
Of course, the above example is just an example - you should close streams, etc.
Alternatively, you can read the data in chunks, but then you would have to design some kind of chunked file format and chunk properly while writing the file.
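A sketch of what such a chunked format could look like, using only JDK streams. The format is entirely made up for illustration: each record is a 4-byte size followed by that many bytes (for example, one independently compressed JSON chunk per record):

```kotlin
import java.io.DataInputStream
import java.io.DataOutputStream
import java.io.File
import java.io.FileInputStream
import java.io.FileOutputStream

// Write records as: 4-byte big-endian length, then the payload bytes.
fun writeChunks(file: File, chunks: List<ByteArray>) {
    DataOutputStream(FileOutputStream(file)).use { out ->
        for (chunk in chunks) {
            out.writeInt(chunk.size)
            out.write(chunk)
        }
    }
}

// Read the records back; a real consumer would process each chunk as it is
// read instead of collecting them all into a list.
fun readChunks(file: File): List<ByteArray> {
    val result = mutableListOf<ByteArray>()
    DataInputStream(FileInputStream(file)).use { input ->
        while (input.available() > 0) {
            val buf = ByteArray(input.readInt())
            input.readFully(buf)
            result.add(buf)
        }
    }
    return result
}
```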
I didn't remember that InflaterInputStream exists. But I found that I also used this before:
fun decompress(bytes: ByteArray): ByteArrayOutputStream {
val os = ByteArrayOutputStream()
InflaterOutputStream(os).use { it.write(bytes) }
return os
}
Anyway, it still can't parse on the fly:
@OptIn(ExperimentalSerializationApi::class)
suspend inline fun <reified T> getItems(filename: String, dir: String): Sequence<List<T>>? {
val file = getItemsFile(filename, dir)
return file?.let { json.decodeToSequence<List<T>>(decompressStreamed(it)) }
}
fun decompressStreamed(file: File) = InflaterInputStream(FileInputStream(file))
kotlinx.serialization.json.internal.JsonDecodingException: Unexpected JSON token at offset 0: Expected start of the array '[', but had '[' instead at path: $
JSON input: [{"id":"XveARg0A","name":"Dijo.....
The method should work somehow, otherwise it would not have been implemented, right? Maybe I need to change how I use the sequence? Currently I do a simple conversion:
According to the JSON we can see, you don't have a list of lists, but just a list. So you should not use Sequence<List<T>>, but just Sequence<T>. Similarly: decodeToSequence<T>.
You can't do toList(). It defeats the purpose of what we are trying to do. We use all these streams and sequences specifically so that we don't have to keep a list of all items, because we can't keep them all in memory at the same time. By using a sequence, you can process items one by one, for example with forEach(). But again, you can't use forEach() to e.g. add items into a mutable list, because you can't hold them all in memory.
But the good news is that the decompression at least seems to work fine, because we can see correct JSON content.
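To make the "process one by one" point concrete, a sketch that consumes the sequence without ever building a list, assuming kotlinx-serialization-json and a made-up Item class; only a single Item is alive at any moment:

```kotlin
import java.io.InputStream
import kotlinx.serialization.ExperimentalSerializationApi
import kotlinx.serialization.Serializable
import kotlinx.serialization.json.Json
import kotlinx.serialization.json.decodeToSequence

// Hypothetical element type for illustration.
@Serializable
data class Item(val id: String, val name: String)

// Aggregate while streaming: nothing is accumulated into a collection,
// so memory stays flat regardless of how many items the JSON array holds.
@OptIn(ExperimentalSerializationApi::class)
fun countLongNames(stream: InputStream): Int {
    var count = 0
    Json.decodeToSequence<Item>(stream).forEach { item ->
        if (item.name.length > 3) count++
    }
    return count
}
```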
Yes, the extra List was added by mistake. No problems with decompression have shown up since I started to use a buffer: val buffer = ByteArray(1024). The problem is in the deserialization method. I have stored all the items in memory after decompression completed, many times, and everything was fine.
So, if I can't emit the items as a list, it looks like Sequence is unnecessary here. It could consume extra resources instead, and I also started to get
java.io.IOException: Stream closed
at java.util.zip.InflaterInputStream.ensureOpen(InflaterInputStream.java:84)
using it this way:
fun decompressStreamed(file: File) = InflaterInputStream(FileInputStream(file)).use { it }