ByteStream, BinaryStream | .toBytes()

I was once working with packets and the like, and I needed a way to encode my values into a BinaryStream. After some research and frustration I could not find any standard library that contains a BinaryStream or something similar. A ByteArray would not be a solution, because of performance issues and too much allocation. So I decided to use a list, even though there should normally be better options.

Then I started to think about how to convert my Ints, Floats, Longs, etc. to simple byte arrays. Other languages provide this with ease, but Kotlin does not have anything like that by default. And writing an API especially for this that converts all of these is just annoying and really shouldn’t be necessary.

What I want to see from the Kotlin team:

  • BinaryStream / ByteStream as standard that is fully optimised for that
  • Easy conversion to ByteArray from any native type

And no, I really don’t want to use ANY of those dumb java APIs.

The Kotlin stdlib does not provide them because there is no point in duplicating the Java stdlib, which is great and already has these utils (well, actually, sometimes it makes sense to duplicate, to be multiplatform). If you don’t like these “dumb java APIs”, then go and write your own Kotlin wrappers around them.

But if you would like to use Java utils, then I suggest ByteBuffer or DataOutputStream with ByteArrayOutputStream.
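A minimal sketch of the DataOutputStream plus ByteArrayOutputStream combination suggested above. The function name `encodePacket` and the packet layout (an Int, a Float, a Long) are made up for illustration; only the java.io classes are real.

```kotlin
import java.io.ByteArrayOutputStream
import java.io.DataOutputStream

// Hypothetical helper: encodes a small packet as id (Int), value (Float), timestamp (Long).
fun encodePacket(id: Int, value: Float, timestamp: Long): ByteArray {
    val bytes = ByteArrayOutputStream()
    // DataOutputStream always writes multi-byte values in big-endian order.
    DataOutputStream(bytes).use { out ->
        out.writeInt(id)         // 4 bytes
        out.writeFloat(value)    // 4 bytes (IEEE 754 bit pattern)
        out.writeLong(timestamp) // 8 bytes
    }
    // Copies the bytes written so far into a fresh ByteArray.
    return bytes.toByteArray()
}
```

The stream grows its internal buffer as needed, so the only ByteArray allocated for the caller is the final copy.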

Ahh, you can also try Okio.

At the very least I would like to see a .toBytes extension method, or some standard APIs that can handle that.

Ahh, yes, I agree. I’m surprised there is no simple e.g. Int.toByteArray() function in both Java and Kotlin stdlib. I often implement it by myself.
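For illustration, here is one way such a hand-written extension could look. Both functions are hypothetical (neither exists in the Kotlin or Java stdlib), and the choice of big-endian byte order is an assumption.

```kotlin
import java.nio.ByteBuffer

// Hypothetical extension (not part of any stdlib): big-endian bytes of an Int.
// ByteBuffer defaults to big-endian order.
fun Int.toByteArray(): ByteArray =
    ByteBuffer.allocate(Int.SIZE_BYTES).putInt(this).array()

// Same result without java.nio, using shifts; byte 0 is the most significant byte.
fun Int.toByteArrayShifted(): ByteArray =
    ByteArray(Int.SIZE_BYTES) { i -> (this ushr (8 * (Int.SIZE_BYTES - 1 - i))).toByte() }
```

The shift-based variant avoids the JVM-only ByteBuffer, so it should also work in multiplatform code. Both allocate a new array on every call.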

The problem is that in this language doing it on your own is really annoying (compared to C++ or C#) and really inefficient.

Well, if writing a single-line extension function and then using it everywhere would be “really annoying” to me, then I would probably have to stop being a developer :wink: There are things 100 times more annoying than that on a daily basis.

Sounds like you are for some reason forced to use Kotlin and now you are dissatisfied that it is not the language you are used to.

No, the problem is that this should work out of the box, like in any other good language. I really enjoy Kotlin, I enjoy its syntax and features, but those things make it really frustrating. Also, as mentioned before, Kotlin sadly cannot convert things to a byte array very well compared to C++. In C++ you have pointers and you can reinterpret_cast anything, while Kotlin doesn’t allow you to do that, which makes writing those functions really annoying and inefficient. And if the language tries to prevent this, then it should at least provide an alternative, such as standard libraries like C# has.

Probably because Kotlin isn’t as low-level a language as C++ is? Kotlin grew out of the JVM and even if it has support for native, the JVM is still its main target.

And speaking of the JVM… it does not have pointers, does not have multidimensional arrays, jumps to even static/private functions are indirect, it can’t allocate uninitialized data structures, and it can’t put any objects on the stack, only on the heap. Each allocated object takes additional memory for its headers; even simple strings, arrays and e.g. nullable integers aren’t “just data”, but have some headers. All reads/writes to arrays are slowed down by an added bounds check, etc. And on top of that there is a garbage collector that freezes your whole application from time to time. Don’t you think that the inability to just cast an int to a byte array is really a minor problem compared to all of the above? :slight_smile:

The memory representation of any data in Java/Kotlin is considered an implementation detail; the developer should not really care. Converting simple types to their binary form is very rare in these languages. The stdlib still provides support for such operations, but it is placed somewhere in the I/O library, not front and center.

P.S. AFAIK, for many, many years Python and JavaScript couldn’t run concurrent code and JavaScript had no way to process binary data - how bad is this? And still they’re among the most popular languages in the world. Languages are just tools, they do what they were designed to do.

I know what you mean, but I simply cannot imagine that the JVM does not have any kind of pointers. Pointers are fundamental in every single programming language (there are no excuses) and they are everywhere. Even references are pointers, and I simply cannot imagine that the JVM is that bad. C# is a good example of what a high-level language should look like: it can do everything, even as a high-level language. And if C# already did that, then other languages can too. Also, please do not bring up JavaScript; JavaScript isn’t as good a language and you really can’t compare it to Kotlin, as Kotlin is way, way better. But I really hope that Kotlin Native will open up a lot more opportunities and that it will become the standard. I read that Kotlin Native is slower, and I know that making a new compiler from scratch is hard, but I also know that with such a big company and such good engineers you can do anything, even a native compiler that is way faster than any JVM. The JVM can always be beaten, as other languages like C# and Dart have shown, and I really hope that Kotlin on the JVM will become an alternative rather than the default.

Unfortunately, I’m not familiar enough with C# to compare it to Java in such detail. I always considered these languages very similar, because they were designed to serve a similar role. Now I see that there are in fact great differences between them. Don’t take my word for it, but I believe the main priorities for Java were always reliability, being error-proof and being multiplatform, and many design decisions depended on them.

In Java it is impossible to access unallocated memory, even intentionally, or to accidentally access a different location in memory than intended. It is impossible to leak memory or to mess anything up in a way that crashes the VM. I may be missing something, but I don’t see how you could meet such requirements while having pointers in the language.

For a long time C# ran on a single OS and mostly on a single CPU architecture. Java ran on everything from the very beginning: Windows, Linux, x86, ARM. It has run on mobile phones since around 1998 - long before Android/iOS were created. There are microcontrollers capable of running Java bytecode. Each platform can work very differently; for example, the JVM passes function parameters on the stack, while the Android implementation passes them in virtual registers. And still, Java has a requirement: write once, and it will just work. For example, I see that until recently the way to convert ints to byte arrays in C# was to use BitConverter. This util is soooo-not-Java-ish. For a Java developer it is ridiculous that some function works differently depending on the CPU architecture.

So, I think that while these two languages are pretty similar, they really differ in priorities and in what developers expect from them. This difference results in different design choices and in leaving out some features known from other languages. Kotlin is partially dependent on Java’s design, but even if it weren’t, I guess it would still not have pointers - similarly to e.g. Python. They are just too error-prone and too low-level. They could be added in the future as a feature specific to native targets.

If I’m not mistaken, the Kotlin team is working on an extension for IO operations. But I think it is not their current priority, as they’re working on a lot of other features too.

You can find a buffer API in it, with some operations to add number values (int, float, etc.) into a buffer: see PrimitivesOperations.common.kt on the master branch of the Kotlin/kotlinx-io repository on GitHub.

I don’t know if this library is stable enough. I think that Ktor contributors could provide better insight into it (because they do a lot of IO operations).

Maybe I’m wrong, but I’d say that conversions to byte values are most likely to happen in a buffering / streaming context, and in such cases, you’ll need a buffer/stream object anyway. Also, buffer APIs (generally) manage the endianness configuration.

Side note about pointers:

I learned programming with C/C++. But today, I do not want to see a pointer any more. They are a low-level memory-management tool that I consider (personal opinion here) too complex and too tricky to allow safe modeling of high-level programs. The Java memory model avoids both buffer overflows and invalid memory access. For example, I don’t see how null-safety would work with pointers.

Moreover, pointers are hardly compatible with a garbage collection approach. Let’s take Java for example:

  1. Its garbage collector can move objects (including arrays) at any point in time (from the nursery to the survivor space, for example). Users do not have to care about that. As long as you hold a reference to an object, you can continue using it safely, even if the underlying pointer to the memory area has changed at some point.
  2. If you want pointers in Java, you have to use direct buffers to model a contiguous byte array, but that’s all. And for this kind of buffer, the memory lives outside the garbage-collected heap, so it’s a powerful (for optimisation purposes) but dangerous tool.
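A small sketch of the direct buffer mentioned in point 2. The function name `directBufferRoundTrip`, the 16-byte capacity and the little-endian order are arbitrary choices for illustration. The buffer object itself is still an ordinary garbage-collected object; only the memory it points to is allocated outside the Java heap.

```kotlin
import java.nio.ByteBuffer
import java.nio.ByteOrder

// Sketch: writes an Int into off-heap memory and reads it back.
fun directBufferRoundTrip(value: Int): Int {
    // allocateDirect() reserves a contiguous block outside the Java heap.
    val buffer = ByteBuffer.allocateDirect(16).order(ByteOrder.LITTLE_ENDIAN)
    buffer.putInt(0, value) // absolute write at byte offset 0
    return buffer.getInt(0) // absolute read from the same offset
}
```

Out-of-range offsets still throw an exception here, so this is closer to a checked array than to a raw C pointer.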

I am not saying that pointers are better than normal references, I am just saying that having such options is always better. When working with unsafe code in C#, you need to enable the unsafe code compiler option and you need to do everything in an unsafe scope. In many cases this isn’t really necessary, but sometimes it just makes life much, much easier. Pointers would be a great addition to Kotlin that would give users AN OPTION to use them (in an unsafe scope, like in C#). Also, thank you for the references, this will probably help me a lot.

Right.

Say what?!

Why on earth do you need conversion to ByteArray when it’s terribly inefficient due to allocation?

I work with binary packets a lot. I really miss a kind of BinaryOutputStream (and a BinaryInputStream), so I created my own classes for that purpose, but I see no point in serializing primitives to byte arrays. My BinaryWriter simply has writeInt(), writeDouble() and so on…

The only kind of toByteArray() function I’d like to see on primitives is one accepting an external buffer, like toByteArray(ByteArray, offset, endianness). Now that would be useful so I don’t have to implement all that stuff myself or delegate to something like DataOutputStream that doesn’t even support endianness.

ByteBuffer already does exactly the same thing, though, and it does support endianness and it can be wrapped around an existing byte array, so it’s really more or less a non-issue.
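As a rough sketch of that point, a helper in the spirit of toByteArray(ByteArray, offset, endianness) can be built on ByteBuffer.wrap. The name `writeInt` and its parameter list are hypothetical; the ByteBuffer and ByteOrder calls are the real java.nio API.

```kotlin
import java.nio.ByteBuffer
import java.nio.ByteOrder

// Hypothetical helper: writes `value` into an existing array at `offset`,
// in the requested byte order, without allocating a new ByteArray.
fun writeInt(
    target: ByteArray,
    offset: Int,
    value: Int,
    order: ByteOrder = ByteOrder.BIG_ENDIAN
) {
    // wrap() creates a view over `target`; no bytes are copied.
    // putInt(index, value) is an absolute write that ignores the buffer position.
    ByteBuffer.wrap(target).order(order).putInt(offset, value)
}
```

Each call still creates a small ByteBuffer wrapper object. Code that writes many values would more likely wrap the array once and reuse the buffer.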

It’s like String and StringBuilder. Adding many chars to a string is very inefficient; doing the same with a StringBuilder is very efficient. In the end you call .toString() and the whole StringBuilder thingy is now a string. That’s what I mean with BinaryStream and ByteArray.

By comparing it to StringBuilder, do you mean that concatenating byte arrays is not efficient? Yes, you are right. But why do this if you can allocate a bigger byte array and just write whatever you need into it? Even if there were a binary stream, it would most probably use ByteArray internally.

So it’s not a way to directly convert a primitive to a ByteArray, but rather some sort of ByteArrayBuilder?

I usually use a combination of BinaryWriter (my own) and a ByteArrayOutputStream (actually my own wrapper around it) for that. I’m thinking of making it a wrapper around FastByteArrayOutputStream (from fastutil) instead, though, because ByteArrayOutputStream is synchronized for whatever crazy reason.

Do you even know what StringBuilder is? Do you even know what it is being used for? If not then google it :person_facepalming:

Yes, I know what StringBuilder is. It is basically a wrapper around CharArray that makes it easy to construct strings using the technique I just described to you:

allocate bigger byte array and just write to it whatever you need

Why do you ask? ByteStream, BinaryStream, ByteBuilder or whatever you would call it, would do exactly the same, but using ByteArray internally.

ByteArray isn’t inefficient itself. There are efficient and inefficient ways of using it.
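To make the StringBuilder analogy concrete, here is a minimal sketch of a growable builder backed by a ByteArray. The class `ByteArrayBuilder` and its methods are hypothetical, and the doubling growth policy and big-endian order are assumptions, not a description of any existing library.

```kotlin
// Hypothetical ByteArrayBuilder: a growable ByteArray, analogous to StringBuilder.
class ByteArrayBuilder(initialCapacity: Int = 16) {
    private var buffer = ByteArray(maxOf(initialCapacity, 1))

    // Number of bytes written so far (not the capacity of the backing array).
    var size = 0
        private set

    // Grows the backing array (at least doubling it) when it is too small,
    // so repeated appends cost amortized O(1) each.
    private fun ensureCapacity(extra: Int) {
        val needed = size + extra
        if (needed > buffer.size) {
            buffer = buffer.copyOf(maxOf(needed, buffer.size * 2))
        }
    }

    // Appends the low 8 bits of `value`.
    fun writeByte(value: Int): ByteArrayBuilder {
        ensureCapacity(1)
        buffer[size++] = value.toByte()
        return this
    }

    // Appends an Int as four bytes, most significant byte first (big-endian).
    fun writeInt(value: Int): ByteArrayBuilder {
        ensureCapacity(4)
        for (shift in 24 downTo 0 step 8) {
            buffer[size++] = (value ushr shift).toByte()
        }
        return this
    }

    // Copies only the bytes written so far, like StringBuilder.toString().
    fun toByteArray(): ByteArray = buffer.copyOf(size)
}
```

A call chain such as `ByteArrayBuilder().writeByte(0x7F).writeInt(42).toByteArray()` allocates one result array at the end, plus the occasional regrowth of the backing array.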

edit:
Actually, it seems StringBuilder uses ByteArray internally, not CharArray. It used CharArray in the past.
