Recover most and least significant bits in a Byte


#1

Say I have this byte: 1011 1100. Now say I want to split it this way

  • 0000 1011
  • 0000 1100
And most importantly, how do I specify a Byte bit by bit? I could achieve the split into msb and lsb in a breeze if I could do something like val msb = '1011 1100' and '1111 0000' ushr 4

#2
import kotlin.experimental.and

fun main() {
    //sampleStart
    val b: Byte = "01001100".toByte(2)

    val msb: Byte = b.toInt().ushr(4).toByte()
    val lsb: Byte = b and 15
    //sampleEnd
    
    println(b.toString(2).padStart(8, '0'))
    println(msb.toString(2).padStart(4, '0'))
    println(lsb.toString(2).padStart(4, '0').padStart(8))
}

Also, your example byte (10111100) cannot be represented as a Byte:

fun main() {
    //sampleStart
    val b: Byte = "10111100".toByte(2) // 👎 Value out of range.
    //sampleEnd
}

You’ll need to use a UByte:

fun main() {
    //sampleStart
    val b: UByte = "10111100".toUByte(2) // 👍 

    val msb: Byte = b.toInt().ushr(4).toByte()
    val lsb: Byte = (b and 15u).toByte()
    //sampleEnd

    println(b.toString(2).padStart(8, '0'))
    println(msb.toString(2).padStart(4, '0'))
    println(lsb.toString(2).padStart(4, '0').padStart(8))
}

#3

Thanks 🙂 Btw, does the toByte() function truncate the lsbs? The docs mention rounding as well 🤔
And why is it not possible to represent a byte that starts with 1? I get the sign stuff, but why should I use a completely different class just to set the first bit (I assume for consistency when casting to Int, Long, etc.)?


#4

That’s quite true. The binary NUMBER 10111100 bin = 188 dec cannot be represented by a Byte, but the data 1011 1100 is a value that a Byte can store, representing the number -68 (which, in this case, we don’t particularly care about since we’re only concerned with the bits).

fun main() {
    val b: Byte = 0b10111100.toByte()
    println(b)
}
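
To see that only the interpretation changes and not the stored bits, here is a minimal sketch. The helper name unsignedValue is made up for this example:

```kotlin
// Hypothetical helper: reads the 8 bits of a Byte as an unsigned number (0..255).
fun unsignedValue(b: Byte): Int = b.toInt() and 0xFF

fun main() {
    val b: Byte = 0b10111100.toByte()

    // Signed view: two's complement gives 188 - 256 = -68.
    println(b)                                // -68

    // Unsigned view of the same 8 bits.
    println(unsignedValue(b))                 // 188
    println(unsignedValue(b).toString(2))     // 10111100

    // toUByte() reinterprets the same bits as an unsigned byte.
    println(b.toUByte())                      // 188
}
```

The mask with 0xFF is needed because toInt() sign-extends a negative Byte to 32 bits.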

#5

The correct terminology for the 4 bit groupings is nibble.

As others have shown, to specify a literal bit by bit you use the 0b prefix, as in 0b01101101. But in usual Java fashion bytes are signed, and just like you can’t say val b: Byte = 255, you will have trouble if that bit-for-bit literal has bit 7 set: you will have to call .toByte() on it or write the equivalent negative two’s complement value. In most byte-oriented work, where you are dealing with data just 8 bits at a time, you might want to switch to Kotlin’s experimental support for unsigned values.
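
A small sketch of the literal forms described above (variable names are only for illustration):

```kotlin
fun main() {
    // Bit 7 is clear, so the literal fits the signed range and is accepted as a Byte.
    val a: Byte = 0b0110_1101

    // Bit 7 is set: the literal is the Int 188 and has to be narrowed explicitly.
    val b: Byte = 0b1011_1100.toByte()

    // The same bits written as the equivalent negative value.
    val c: Byte = -68

    // With an unsigned type the bit pattern can be written directly.
    val d: UByte = 0b1011_1100u

    println(a)        // 109
    println(b == c)   // true
    println(d)        // 188
}
```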

If I needed this functionality, I would implement it as extension properties:

val Byte.upperNibble get() = (this.toInt() shr 4 and 0b1111).toByte()
val Byte.lowerNibble get() = (this.toInt() and 0b1111).toByte()
val UByte.upperNibble get() = (this.toInt() shr 4 and 0b1111).toUByte()
val UByte.lowerNibble get() = (this.toInt() and 0b1111).toUByte()
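
For illustration, here is how the Byte variants would be used on the byte from the original question:

```kotlin
val Byte.upperNibble get() = (this.toInt() shr 4 and 0b1111).toByte()
val Byte.lowerNibble get() = (this.toInt() and 0b1111).toByte()

fun main() {
    val b: Byte = 0b1011_1100.toByte()

    println(b.upperNibble.toString(2).padStart(8, '0')) // 00001011
    println(b.lowerNibble.toString(2).padStart(8, '0')) // 00001100
}
```

The mask after the shift matters for the signed Byte: toInt() sign-extends a negative value, so shr 4 alone would leave the upper bits set.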

#6

That was exactly what I was trying to do! One last question: when calling toInt() or toByte() on a wider data type, like a Double or Long, or someInt.toByte(), which bits am I keeping, msbs or lsbs?


#7

See this: https://docs.oracle.com/javase/specs/jls/se8/html/jls-5.html#jls-5.1.3
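
In short, narrowing between integer types keeps the low-order bits, while converting a floating-point value to an integer rounds toward zero. A minimal sketch with arbitrarily chosen values:

```kotlin
fun main() {
    val i = 0x12345678
    // Int -> Byte keeps only the lowest 8 bits (0x78).
    println(i.toByte().toString(16))   // 78

    val l = 0x1122334455667788L
    // Long -> Int keeps only the lowest 32 bits (0x55667788).
    println(l.toInt().toString(16))    // 55667788

    // Double -> Int drops the fractional part (rounds toward zero).
    println(3.99.toInt())              // 3
    println((-3.99).toInt())           // -3
}
```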


#8

Actually I was looking to do it in a common module in pure Kotlin. Something like:

fun Long.getBytes() = ByteArray(8).apply {
    var buff = this@getBytes
    repeat(size) {
        this[it] = buff.toByte()
        buff = buff shr 8 // toByte() keeps the lsbs, so shift the next
                          // byte down into the low 8 bits
    }
}

#9

First off, realize that such an operation imposes a specific ordering on the bytes. Your choice of ordering is not the only valid one: you have chosen little endian. I would make sure to indicate that it is little endian (LE) in the method name and also provide a big-endian version.

If you were on the Java platform I would steer you toward a java.nio.ByteBuffer for doing this form of manipulation.
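
A minimal JVM-only sketch of the ByteBuffer approach. The function name longToBytesLE is made up for this example:

```kotlin
import java.nio.ByteBuffer
import java.nio.ByteOrder

// Hypothetical helper: encodes a Long as 8 bytes, least significant byte first.
fun longToBytesLE(value: Long): ByteArray =
    ByteBuffer.allocate(Long.SIZE_BYTES)
        .order(ByteOrder.LITTLE_ENDIAN)
        .putLong(value)
        .array()

fun main() {
    val bytes = longToBytesLE(0x0102030405060708L)
    println(bytes.joinToString(" ") { (it.toInt() and 0xFF).toString(16).padStart(2, '0') })
    // 08 07 06 05 04 03 02 01
}
```

Switching to ByteOrder.BIG_ENDIAN (the ByteBuffer default) reverses the byte order.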

Having this function create the ByteArray seems like a bad idea and inefficient, since you might have an array you want to reuse, or a big array into which you want to write multiple values. Because of this I would make the array the target and code it like this:

fun ByteArray.writeBytesLE(value: Long, offset : Int = 0) : Int {
    assert(this.size - offset >= 8)

    this[0] = value.toByte()
    this[1] = (value shr 8).toByte()
    this[2] = (value shr 16).toByte()
    this[3] = (value shr 24).toByte()
    this[4] = (value shr 32).toByte()
    this[5] = (value shr 40).toByte()
    this[6] = (value shr 48).toByte()
    this[7] = (value shr 56).toByte()

    return offset + 8
}

#10

Shouldn’t you write this[i+offset] instead of just this[i]? Like:

fun ByteArray.writeBytesLE(value: Long, offset : Int = 0) : Int {
    if (this.size - offset < 8)
        throw IndexOutOfBoundsException("The remaining space is less than 8")
    var buff = value
    repeat(8){
        this[it+offset] = buff.toByte()
        buff = buff shr 8
    }

    return offset + 8
}

#11

Doh, Yes sorry. Unrolling the loop is the most efficient way to code it.
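
For reference, a sketch of the unrolled writer with the offset applied to every index, plus a big-endian counterpart and a reader for checking the round trip. The names writeLongLE, writeLongBE and readLongLE are made up for this example and are not from any library:

```kotlin
// Little endian: least significant byte first.
fun ByteArray.writeLongLE(value: Long, offset: Int = 0): Int {
    require(offset >= 0 && size - offset >= 8) { "Need 8 bytes at offset $offset" }
    this[offset]     = value.toByte()
    this[offset + 1] = (value shr 8).toByte()
    this[offset + 2] = (value shr 16).toByte()
    this[offset + 3] = (value shr 24).toByte()
    this[offset + 4] = (value shr 32).toByte()
    this[offset + 5] = (value shr 40).toByte()
    this[offset + 6] = (value shr 48).toByte()
    this[offset + 7] = (value shr 56).toByte()
    return offset + 8
}

// Big endian: most significant byte first (written as a loop for brevity).
fun ByteArray.writeLongBE(value: Long, offset: Int = 0): Int {
    require(offset >= 0 && size - offset >= 8) { "Need 8 bytes at offset $offset" }
    for (i in 0 until 8) {
        this[offset + i] = (value shr (56 - 8 * i)).toByte()
    }
    return offset + 8
}

// Reads back a value written by writeLongLE.
fun ByteArray.readLongLE(offset: Int = 0): Long {
    require(offset >= 0 && size - offset >= 8) { "Need 8 bytes at offset $offset" }
    var result = 0L
    for (i in 7 downTo 0) {
        // Mask with 0xFF so a negative byte does not sign-extend into the upper bits.
        result = (result shl 8) or (this[offset + i].toLong() and 0xFFL)
    }
    return result
}

fun main() {
    val buf = ByteArray(16)
    val next = buf.writeLongLE(0x0102030405060708L)
    buf.writeLongBE(0x0102030405060708L, next)

    println(buf.joinToString(" ") { (it.toInt() and 0xFF).toString(16).padStart(2, '0') })
    // 08 07 06 05 04 03 02 01 01 02 03 04 05 06 07 08
    println(buf.readLongLE() == 0x0102030405060708L) // true
}
```

Returning offset + 8 lets calls be chained to pack several values into one array, as in main above.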