Negative in integral literals

I ran into this interesting Kotlin gotcha:

val i = -1.toByte()     // this is an Int
val b = (-1).toByte()   // this is a Byte

After some investigation, it turns out that -1 is actually 1.unaryMinus(), although the compiled bytecode folds it back into a plain Java int constant (int i = -1). I understand operator precedence, and that the dot (.) binds more tightly than the unary minus. My question is: why was this design chosen? Why not make the - symbol part of the grammar of an integral literal, e.g. in the IntegerLiteral grammar?

You can put a space between the minus and the number: - 1. Some people write it this way. Treating that differently from -1 would be a huge gotcha IMO, and allowing spaces inside an integer literal would probably be tricky.

The dot operator has a higher precedence than the minus operator, so -1.toByte() == -(1.toByte()). That’s also why -3.plus(4) is -7 even though it looks like it should be 1. Kotlin has no negative number literals at all.
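
A minimal sketch of the precedence rule described above. The helper describe() is a hypothetical name introduced only for this illustration; it reports whether a value is a Byte or an Int at run time.

```kotlin
// Hypothetical helper for this illustration: names the runtime type of a value.
fun describe(value: Any): String = when (value) {
    is Byte -> "Byte"
    is Int -> "Int"
    else -> "other"
}

fun main() {
    val i = -1.toByte()     // parsed as -(1.toByte()); negating a Byte yields an Int
    val b = (-1).toByte()   // parentheses apply the minus first, then convert
    println(describe(i))    // Int
    println(describe(b))    // Byte
    println(-3.plus(4))     // -7, parsed as -(3.plus(4))
    println((-3).plus(4))   // 1
}
```

Wrapping the negative number in parentheses is the usual way to get the intended receiver.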

Integer literals would not include spaces. - 1 could still be parsed as the unary operator applied to 1, and - 1.toByte(), the unary operator applied to 1.toByte(), seems much less of a gotcha than -1.toByte().

I understand Kotlin operator precedence and the grammar of IntegerLiteral.

My question is, why was this design chosen? Why couldn’t a sign be included as part of the grammar? For reference, Java does not include a minus sign as part of integer literals either, although its floating-point literal grammar does allow a sign in the exponent part (search the Java grammar for SignedInteger).

I guess only the Kotlin devs can answer that. My guess is that they thought the unary operator was sufficient and that complicating the grammar wasn’t worth it. You also want to avoid ambiguity: if -1 could be read either as a -1 literal or as a unary minus applied to a 1 literal, that’s probably bad. I personally feel this could have been prevented easily by giving the unary minus operator a higher precedence than the dot operator. But I’m not a language designer, so I don’t know if that would have had other negative side effects.

As I recall, the reason was to avoid having different semantics of the expressions
-1.toByte() and -x.toByte(), where val x = 1
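
To illustrate that point, here is a small sketch showing that the literal form and the variable form currently behave the same way. The function names literalForm and variableForm are hypothetical, chosen only for this example.

```kotlin
// Hypothetical helpers: the same expression shape, once with a literal and once with a variable.
fun literalForm(): Any = -1.toByte()         // -(1.toByte())
fun variableForm(x: Int): Any = -x.toByte()  // -(x.toByte())

fun main() {
    val x = 1
    println(literalForm() is Int)             // true
    println(variableForm(x) is Int)           // true
    println(literalForm() == variableForm(x)) // true
}
```

If the sign were part of the literal, the first function would presumably produce a Byte while the second would still produce an Int, which is the inconsistency mentioned above.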

I have a separate question: why does Byte.unaryMinus() return an Int instead of the obvious Byte? Fixing that would fix this issue. I’ve filed KT-31611 in light of this question.

TL;DR for others: This behavior was inherited from Java, but I find Java’s behavior itself to be questionable and arguably buggy.

I believe it is healthy to question established rules from time to time. In order to properly question something, one must attempt to understand the original decision and any reasons that are given to support it.

The reason for unary minus converting a byte to an int isn’t that hard to find: JVM Spec section 2.11.1

If each typed instruction supported all of the Java Virtual Machine’s run-time data types, there would be more instructions than could be represented in a byte.

Kotlin users, remember that most choices are not as clear-cut as they may appear; many are tradeoffs between performance and simplicity.

The reason for having unary minus convert a byte to an int was mostly consistency with Java rather than the lack of JVM instructions. It is not a hard constraint to represent each operation on primitive types with exactly one bytecode instruction.

Obviously the signature of Kotlin’s Byte.unaryMinus() was chosen in order to preserve consistency.

What was in question was why Java chose to convert bytes and shorts to int when performing arithmetic operations. The JVM Spec clearly justifies this by saying that there wouldn’t be enough typed bytecodes and so it was decided to optimize for use of ints more than bytes and shorts.
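
The same widening is visible from Kotlin for binary arithmetic on bytes. A minimal sketch, with addBytes and addBytesNarrowed as hypothetical helper names:

```kotlin
// Hypothetical helpers: Byte + Byte yields an Int in Kotlin, as in Java.
fun addBytes(a: Byte, b: Byte): Int = a + b

// Narrowing back to Byte must be requested explicitly and may wrap around.
fun addBytesNarrowed(a: Byte, b: Byte): Byte = (a + b).toByte()

fun main() {
    val a: Byte = 100
    val b: Byte = 100
    println(addBytes(a, b))          // 200
    println(addBytesNarrowed(a, b))  // -56
}
```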

As to why JVM bytecodes are typed, I couldn’t find anything definitive. A question on StackOverflow has a few insightful answers.

It may not be difficult to specify and implement a virtual machine whose opcodes are untyped, but that was not the direction the designers of the JVM took.

I guess another benefit of having unaryMinus() return an Int is that it correctly handles the case where the value is -128, and so its negation isn’t representable as a Byte.
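
A short sketch of that edge case, using a hypothetical helper named negate:

```kotlin
// Hypothetical helper: unary minus on a Byte returns an Int.
fun negate(b: Byte): Int = -b

fun main() {
    val min: Byte = Byte.MIN_VALUE   // -128
    println(negate(min))             // 128, which does not fit in a Byte
    println(negate(min).toByte())    // -128, narrowing wraps around
}
```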