Negative in integral literals

mattquigley · May 22, 2019, 4:37am

I ran into this interesting Kotlin gotcha:

val i = -1.toByte()     // this is an Int
val b = (-1).toByte()   // this is a Byte

After some investigation, it turns out that -1 is actually 1.unaryMinus(), although the final bytecode does transform it back into int i = -1 as a Java int. I understand operator precedence, and that the dot (.) has higher precedence than the unary minus. But my question is: why was this design chosen? Why not make the - symbol part of the grammar of an integral literal, e.g. in the IntegerLiteral grammar?

vbezhenar · May 22, 2019, 7:43am

You can put a space between the minus and the number: - 1. Some people write it this way. If that were treated differently from -1, it would be a huge gotcha IMO. And allowing spaces inside an integer literal would probably be tricky.

Jonathan.Haas · May 22, 2019, 11:46am

The dot operator has a higher precedence than the minus operator. So -1.toByte() == -(1.toByte()). That’s also why -3.plus(4) is -7 even though it looks like it should be 1. Kotlin has no negative number literals at all.
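A minimal sketch of the precedence rule described above. The explicit type annotations are there only so that the compiler confirms which type each expression has; the property names are made up for illustration.

```kotlin
// -1.toByte() parses as -(1.toByte()); Byte.unaryMinus() returns Int.
val negatedAfterConversion: Int = -1.toByte()

// Parentheses apply the minus first, so toByte() is called on the Int -1.
val convertedAfterNegation: Byte = (-1).toByte()

fun main() {
    println(negatedAfterConversion)   // -1 (an Int)
    println(convertedAfterNegation)   // -1 (a Byte)
    println(-3.plus(4))               // -7, parsed as -(3.plus(4))
    println((-3).plus(4))             // 1
}
```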

mattquigley · May 22, 2019, 4:06pm

Integer literals would not include spaces. - 1 could be considered the unary operator applied to 1, and - 1.toByte(), as the unary operator applied to 1.toByte(), seems much less of a gotcha than -1.toByte().

mattquigley · May 22, 2019, 4:12pm

I understand Kotlin operator precedence and the grammar of IntegerLiteral.

My question is, why was this design chosen? Why couldn’t a sign be included as part of the grammar? For reference, Java does not include a minus sign as part of integer literals either, although its floating-point literal grammar does allow a sign in the exponent (search for SignedInteger).

Jonathan.Haas · May 22, 2019, 5:27pm

I guess only the Kotlin devs can answer that. My guess is that they thought the unary operator was sufficient and that complicating the grammar wasn’t worth it. You also want to avoid ambiguity, so if -1 could be interpreted as a -1 literal as well as a unary minus plus a 1 literal, that’s probably bad. I personally feel this could have been prevented easily by giving the unary minus operator a higher precedence than the dot operator. But I’m not a language designer, so I don’t know if that would have had other negative side effects.

ilya.gorbunov · May 22, 2019, 9:44pm

As I recall, the reason was to avoid having different semantics of the expressions
-1.toByte() and -x.toByte(), where val x = 1
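A small sketch of the consistency being described: because the minus is always a unary operator, the literal form and the variable form parse the same way and produce the same type. The names negatedByteOf and fromLiteral are hypothetical, chosen for this example.

```kotlin
// Parsed as -(x.toByte()), so the result is an Int.
fun negatedByteOf(x: Int): Int = -x.toByte()

// Parsed the same way as the variable form: -(1.toByte()).
val fromLiteral: Int = -1.toByte()

fun main() {
    val x = 1
    println(fromLiteral)                       // -1
    println(negatedByteOf(x))                  // -1
    println(fromLiteral == negatedByteOf(x))   // true
}
```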

isiahmeadows · May 25, 2019, 5:09am

I have a separate question: why does Byte.unaryMinus() return an Int instead of the obvious Byte? Fixing that would fix this issue. I’ve filed KT-31611 in light of this question.

TL;DR for others: This behavior was inherited from Java, but I find Java’s behavior itself to be questionable and arguably buggy.

punkstarman · May 25, 2019, 12:12pm

I believe it is healthy to question established rules from time to time. In order to properly question something, one must attempt to understand the original decision and any reasons that are given to support it.

The reason for unary minus converting a byte to an int isn’t that hard to find: JVM Spec section 2.11.1

If each typed instruction supported all of the Java Virtual Machine’s run-time data types, there would be more instructions than could be represented in a byte.

Kotlin users, remember that most choices are not as clear-cut as they may appear; many are tradeoffs between performance and simplicity.

ilya.gorbunov · May 27, 2019, 11:40pm

The reason for having unary minus convert a byte to an int was mostly consistency with Java rather than the lack of JVM instructions. It is not a hard constraint to represent each operation on primitive types with exactly one bytecode instruction.

punkstarman · May 28, 2019, 9:50pm

Obviously the signature of Kotlin’s Byte.unaryMinus() was chosen in order to preserve consistency.

What was in question was why Java chose to convert bytes and shorts to int when performing arithmetic operations. The JVM Spec clearly justifies this by saying that there wouldn’t be enough typed bytecodes and so it was decided to optimize for use of ints more than bytes and shorts.
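For illustration, here is how the widening shows up in Kotlin source for binary arithmetic on bytes. This is only a sketch of the observable behavior, not of the bytecode; addBytes is a hypothetical helper.

```kotlin
// Byte.plus(Byte) returns Int, mirroring Java's promotion of byte operands.
fun addBytes(a: Byte, b: Byte): Int = a + b

fun main() {
    val a: Byte = 100
    val b: Byte = 100
    println(addBytes(a, b))            // 200, which does not fit in a Byte
    println(addBytes(a, b).toByte())   // -56 after explicit narrowing
}
```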

As to why JVM bytecodes are typed, I couldn’t find anything definitive. A question on StackOverflow has a few insightful answers.

It may not be difficult to specify and implement a virtual machine whose opcodes are untyped, but that was not the direction the designers of the JVM took.

gidds · June 10, 2019, 8:18pm

I guess another benefit of having unaryMinus() return an Int is that it correctly handles the case where the value is -128, and so its negation isn’t representable as a Byte.
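A quick sketch of that edge case. The helper name negate is made up for this example.

```kotlin
// Byte.unaryMinus() returns Int, so the result can hold 128.
fun negate(b: Byte): Int = -b

fun main() {
    val min: Byte = Byte.MIN_VALUE      // -128
    println(negate(min))                // 128, representable because the result is an Int
    println(negate(min).toByte())       // -128, narrowing back to Byte wraps around
}
```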
