Trouble when interpreting Kotlin specifications

Hello all, I’m having trouble figuring out how the “Syntax and grammar” part of the Kotlin specification should be interpreted.

For instance, when reading the section on while-loops, quote:

'while'
{NL}
'('
expression
')'
{NL}
(controlStructureBody | ';')

Here we can see that 'while' is followed by a repetition ({}) of new lines (NL).

This immediately raises multiple questions:

  1. Does the repetition rule imply optional use? I.e., is it “one or more” or “zero or more”?
    • From the behavior of the Kotlin compiler, I presume it is “zero or more”, but I couldn’t find where this is stated in the specification.
  2. There is no “WS” between 'while' and '(', and the definition of “NL” does not contain WS. Can I claim that code such as while (true) is not valid Kotlin, because there is a space character between “while” and “(true)”?
    • From the behavior of the Kotlin compiler, I presume that any amount of white space can be added between the elements of a rule, but I couldn’t find where this is stated in the specification.

 

What’s more, I found an example where the code should compile according to the specification, but it doesn’t compile with official tools such as the Kotlin Playground.

Please take a look at this: There is no total freedom for spacing in Kotlin
The when block in the second code example seems perfectly valid according to the specifications.
However the Kotlin playground cannot compile it:

 

Thanks for reading and I appreciate any comments!

From my modest understanding, the whenEntry specification ends with [semi] which requires a ; or a NL. Your example doesn’t have either of those.

Square brackets are the “optional” rule. If I am not mistaken, “optional” should mean “zero or one”, no?
Besides, if it required a ; or a NL, I suppose it would have been written as (semi|NL).
I am super unsure about all of this :thinking:

If I may, what problem are you trying to solve? Are you writing a Kotlin parser?

I haven’t looked into the actual Kotlin compiler, but here is my take as a fellow Kotlin user and software engineer:

ad 1 (repetition):
Yes, it is “zero or more”. A repetition means “do this n times”, where n is any natural number, including 0. There is NO additional restriction like “n can’t be 0” or “n can’t be 1”, so n can be 0 or 1 or 2 or … . This is also how repetition is defined in EBNF, which the Kotlin specification mentions as its notation at the beginning: Kotlin Grammar Notation
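
To make “zero or more” concrete, here is a small sketch of the same loop written once with no NL at all and once with several NL in both {NL} positions of the quoted rule. The function names are made up for the illustration, and the second form is expected to compile only on the assumption that the compiler follows the rule as written.

```kotlin
// Zero NL tokens: the usual one-line loop header.
fun countDownCompact(start: Int): Int {
    var n = start
    while (n > 0) { n-- }
    return n
}

// Several NL tokens after 'while' and after ')', as {NL} would allow.
fun countDownSpread(start: Int): Int {
    var n = start
    while

    (n > 0)

    {
        n--
    }
    return n
}

fun main() {
    // Both spellings should behave identically.
    println(countDownCompact(3) == countDownSpread(3))
}
```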

ad 2 (WhiteSpace in while):
I take it that the Kotlin compiler is composed of multiple parts that process the source code, e.g. lexer, parser, semantic analysis; see also: Compiler parts (on stackoverflow)
Note that there is no WS in most rules where you would surely expect white space, e.g. between the “fun” keyword and the function name in the function declaration.
The lexer needs to break up the source text into words, and my guess is that the additional white space is consumed by the lexer. After that, the parser works on the word sequence which the lexer created.

You can probably find some more details about the lexer and the lexical grammar of Kotlin here: Kotlin Lexer/Grammar
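
The guess about the lexer can be sketched with a toy tokenizer. This is not the real Kotlin lexer; the Token types and the lex function are invented for the illustration. It only shows the idea that spaces, tabs and form feeds are dropped before parsing, while line breaks survive as NL tokens that grammar rules can mention.

```kotlin
// Toy token model; all names here are hypothetical.
sealed interface Token
data class Word(val text: String) : Token   // keyword, identifier or literal
data class Symbol(val char: Char) : Token   // punctuation such as '(' or ')'
object NL : Token                           // a line break, kept for the parser

fun lex(source: String): List<Token> {
    val tokens = mutableListOf<Token>()
    var i = 0
    while (i < source.length) {
        val c = source[i]
        when {
            c == '\n' -> { tokens += NL; i++ }
            // White space is consumed here and never reaches the parser.
            c == ' ' || c == '\t' || c == '\u000C' -> i++
            c.isLetterOrDigit() -> {
                val start = i
                while (i < source.length && source[i].isLetterOrDigit()) i++
                tokens += Word(source.substring(start, i))
            }
            else -> { tokens += Symbol(c); i++ }
        }
    }
    return tokens
}

fun main() {
    // With or without the space, the parser would see the same token sequence.
    println(lex("while (true)") == lex("while(true)"))
}
```

With a lexer of this shape, a grammar rule never needs to mention WS, which would explain why 'while' '(' in the rule still matches while (true) in source code.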

The parser only checks whether the given source words (the result from the lexer) conform to the Kotlin grammar. But not every text that conforms to the grammar is an actual Kotlin program.
The simplest example is the “Unresolved reference” example: If you use a never declared reference in an expression, the source conforms to the grammar, but it is not a valid Kotlin program.
Semantics like these cannot be specified in any grammar (that allows your own names). Therefore, the compiler still has to make a semantic analysis after parsing.
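
As a rough illustration of why this needs a separate pass, the following sketch models a check that a grammar alone cannot express: every name that is used must also have been declared. The function unresolvedReferences is hypothetical and far simpler than what a real compiler does.

```kotlin
// Hypothetical mini "semantic pass": report used names that were never declared.
fun unresolvedReferences(declared: Set<String>, used: List<String>): List<String> =
    used.filter { it !in declared }

fun main() {
    // Models `val x = 1; println(x + y)`: grammatically fine, but `y` is unknown.
    val problems = unresolvedReferences(
        declared = setOf("x", "println"),
        used = listOf("println", "x", "y"),
    )
    println(problems) // prints [y]
}
```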

Notably, the error message you receive is actually an “Unresolved reference”. However, I don’t see why it shows that. Semantically, at first glance, it looks fine to me, so it is probably a mistake in the semantic analysis. But as I said, I am just another user. Since I normally like “good formatting”, I haven’t taken an extensive dive into how the compiler handles “extraordinary white spacing”, so I am probably just missing an important point for this particular example.

The semi rule is actually a choice between a semicolon and a new line.
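
A short sketch of what that means for when entries: they can be separated either by new lines or by semicolons. The function name describe is made up for the example.

```kotlin
fun describe(n: Int): String {
    // Entries separated by new lines.
    val viaNewlines = when (n) {
        0 -> "zero"
        else -> "other"
    }
    // The same entries on one line, separated by a semicolon.
    val viaSemicolons = when (n) { 0 -> "zero"; else -> "other" }
    check(viaNewlines == viaSemicolons)
    return viaSemicolons
}

fun main() {
    println(describe(0))
    println(describe(7))
}
```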

Thanks, the lexer explanation makes a lot of sense.
And yeah, I completely missed the “semi” rule definition… my mistake for assuming it to be ';', I don’t even know why I did that :woozy_face: