Can't wrap my mind around this PatternSyntaxException


#1

So, I have this fancy regex (https://regexr.com/429sn, which is where I designed it) that is a-ok, at least according to the regex website, but Kotlin chokes on it for a reason that is not clear to me.

Kotlinized approach to the regex being:

"""|"name"\s*:\s*"([^"]+)"\s*,\s*
   |"id"\s*:\s*(\d+)\s*,\s*
   |"id64"\s*:\s*(\d+)\s*,\s*
   |"coords"\s*:\s*{\s*
     |"x"\s*:\s*(-?\d+(?:[.]\d+)?)\s*,\s*
     |"y"\s*:\s*(-?\d+(?:[.]\d+)?)\s*,\s*
     |"z"\s*:\s*(-?\d+(?:[.]\d+)?)\s*}\s*,\s*
   |"coordsLocked"\s*:\s*(true|false)\s*,\s*
   |"date"\s*:\s*"([^"]+)"\s*,\s*
   |"submitted"\s*:\s*\[([^\]]*)\]
|""".trimMargin()
    .replace("""[\n\r]""".toRegex(),"")
    .toRegex()

The error itself looks like this:

Uncaught exception from Kotlin's main: kotlin.text.PatternSyntaxException: Error in "" (-1). 
        at kfun:kotlin.Exception.<init>(kotlin.String?)kotlin.Exception (0x40f696)
        at kfun:kotlin.RuntimeException.<init>(kotlin.String?)kotlin.RuntimeException (0x40f5b6)
        at kfun:kotlin.IllegalArgumentException.<init>(kotlin.String?)kotlin.IllegalArgumentException (0x40f526)
        at kfun:kotlin.text.PatternSyntaxException.<init>(kotlin.String;kotlin.String;kotlin.Int)kotlin.text.PatternSyntaxException (0x44efa5)
        at kfun:kotlin.text.PatternSyntaxException.<init>(kotlin.String;kotlin.String;kotlin.Int;kotlin.Int;kotlin.native.internal.DefaultConstructorMarker)kotlin.text.PatternSyntaxException (0x44ed52)
        at kfun:kotlin.text.regex.Lexer.processQuantifier#internal (0x48163c)
        at kfun:kotlin.text.regex.Lexer.processInPatternMode#internal (0x4543f3)
        at kfun:kotlin.text.regex.Lexer.movePointer#internal (0x453bc9)
        at kfun:kotlin.text.regex.Lexer.next()ValueType (0x4502f2)
        at kfun:kotlin.text.regex.Pattern.processTerminal#internal (0x489588)
        at kfun:kotlin.text.regex.Pattern.processSubExpression#internal (0x4507c3)
        at kfun:kotlin.text.regex.Pattern.processSubExpression#internal (0x450938)
/opt/buildAgent/work/4d622a065c544371/runtime/src/main/cpp/Memory.cpp:1125: runtime assert: Memory leaks found
        at kfun:kotlin.text.regex.Pattern.processSubExpression#internal (0x450938)

The last line above repeats a couple dozen or so times…

        at kfun:kotlin.text.regex.Pattern.processSubExpression#internal (0x450938)
        at kfun:kotlin.text.regex.Pattern.processExpression#internal (0x44e150)
        at kfun:kotlin.text.regex.Pattern.<init>(kotlin.String;kotlin.Int)kotlin.text.regex.Pattern (0x44d55b)
        at kfun:kotlin.text.regex.Pattern.<init>(kotlin.String;kotlin.Int;kotlin.Int;kotlin.native.internal.DefaultConstructorMarker)kotlin.text.regex.Pattern (0x44bafc)
        at kfun:kotlin.text.Regex.<init>(kotlin.String)kotlin.text.Regex (0x44ba1d)
        at kfun:anikaiful.unidb.UniDB.Companion.<init>()anikaiful.unidb.UniDB.Companion (0x40b44c)
        at InitSharedInstance (0x4acdd4)
        at kfun:anikaiful.unidb.UniDB.<init>(kotlin.String)anikaiful.unidb.UniDB (0x407a43)
        at kfun:main(kotlin.Array<kotlin.String>) (0x407597)
        at EntryPointSelector (0x407036)
        at Konan_start (0x406fa7)
        at Konan_run_start (0x406f23)
        at Konan_main (0x406e97)
        at __libc_start_main (0x7f2ecbfebb97)
        at  (0x404d5a)
        at  ((nil))

I tried wrapping the regex construction in try…catch, but the exception message "Error in "" (-1)" didn’t shed any more light on what’s going on than reading that wall of text w/o try…catch.
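
One way to narrow down an error like this when the message gives no position is to compile the pattern one line at a time and see which fragment the engine rejects. A rough sketch, assuming every line of the pattern is a self-contained sub-pattern (balanced groups and brackets); firstRejectedFragment is a hypothetical helper, not a library function, and only a few lines of the pattern are reproduced here:

```kotlin
// Hypothetical helper: index of the first fragment the regex engine
// rejects, or null if every fragment compiles on its own.
fun firstRejectedFragment(fragments: List<String>): Int? {
    val index = fragments.indexOfFirst { fragment ->
        try {
            Regex(fragment)
            false
        } catch (e: IllegalArgumentException) {
            // PatternSyntaxException is an IllegalArgumentException
            // (see the stack trace above), so this catches it.
            true
        }
    }
    return if (index >= 0) index else null
}

fun main() {
    // A few lines of the pattern, one fragment per line.
    val fragments = """
        |"name"\s*:\s*"([^"]+)"\s*,\s*
        |"id"\s*:\s*(\d+)\s*,\s*
        |"coords"\s*:\s*{\s*
        |"x"\s*:\s*(-?\d+(?:[.]\d+)?)\s*,\s*
        """.trimMargin().lines()

    // Prints the zero-based index of the first rejected line, or null.
    println(firstRejectedFragment(fragments))
}
```

This only works when groups do not span lines; otherwise a perfectly fine pattern would be reported as broken.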


#2

This is a known issue with exceptions in Kotlin/Native regexes. I’ll take a look at your case; it looks like there is a bug in the regex implementation.


#3

Try your RegExp in JavaScript first, then implement it in Kotlin. JavaScript is descriptive in its error messages.


#4

@samson.ayalew.et JavaScript has no qualms with my regex at all; only Kotlin does.


#5

Here’s a runnable version of the code. (Kotlin 1.3, running on the JVM)

fun main() {
    """|"name"\s*:\s*"([^"]+)"\s*,\s*
   |"id"\s*:\s*(\d+)\s*,\s*
   |"id64"\s*:\s*(\d+)\s*,\s*
   |"coords"\s*:\s*{\s*
     |"x"\s*:\s*(-?\d+(?:[.]\d+)?)\s*,\s*
     |"y"\s*:\s*(-?\d+(?:[.]\d+)?)\s*,\s*
     |"z"\s*:\s*(-?\d+(?:[.]\d+)?)\s*}\s*,\s*
   |"coordsLocked"\s*:\s*(true|false)\s*,\s*
   |"date"\s*:\s*"([^"]+)"\s*,\s*
   |"submitted"\s*:\s*\[([^\]]*)\]
|""".trimMargin()
    .replace("""[\n\r]""".toRegex(),"")
    .toRegex()
}

#6

\s*{ seems to be the cause. Apparently the engine reads the { as the start of a repetition quantifier (as in {n,m}) rather than as a literal brace following \s*. Escaping the { as \{ sorted that out. “Java requires literal opening braces to be escaped”, says http://www.regular-expressions.info … But oh boy, digging up that esoteric info took a while x_x
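
For anyone landing here later, a minimal sketch of the fix, assuming a cut-down "coords" fragment and a made-up sample input (compileOrNull is a hypothetical helper, not a library function). The only difference between the two patterns is the backslash in front of the opening brace:

```kotlin
// Hypothetical helper: the compiled Regex, or null if the engine
// rejects the pattern with a PatternSyntaxException.
fun compileOrNull(pattern: String): Regex? =
    try {
        Regex(pattern)
    } catch (e: IllegalArgumentException) {
        null
    }

// Cut-down fragment with the literal { left unescaped, as in the question.
val unescapedPattern = """
    |"coords"\s*:\s*{\s*
    |"x"\s*:\s*(-?\d+(?:[.]\d+)?)\s*}
    """.trimMargin().replace("\n", "")

// Same fragment with the opening brace escaped as \{.
val escapedPattern = """
    |"coords"\s*:\s*\{\s*
    |"x"\s*:\s*(-?\d+(?:[.]\d+)?)\s*}
    """.trimMargin().replace("\n", "")

fun main() {
    // Whether the unescaped variant compiles depends on the regex engine.
    println("unescaped compiles: ${compileOrNull(unescapedPattern) != null}")

    val regex = compileOrNull(escapedPattern)
    val match = regex?.find("""{"coords": { "x": -12.5 }}""")
    println("captured x: ${match?.groupValues?.get(1)}")
}
```

The closing } is left unescaped here, as in the original pattern; only the opening brace needed the backslash in my case.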


#7

When creating regular expressions for Kotlin (and Java) I usually check the Java Pattern Javadoc.

Also, to check whether the regex works correctly, you can use IntelliJ’s built-in Check RegExp intention, which also highlights any errors. To access it, press Alt+Enter (Show Intention Actions) while the cursor is on a regex, and select it.

These tips may help you in the future.


#8

@SackCastellon Ah, that page certainly is useful. When it comes to code tho, I use Sublime Text (typing on one computer, compiling and running on another). IntelliJ’s nice, but in general I just need a text editor with syntax highlighting.