[Idea] Python backend

What do you think about having Kotlin + Python interop, just like we have for JS or Native?

Python is extremely popular, and getting even more popular. It’s now the industry standard in machine learning and data science. It’s not about superseding Python with Kotlin, but rather about giving an alternative to those who want to reuse their Kotlin knowledge and their existing platform-agnostic Kotlin logic.

About a year ago I created an issue proposing this idea: https://youtrack.jetbrains.com/issue/KT-34074 - click to see more details about the idea and example use cases. Someone from JetBrains has even commented there, but he doesn’t seem to be speaking on the Kotlin team’s behalf, but rather as a potential client.

This idea has been sitting in my head for quite a while, so I’m curious what your take on it is - whether you also see benefits in it, and whether someone has already considered it. I’m also wondering whether someone from the Kotlin team has performed a business analysis of such interop (explicitly deciding not to go in this direction for now), or whether it’s implicitly not being looked at.

Best,
Piotr

4 Likes

More backends are a great idea. It’ll definitely have to wait until the compiler overhaul is done.
I believe there’s already an official WebAssembly backend in the works, and of course the remaining old backends are being rewritten to the new frontend.

As far as other backends go, a .NET target has been a popular request. There was some talk about the community working on (and funding?) a .NET backend. I’m hoping it will be possible for the community to adopt new backends for Kotlin independent of the officially supported targets.

Python would be nice. If the community is able to make some headway on it first, I suspect it would really strengthen the odds.

3 Likes

Do you have any insight into when the overhaul will be done, and why it is actually needed? I thought about putting together a Python interop PoC as soon as IR can be emitted for common Kotlin (IR → Python, to be used from Python-based entry points), and AFAIU it’s already possible, given that some backends already use it experimentally.

I tried analyzing the code base, looking at how it’s done for JS, but I feel a bit overwhelmed :) If you have some docs on the internal architecture of the compiler, or anything else that would be helpful, please share. I wish creating a new backend was as simple as implementing a well-defined compiler API, or at least following well-defined steps described somewhere. Well-documented IR would be the meat of that API, I presume.

Regarding .NET, I thought that it got abandoned pretty quickly, looking at https://youtrack.jetbrains.com/issue/KT-1287. But there were indeed several topics about it on this forum.

1 Like

Yeah, I bet it would be possible to use the current Kotlin IR and experiment with Python. I have no clue about documentation though - it’s out of my purview.

By all means, I’d love to see community efforts. The reason I said I think it’ll have to wait is that I was assuming the Kotlin team would already be tied up with the other backends. Also I thought the IR was still experimental. If it’s community-driven and people are okay with building on an experimental state then all bets are off.

1 Like

I took a glance at how the IR → JS translation is done. It’s actually not that hard to grasp. Here are my notes for someone who wants to tackle this (might be me one day :) ):

  • the most interesting Gradle module: compiler/ir/backend.js (to see its logic, jump straight to compiler/ir/backend.js/src/org/jetbrains/kotlin/ir/backend/js)
  • the top-level file seems to be compiler.kt, with two functions: compile returning CompilerResult that contains val jsCode: String?, and generateJsCode. IR is the input, JS code is the output
  • going deeper, see transformers/irToJs/IrModuleToJsTransformer.kt and other ...ToJsTransformer.kt files. They seem to be the places which perform the actual mapping from IR entities to JavaScript abstract syntax tree (org.jetbrains.kotlin.js.backend.ast.* entities)
  • the mapping of JS AST to the actual JS code is common for both the old and the new IR backend (see js/js.ast Gradle module). It happens in JsProgram’s toString(), inherited from AbstractNode. The actual implementation sits in JsToStringGenerationVisitor
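The layering described in these notes (an AST made of plain data objects, plus a separate step that prints it as text) can be sketched roughly as follows. This is a toy illustration in Python with hypothetical node classes (StringLiteral, Call, FunctionDef); none of these names come from the Kotlin compiler, and a real backend would cover far more node types.

```python
# Toy sketch: hypothetical node classes, NOT the Kotlin compiler's API.
from dataclasses import dataclass
from typing import List


@dataclass
class StringLiteral:
    value: str


@dataclass
class Call:
    function_name: str
    arguments: List[StringLiteral]


@dataclass
class FunctionDef:
    name: str
    body: List[Call]


def render_expression(node) -> str:
    """Turn an expression node into Python source text."""
    if isinstance(node, StringLiteral):
        return repr(node.value)
    if isinstance(node, Call):
        rendered_args = ", ".join(render_expression(arg) for arg in node.arguments)
        return f"{node.function_name}({rendered_args})"
    raise TypeError(f"Unsupported node: {node!r}")


def render_function(node: FunctionDef, indent: str = "    ") -> str:
    """Turn a function node into Python source text, one statement per line."""
    header = f"def {node.name}():"
    # An empty body is not valid Python, so fall back to 'pass'.
    body = [indent + render_expression(stmt) for stmt in node.body] or [indent + "pass"]
    return "\n".join([header] + body) + "\n"


# The AST is built by one layer (in the compiler: from IR)...
tree = FunctionDef("pythonTest", [Call("print", [StringLiteral("Hello world")])])
# ...and printed by another, independent layer.
source = render_function(tree)
print(source)
```

Keeping the tree and the printer separate is what makes it possible to swap only the printing layer, or only the tree-building layer, when targeting a different language.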

Next steps: compile the compiler/ir/backend.js module (after a small modification, to be sure that the modified version is used instead of the prod Kotlin compiler) and use it with some example Kotlin code. Once it works, we can gradually experiment with making adjustments towards Python. Maybe even the “IR → JS AST” layer can stay as is for now (I see some similarities between JS and Python and we could take advantage of them), and only the “JS AST → JS code” layer needs to be adjusted to generate Python.

1 Like

Cool stuff! If you get some momentum or a hello-world for Python, be sure to share on GitHub. I bet a lot of people would be interested in playing around with any progress, or in learning from it for other compiler customizations.

Sure, experimenting here, see latest commits: Commits · krzema12/kotlin-python · GitHub

For

fun pythonTest() {
    println("Hello world")
}

and a command dist/kotlinc/bin/kotlinc-js -output kotlin-python/out.js kotlin-python/python.kt (uses the classic backend, not IR) so far I got something that still isn’t valid Python (not much missing, though):

def (_, Kotlin) :
   println = Kotlin.kotlin.io.println_s8jyv4$
  def pythonTest() :
    println('Hello world')
  
  _.pythonTest = pythonTest
  Kotlin.defineModule('out', _)
  return _

but it’s encouraging to see some progress, at least in terms of working with the Kotlin compiler code base :) And to see that creating a custom backend is not rocket science.

This little experiment leads me to the conclusion that it would be better to implement Python’s AST from the ground up, to avoid fighting with some of JS’s peculiarities. Adjusting the JS parts to work like Python is IMO not the way to go long-term. I’ll push this experiment just a bit further and then think about how to implement the Python backend the proper way.

3 Likes

I decided to start with implementing data structures to express the Python AST. From this point, both the integration with Kotlin’s IR and the generation of Python code can proceed at their own pace. I won’t spam here too much - if anyone’s interested in the progress, see Commits · krzema12/kotlin-python · GitHub, especially the dedicated README with progress. I’ll describe my approach below in case someone comes up with a better idea - any feedback or help is appreciated.

The starting point is Python’s grammar from the CPython project, described using Zephyr ASDL. I want to write a tool that parses ASDL and generates Kotlin data classes, sealed classes and whatever else is needed. I googled for an existing ASDL → Kotlin generator and didn’t find anything. With such a tool, producing the code that expresses the AST becomes data-driven instead of being maintained by hand. I won’t focus on any particular Python version; I want it to be Python version-agnostic. I’ll test the generator on CPython’s master branch (now 3.9) and some earlier 3.x versions. This milestone is complete once we can take any Python code and express it using Kotlin entities.
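To make the data-driven idea a bit more concrete, here is a minimal sketch of what such a generator could do for a single, one-line ASDL sum type. It is written in Python only for brevity; the function names (kotlin_type, generate_sealed_class) and the shape of the emitted Kotlin are assumptions for illustration, not the tool from the repository. Real ASDL also has multi-line rules, product types and attributes, which this sketch ignores.

```python
import re


def kotlin_type(asdl_type: str) -> str:
    """Map an ASDL field type to a Kotlin type: 'x?' is nullable, 'x*' is a list."""
    base = asdl_type.rstrip("?*")
    name = base[0].upper() + base[1:]
    if asdl_type.endswith("*"):
        return f"List<{name}>"
    if asdl_type.endswith("?"):
        return f"{name}?"
    return name


def generate_sealed_class(rule: str) -> str:
    """Emit Kotlin declarations for a one-line rule like 'stmt = A(expr x) | B'."""
    sum_name, alternatives = (part.strip() for part in rule.split("=", 1))
    parent = sum_name[0].upper() + sum_name[1:]
    lines = [f"sealed class {parent}"]
    for alternative in alternatives.split("|"):
        # Constructor name, optionally followed by a parenthesised field list.
        match = re.fullmatch(r"\s*(\w+)\s*(?:\((.*)\))?\s*", alternative)
        if match is None:
            raise ValueError(f"Cannot parse alternative: {alternative!r}")
        constructor, fields = match.group(1), match.group(2)
        if not fields:
            # No fields: a singleton object is enough.
            lines.append(f"object {constructor} : {parent}()")
            continue
        params = []
        for field in fields.split(","):
            field_type, field_name = field.split()
            params.append(f"val {field_name}: {kotlin_type(field_type)}")
        lines.append(f"data class {constructor}({', '.join(params)}) : {parent}()")
    return "\n".join(lines)


# A simplified rule in the style of CPython's grammar file.
rule = "stmt = Return(expr? value) | Delete(expr* targets) | Pass"
generated = generate_sealed_class(rule)
print(generated)
```

For the rule above this prints a sealed class Stmt with a data class per constructor that has fields and an object for the field-less Pass.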

4 Likes

That’s very interesting, is there a way to get notified about your progress? I’m following releases on your repo, but that’s probably not the solution. Maybe you could create a channel on Slack?

I’m not planning to create GitHub releases because I create multiple little commits. I thought that if you watch a given repo, you can choose to get notified about all changes, but I’m not sure whether commits made on a non-main branch emit notifications.

I promise to post updates in this thread once I reach major milestones. I’ll take a look later at creating a new Slack channel in Kotlin’s space; it may indeed be a nice place to have side conversations.

I created a #python channel in kotlinlang.slack.com.

2 Likes

It’s been a year since the project’s inception, so I thought I’d give you a short update on how it’s going.

Two contributors, JetBrains’ support

First and foremost, I’m happy to share that Sergei “SerVB” Bulgakov joined the project! He works for JetBrains on the Projector project, and he decided to devote 20% of his time to Kotlin/Python. Let me put it straight: if not for Sergei, the project would be either dead or at most 20% of what it is now. Every Friday he pushes stuff forward, step by step. Because Sergei works for JetBrains, it’s also easier for him to get in touch with the Kotlin team. In fact, the Kotlin team has been helping us by advising which path to take during implementation, hinting at how stuff works in other backends, and so on. You can witness this cooperation on the #python-contributors Slack channel.

Progress

Does it work well enough to be usable? Well, we’re not there yet. However, the progress we’re making is systematic and looks promising. Our main metric of how far along we are is the number of passing platform-agnostic box tests. Box tests (short for black-box tests) are simply end-to-end tests of the compiler: each takes a sample piece of Kotlin code that does something and is expected to return "OK". They are a great help and allow us to develop the new backend in a somewhat test-driven way. You can see example box tests in Kotlin’s main repository.

Here’s a visualization made from our project’s git history, as of November 2021:

[Plot: box tests history]

To see the current plot, click here.

Talk is cheap, show me the code

Basic language features are implemented, but the standard library is still almost unusable.

Here’s a test that shows that integers can be compiled and work the same as in Kotlin. It also shows basic stuff like calling functions, recursion, when, simple boolean logic, and more.

Kotlin code:

fun isPowerOfTwo(n: Short): Boolean = (n.toInt() and (n - 1)) == 0

fun factorial(n: Int): Long = when (n <= 1) {
    true -> 1
    false -> n * factorial(n - 1)
}

fun numberOfCombinations(n: Int, k: Int): Long = factorial(n) / (factorial(k) * factorial(n - k))

fun sumOverflowDemo(a: Int, b: Int): Int = a + b

compiled to Python (omitting standard library):

# ... 13300 lines of Kotlin stdlib Python code...

def isPowerOfTwo(n):
    return n & (n - 1).__add__(0x8000_0000).__and__(0xffff_ffff).__sub__(0x8000_0000) == 0

def factorial(n):
    tmp0_subject = n <= 1
    if tmp0_subject == True:
        tmp = 1
    elif tmp0_subject == False:
        tmp = n * factorial((n - 1).__add__(0x8000_0000).__and__(0xffff_ffff).__sub__(0x8000_0000))
    else:
        noWhenBranchMatchedException()
    
    return tmp

def numberOfCombinations(n, k):
    return (factorial(n) // (factorial(k) * factorial((n - k).__add__(0x8000_0000).__and__(0xffff_ffff).__sub__(0x8000_0000))).__add__(0x8000_0000_0000_0000).__and__(0xffff_ffff_ffff_ffff).__sub__(0x8000_0000_0000_0000)).__add__(0x8000_0000_0000_0000).__and__(0xffff_ffff_ffff_ffff).__sub__(0x8000_0000_0000_0000)

def sumOverflowDemo(a, b):
    return (a + b).__add__(0x8000_0000).__and__(0xffff_ffff).__sub__(0x8000_0000)

Python consumer:

from compiled import isPowerOfTwo, factorial, numberOfCombinations, sumOverflowDemo

print(isPowerOfTwo(32))
print(isPowerOfTwo(33))

print(factorial(5))

print(numberOfCombinations(4, 3))

print(sumOverflowDemo(2 ** 31 - 1, 2 ** 31 - 10))

Output:

True
False
120
4
-11
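The repeated `.__add__(0x8000_0000).__and__(0xffff_ffff).__sub__(0x8000_0000)` chain in the compiled code is what makes Python’s unbounded integers wrap around like Kotlin’s fixed-width Int (and the 64-bit variant does the same for Long). Here is a small standalone sketch of that arithmetic; the helper name wrap_to_signed is made up for this example and does not appear in the generated code.

```python
def wrap_to_signed(value: int, bits: int) -> int:
    """Wrap an arbitrary Python int into the signed two's-complement range.

    Shift the signed range up to start at zero, keep only the low `bits` bits,
    then shift back down. For bits=32 this is the same add/and/sub sequence
    with 0x8000_0000 and 0xffff_ffff as in the compiled code above.
    """
    half = 1 << (bits - 1)   # 0x8000_0000 for 32 bits
    mask = (1 << bits) - 1   # 0xffff_ffff for 32 bits
    return ((value + half) & mask) - half


INT_MAX = 2 ** 31 - 1

# Values already in range are unchanged.
print(wrap_to_signed(5, 32))
# One past the maximum wraps to the minimum.
print(wrap_to_signed(INT_MAX + 1, 32))
# The sumOverflowDemo inputs from the consumer above.
print(wrap_to_signed((2 ** 31 - 1) + (2 ** 31 - 10), 32))
```

The last call reproduces the -11 from the output above: the exact sum is 2^32 - 11, and wrapping it into the 32-bit signed range subtracts 2^32.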

To give you a clue about what kind of simple things don’t work yet and why, compiling and executing this Kotlin code:

fun test(): List<Int> {
    return listOf(1, 2, 3)
}

results in a mysterious

NameError: name 'kotlin_Any_' is not defined

It’s because listOf is a part of the standard library, where class hierarchies come into play, and apparently something about the base class Any is not yet implemented. It’s a short, basic function, but it actually uses complex machinery under the hood.

See more end-to-end tests and box tests report to learn more.

If you’re curious what the standard library compiled to Python looks like so far, see here.

Development workflow

Each commit is checked with CI and tested against the box tests. You may ask: how come the tests come up green if the backend is not yet ready? Well, we allow the tests to fail and still have a green overall CI result, but in return we maintain a file with a list of failed tests, together with a file with more details about test results. It does add some complexity to the PRs, but thanks to this we can manage regressions. A typical contribution that covers new features isn’t checked with “do the tests still pass” or “do the new tests pass”, because we don’t add new box tests. Instead, the question is: “are there some box tests that failed before this change and don’t fail after it”.

To make this whole ceremony of managing extra files painless, we have a Kotlin script that generates everything for us, including the above plot.

See python/README.md#development for more details.
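The core of this workflow is a comparison of two lists of failing tests. A minimal sketch of that comparison in Python (the project’s real script is written in Kotlin; the function name compare_failures and the test names below are invented for illustration):

```python
from typing import Dict, List, Set


def compare_failures(before: Set[str], after: Set[str]) -> Dict[str, List[str]]:
    """Classify box tests by how their status changed across one contribution."""
    return {
        # Failed before the change, pass now: the progress we want to see.
        "fixed": sorted(before - after),
        # Passed before the change, fail now: regressions to investigate.
        "regressed": sorted(after - before),
        # Failed before and still fail: expected while the backend is incomplete.
        "still_failing": sorted(before & after),
    }


# Hypothetical test names, standing in for the tracked list of failed tests.
failed_before = {"arrays/forEach", "classes/inheritance", "strings/templates"}
failed_after = {"classes/inheritance", "ranges/step"}

report = compare_failures(failed_before, failed_after)
for category, tests in report.items():
    print(f"{category}: {tests}")
```

With the list of failed tests kept under version control, a non-empty "regressed" category shows up directly in the PR diff.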

Can I try it out?

Sure! It’s enough to:

  • clone the repo
  • run ./gradlew dist
  • compile your Kotlin file to Python: dist/kotlinc/bin/kotlinc-py -libraries dist/kotlinc/lib/kotlin-stdlib-js.jar -Xir-produce-js -output compiler_output.py your_kotlin_file.kt
  • examine compiler_output.py and maybe consume it from some Python module

Please be prepared to be disappointed by the lack of basic features. It’s still too early to even experiment with it in pet projects, but it’s fine to check out the project’s progress on your own.

What’s next

Well, we are going to push it further bit by bit, hoping that we’ll keep the pace. Any help is greatly appreciated - the dev workflow described above makes it pretty simple to contribute. The goal for now is providing unidirectional Kotlin to Python compilation. You will be able to write simple console apps in platform-agnostic Kotlin running with the Python runtime, and simple libraries consumable from Python.

Concrete next steps:

  • create a Python-specific implementation of Kotlin’s standard library. We now piggy-back on the JavaScript one and it is starting to be painful
  • find a simple, real-life platform-agnostic Kotlin library and work towards compiling it for Python
  • if we start dreaming: work with the authors of lets-plot. Currently, another solution to achieve “write in Kotlin, run with Python” is in place

Any sign of your support and/or interest in this project is a great motivator for us :)

In case of any questions, feel free to reach us at #python or #python-contributors, depending on whether you’d like to use or contribute to the project.

14 Likes

Also, I would like to note that Python is used in the most popular home automation system, which has a very large community and is one of the largest open-source Python projects. It would be very cool if it were possible to write code for smart home devices in Kotlin.

1 Like

Hi Krzema, how does it compare to Graal Polyglot? I found a similar discussion here: Kotlin interop with Python (via C)

I’m not familiar enough with GraalVM to say for sure, but anyway: our project doesn’t involve any extra runtimes like GraalVM. You just have Kotlin code, transpile it to Python and that’s it - you can run it on any OS that has Python 3 (roughly - version support is not thoroughly tested; for now we target 3.8+). We’re also working on supporting MicroPython, which has its own peculiarities.

In some cases installing an extra runtime is either not possible or would mean extra effort for the script’s user. For example, when writing application plugins which are tied to a built-in Python interpreter.

I’m not saying that GraalVM is always a bad option - it may work fine for certain setups. However, we chose to go down a different path here, with all its pros and cons.