Very poor performance of immutable maps

I think that supporting persistent data structures ala Clojure as the default implementation that the kotlin stdlib resorts to for Map and List is a very reasonable request.
It makes functional programming in kotlin feasible for much bigger problems, out of the box without resorting to external libraries.

5 Likes

Adding an item to a map is expected to be an O(1) operation. Kotlin has an operation which allows you to add an item to a read-only map; therefore it’s reasonable to expect that this operation is O(1). You can only implement this via an immutable implementation, so the fact that Kotlin has this operation at all leads you to expect that there’s an immutable implementation. The documentation uses the word ‘immutable’ in several places, which reinforces this assumption.

The fact that Kotlin’s implementation is not, in fact, immutable, and the addition operation is O(n), is very weird. The fact that it’s implemented as an extension function is the reason why it’s weird, but that’s just an implementation detail — it’s not relevant to the shape of the API. The fact is that the API contains an operation which users will reasonably expect to be O(1), but which is actually O(n).

This should not have come as a surprise to me. This sort of thing needs to be clearly and explicitly called out in the documentation, so that doesn’t come as a surprise to anybody else.

1 Like

The term immutable doesn’t have any performance implication. So the documentation is not wrong.
The argument is that Kotlin could do much better by using state of the art persistent data structures when dealing with immutable Maps/Lists

4 Likes

I’ve never seen that use of the word ‘persistent’ before — in all the literature I’ve seen, ‘immutable’ has been used instead (see, e.g. ImmutableCollectionsExplained · google/guava Wiki · GitHub). ‘Persistent’, however, has always meant that the data was stored on disk.

For everything I’ve written above, please consider me to have said ‘persistent’ when I said ‘immutable’.

Yeah, it’s confusing, though it’s how the CS literature calls them since '86

Yeah, I think it depends on where you come from — I was at St.Andrews when they were doing work with the other kind of persistence (e.g. Orthogonal Persistent Object Systems). But persistent is obviously the term used here, so I’ll try to use that from now on.

By the way, it seems like there is some effort underway in that direction GitHub - Kotlin/kotlinx.collections.immutable: Immutable persistent collections for Kotlin

2 Likes

I can accept that the documentation was not sufficient for you @david.given. It was sufficient for me. It is arguable how confusing it is for the general public. I’d assume not as much as you are trying to suggest.

It would be good to maintain the positive vibe that the Kotlin community has always had. If @david.given is taking the trouble to comment here, there is probably an opportunity for improvement.

Also, one thing that has benefitted US-based tech companies in becoming world-famous is that customer service is generally good in the US and companies try to be attentive to users/customers rather than just saying “you’re wrong”.

1 Like

David said that “Adding an item to a map is expected to be an O(1) operation.”, but I don’t think that’s a reasonable expectation. In fact, I don’t think you should make any assumptions about performance in general.

For example, in the simpler case of lists, appending an item to a list could be O(1) if the list were doubly-linked; amortised O(1) for an ArrayList, or O(N) for a singly-linked or FP-style list. (Those last would be much faster prepending, of course.) And for maps, there could be even more variation.

So it all depends on the particular implementation. And for that, you need to check the documentation.

This is example is rather like the classic Java gotcha of appending to a string in a loop; that also involved an overload of a plus sign, and that also needed you to understand what the code was actually doing.

I agree, people with different backgrounds have different expectations. For example, for me it seems absolutely unreasonable to append elements to read-only (or even immutable) map. Kotlin developers in this case follow good old java tradition that if you need something special, you need to declare it explicitly and I am completely opposed to breaking this tradition in favor of implicit “universal” solutions.

As for @jpliska comment, it is not quite relevant since there were no “negative” comments, just the regular discussion, and people in this thread are not customer service. There was only one completely neutral comment from JetBrains team member. By the way, I do not know, what US have to do with that.

I agree with the idea that if you want something special you need to declare it. That said maybe it’s worth thinking about changing the operations on read only lists to use kotlins immutable lists once they are a stable feature and not just a prototype. Until then there is not much we can do, except maybe improve the documentation.

1 Like

I will assume that once immutable lists are stable, they will stay a separate kotlinx library. Like serialization, reflection or kotlinx-html.

Will it be feasible to change operations on read-only lists in such a case?

Documentation is not for folks whose understanding of the language is already deeper than what is described in the documentation.

Documentation is supposed to help people who do NOT have a “deeper understanding” come to have a deeper understanding.

Kotlin’s documentation could definitely be more clear and explicit on this point. If beginners (with tons of experience in other languages) are confused, the documentation should change.

1 Like

Have you read what I said about the documentation? Because I don’t think you have. I pointed out that

  1. people jump to conclusions about the language even though they are not experts
  2. I explained why those conclusions where wrong in this case and even linked some parts of the documentation to back this up.
  3. I also agreed with your point there and stated where I would improve the documentation.

I have not yet had the time to actually write a draft. But maybe you could be a bit more specific about which part of the documentation you would like to change. I don’t think adding a note to every function explaining how it works internally is the right way to go. I understand however that many people are confused with parts of kotlin because they make assumptions based on their wrong idea that some structures are immutable even though they are read only.

I think that we have a terminology confusion here. If you do not have functional language background, immutable and read-only are mostly the same. On the other hand people with some experience in functional programming suppose that immutable collections are supposed to have cheap operations to create new immutable collections with changed elements. I agree that documentation should cover this question not to mislead those people, but I also do not thing that some things should be that way or another just because someone assumes them so based on previous experience not relevant to the problem.

By the way, kotlin documentation is really great. One of the best (if not best) for all languages I’ve ever seen.

when adding an element to an immutable collection… You should be aware you are gonna clone it… so yes, performance are worse aggregating in immutable collection than in mutable collection.

the only simple way to do that, to use single link collection to add 1 element in head (I’m not even talking about changes…)… But then access won’t be O(1)… except if you index items… but this would be bad too… this would be terrible either in read or in write… So definitly no, they should not suppose this.

Complexity of ops in the doc would be certainly helpful, see for example how it’s done for C++:
https://en.cppreference.com/w/cpp/container/map/insert#Complexity

1 Like

The complexity depends on the implementation (like that one). It could not written in general case for the interface.

The doc in question mapOf) - is it for implementation or interface? There is a way to run it so I would tend to think it is for the implementation. BTW, complexity could be detected automatically.