Hi, I was wondering whether Kotlin's immutable collections are thread-safe, like Guava's. From what I can see in the source code it doesn't seem so, but I might be wrong. :)
Kotlin doesn't have truly immutable collections, that is, collections whose immutability is guaranteed by their contract. List<T> and Collection<T> are read-only collections. This means they cannot be modified through that interface, but an actual instance of that type can be a mutable collection and can be modified externally.
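To make the read-only versus immutable distinction concrete, here is a minimal sketch. The names backing, readOnly and readOnlyViewDemo are made up for illustration; the only point is that a List<Int> reference can alias a list that other code still mutates.

```kotlin
// Hypothetical helper: returns a read-only reference after the backing list changed.
fun readOnlyViewDemo(): List<Int> {
    val backing: MutableList<Int> = mutableListOf(1, 2, 3)
    val readOnly: List<Int> = backing // same instance, narrower interface

    // readOnly.add(4)  // would not compile: List<T> exposes no mutators
    backing.add(4)       // fine through the mutable reference

    return readOnly
}

fun main() {
    println(readOnlyViewDemo()) // prints [1, 2, 3, 4]
}
```

The compiler stops mutation through readOnly, but it cannot stop whoever still holds backing.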
Standard library functions such as listOf(), or extensions such as map() and filter(), can return different implementations of List: an immutable EmptyList, an immutable SingletonList, or a mutable ArrayList. However, this is an implementation detail and one should not count on it.
As long as their result is cast neither to MutableList nor to java.util.List, it is safe to use it from multiple threads.
Okay, what I get is that they are effectively immutable and not truly immutable (as long as you "follow the rules" and don't cast to a mutable type or use reflection in order to modify them), right? But then, it should only be safe to use them from multiple threads as long as you publish them in a safe manner (e.g. assigning them to a val field and then reading the reference through that field from a different thread). Anyway, that should be good enough for my current requirements. :)
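A rough sketch of what publishing through a val field could look like. Catalog and readFromOtherThread are hypothetical names, and the sketch assumes the JVM backend, where a val with a backing field compiles to a final field.

```kotlin
import kotlin.concurrent.thread

// Hypothetical holder: `items` is a final field on the JVM.
class Catalog(names: List<String>) {
    // Copy the input so no caller keeps a mutable reference to our list.
    val items: List<String> = names.toList()
}

// Reads the list from a second thread and reports how many elements it saw.
fun readFromOtherThread(catalog: Catalog): Int {
    var seen = -1
    val reader = thread { seen = catalog.items.size }
    reader.join() // join() makes the reader's write to `seen` visible here
    return seen
}

fun main() {
    val source = mutableListOf("a", "b")
    val catalog = Catalog(source)
    source.add("c") // does not affect the copy held by the catalog
    println(readFromOtherThread(catalog)) // prints 2
}
```

Note that Thread.start() and join() already establish happens-before edges in this small demo. The final-field guarantee is what would matter if the Catalog reference reached another thread through a plain, unsynchronized field.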
Safe publication is just a general concurrency issue one has to deal with. It applies to any type, not only to collection classes. A purely immutable collection wouldn't solve any unsafe publication issues in this case. For example, there is no difference in thread safety between:
var a = effectivelyImmutable()
var b = purelyImmutable()
What are you talking about? Truly immutable objects don't need to be published in a safe manner; they are thread-safe by nature (such as String). They normally use final fields (whose equivalent is val in Kotlin?) to enforce that.
PS: If you are talking about a truly immutable collection not necessarily being deeply immutable, well, that’s a different story.
Interesting that you take String as an example of a truly immutable object, as it happens to be an effectively immutable one. The field holding String's internal character array is final, but the array itself is mutable, so its contents can be modified with reflection. But those are just nitty-gritty details ;-)
Safe publication covers the topic of writing (and reading) a value to a field. It doesn’t really matter what the type of the field is, or what its internal structure is, for things to go haywire. Let me try to explain with some code:
Let’s say we have some field, which is not guarded by locks or volatile read/write semantics:
var field: Any? = null
If some threadA, running on CPU core A, reads this field, it stores the value in its local cache. Next we update this field from threadB, running on CPU core B, with a value:
field = whatever()
At this point we can hit some publication issues because we haven’t enforced any happens-before rules, namely:
- the value written by threadB might not have been written to main memory, and therefore threadA will not see it
- reading the field from threadA might still see null, as there is no need for it to check with main memory (because it has read the field before)
There are some other cases concerning safe publication (which I won’t dive into now), namely:
- instruction reordering around constructors (another thread can see a partially constructed object)
- two-step (non-atomic) writes of 64-bit values such as long on 32-bit machines.
This is what safe publication is about: knowing for sure that you “see” the proper value at any given time, and are not looking at stale or uninitialized values. Whether the object itself is mutable, immutable or effectively immutable doesn’t really matter.
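One way to get the missing happens-before edge for such a field is to mark it volatile. The following is only a sketch under that assumption; Mailbox, awaitValue and publishAndRead are invented names, and a lock or a final field would be equally valid choices.

```kotlin
import kotlin.concurrent.thread

class Mailbox {
    // @Volatile: a write to this field happens-before every later read of it.
    @Volatile
    var field: List<Int>? = null
}

// Spins until some other thread has published a non-null value.
fun awaitValue(box: Mailbox): List<Int> {
    while (true) {
        val current = box.field // volatile read, never served from a stale copy
        if (current != null) return current
    }
}

// Publishes a list from the calling thread and reads it from a second thread.
fun publishAndRead(): List<Int>? {
    val box = Mailbox()
    var seen: List<Int>? = null
    val reader = thread { seen = awaitValue(box) }
    box.field = listOf(1, 2, 3) // volatile write publishes the list
    reader.join()
    return seen
}

fun main() {
    println(publishAndRead()) // prints [1, 2, 3]
}
```

Without @Volatile the spin loop in awaitValue would be allowed to run forever, which is exactly the stale-read case described above.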
Let's see, if you are talking about using reflection, then you can change pretty much anything in your program (even final fields), so I don't think that's a valid point for describing instances of the String class as effectively immutable.
With regard to safe publication, well, there are other ways to achieve it, but here I’m talking about assigning a reference to a final field and not changing it after that field is frozen (which happens right after the constructor of the containing object exits). You can read about the happens-before guarantees the Java specification gives you regarding that here: https://docs.oracle.com/javase/specs/jls/se7/html/jls-17.html#jls-17.5 (specifically read 17.5.1)
So in the end, it does matter whether you access an object from a different thread than the one that created it through a final field, or whether you just pass it around in an unsafe manner (which, again, doesn’t matter for truly immutable objects because they are inherently thread-safe).
update: I wrote some incorrect stuff here. Concurrency, it always bites you in the ass.
Anyway, my main point was and still is: there is no difference in publication safety between purely immutable and effectively immutable in these collection classes, as they are created through builder methods and those methods don’t have any shared state. So in this case it really doesn’t matter.
What I haven't emphasized, by the way, is that I really like that you bring up these issues. In the end I would also like to see truly immutable objects. Code should, as much as possible, be thread-safe. So it's great that you bring up these points.
I don’t know what the best course for Kotlin would be, or what the guys and girls at JetBrains have in mind. Several thoughts have crossed my mind that influence my opinion:
- Interoperability with Java is still a very important factor. Kotlin should not introduce an incompatible collections library.
- The standard library should remain small, as overly large libraries have a negative impact on startup times.
- Immutability matters, a lot. Therefore immutable objects should really be immutable.
Well, they are not created through builder methods (or at least not what I would call a builder method), they are simply instantiated and the elements added afterwards (here is the code https://github.com/JetBrains/kotlin/blob/master/libraries/stdlib/src/kotlin/collections/Maps.kt#L34); not that it matters, though, because they should not change after they are returned (well, in this case you basically "need to promise" you won't hehe).
In any case, safe publication always matters if you need to share an effectively immutable object between threads and don’t want to rely on data races. By safely publishing an effectively immutable object you make sure other threads don’t see partial constructions or miss some writes due to optimizations and reordering of instructions (how your program looks at runtime as opposed to what you see in the source code). On the other hand, with truly immutable objects you don’t have this issue.
It could be that Java adds 'fences' to the JDK at some point (some fence methods are already on Java 8's Unsafe). Refer to: http://gee.cs.oswego.edu/dl/jsr166/dist/docs/java/util/concurrent/atomic/Fences.html and http://stackoverflow.com/questions/23603304/java-8-unsafe-xxxfence-instructions
My interpretation of fences for this case is that we could build something without final fields and explicitly use fences to publish that object with the same memory model guarantees that we get when using final fields. My feeling/guess is that we already implicitly get that safe publication behaviour in this builder-type code without having to call a fence explicitly. In other words, we get the correct behaviour in terms of memory publishing, but it is not a strict guarantee, just a JVM-specific implementation detail.
If you are keen, you can look into joining https://mailman.cs.umd.edu/mailman/listinfo/javamemorymodel-discussion … that mailing list has all the Java memory model gurus.
My thought on Kotlin immutable collections is that there is nothing more for Kotlin to do beyond what it does now (Kotlin compiler preventing mutation on collections that are deemed immutable). That is, the Kotlin compiler is preventing at compile time what the JDK immutable collections do at runtime (and barf with an exception at runtime) - that is, there is no practical benefit to wrapping the returned LinkedHashMap via Collections.unmodifiableMap().
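A small sketch of the compile-time versus runtime difference described here. runtimeWrapperRejectsWrite is a made-up helper name; Collections.unmodifiableMap and linkedMapOf are the real JDK and Kotlin standard library calls.

```kotlin
import java.util.Collections

// Hypothetical helper: true if the JDK wrapper rejected the write at runtime.
fun runtimeWrapperRejectsWrite(): Boolean {
    val wrapped: MutableMap<String, Int> =
        Collections.unmodifiableMap(linkedMapOf("a" to 1))
    return try {
        wrapped["b"] = 2 // compiles, because the static type is mutable
        false
    } catch (e: UnsupportedOperationException) {
        true             // the wrapper throws on any mutation attempt
    }
}

fun main() {
    val readOnly: Map<String, Int> = linkedMapOf("a" to 1)
    // readOnly["b"] = 2 // would not compile: Map<K, V> has no set/put
    println(readOnly)                     // prints {a=1}
    println(runtimeWrapperRejectsWrite()) // prints true
}
```

The Kotlin Map type rejects the write before the program runs, while the JDK wrapper only objects once the call is made.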
So the ‘gap’ or potential ‘issue’ that is left is the case where a single collection instance is viewed in Kotlin as both a mutable collection and an immutable collection. This allows some code to mutate the collection at runtime, which could cause other threads reading/iterating the same collection instance to barf etc. if the collection is not one of the concurrent ones like ConcurrentHashMap. That is, if a collection instance is always viewed in Kotlin as immutable, the compiler prevents any code from mutating it, it is ‘effectively immutable’, and that is good enough in practice (there is actually no practical requirement to go further and guarantee absolute immutability with a fully final/immutable/safely published data structure, or by wrapping it as an unmodifiable collection).
It can also be the case that this ‘gap’ or ‘issue’ is actually what you want to do. For example, you have a ConcurrentHashMap and some code is only allowed to read it and some code is allowed to mutate it and that is exactly what is desired and also thread safe.
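The ConcurrentHashMap case might look roughly like this. Registry, view, register and fillWhileReading are hypothetical names for illustration; the sketch relies on the documented weakly consistent iterators of ConcurrentHashMap.

```kotlin
import java.util.concurrent.ConcurrentHashMap
import kotlin.concurrent.thread

class Registry {
    private val entries = ConcurrentHashMap<String, Int>()

    // Readers get the same instance, typed as a read-only Map.
    val view: Map<String, Int> get() = entries

    // Only code that holds the Registry itself can mutate the map.
    fun register(name: String, value: Int) {
        entries[name] = value
    }
}

// Writes `count` entries on one thread while the caller iterates the view.
// Returns the number of entries seen mid-write and the final size.
fun fillWhileReading(count: Int): Pair<Int, Int> {
    val registry = Registry()
    val writer = thread {
        for (i in 0 until count) registry.register("key$i", i)
    }
    // Iterating during writes is safe here: the iterator is weakly
    // consistent and never throws ConcurrentModificationException.
    var seenDuringWrites = 0
    for (entry in registry.view) seenDuringWrites += 1
    writer.join()
    return Pair(seenDuringWrites, registry.view.size)
}

fun main() {
    val (seen, finalSize) = fillWhileReading(1000)
    println("saw $seen entries mid-write, $finalSize at the end")
}
```

With a plain HashMap behind the same view, the iteration could fail or observe a corrupted state, which is the problematic side of the gap described above.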
That’s how I see it anyway … I don’t expect to see any changes in this area.
Some libraries (Boon?) like to access these internals in dangerous ways, so you might be well behaved but not everyone is being well behaved.
If someone makes the argument that String could be considered mutable if you use reflection, and someone else says, “No that is bad behavior and we don’t do that”,
then why can't you say the same for “ReadOnly collections, we never cast them to be mutable and therefore to us they are immutable”, which works on the same principle of “behaving”?
Another point: use Guava or something else that has immutable collections if you want them. There is no sense in the Kotlin team spending time on libraries that can be done independently.
Did you even bother to read what's been discussed here (I don't even think you read the comment you are directly replying to)? No one is making such assumptions ?:|
… and someone else says, “No that is bad behavior and we don’t do that”.
I’m glad no one has made such a statement here, that would be kind of silly.
Then why can't you say the same for “ReadOnly collections, we never cast them to be mutable and therefore to us they are immutable” which works on the same principle of “behaving”
I don’t know what “principle of behaving” you are talking about, but that’s not how Java works. In most cases you can only rely on the guarantees the Java specification provides (unless you are programming for a specific JVM/compiler/architecture you know the details of?, and even so…), which is part of what we were discussing in this thread.
I was making the point that immutability is no guarantee, because (not just through reflection) there are violations of it in various libraries in the Java world, including some popular ones. Hopefully none are overly stupid with what they do.
For the record, just like String, the Guava ImmutableList is backed by a standard mutable Java array. It is just encapsulated and made thread-safe by preventing modification of it. http://stackoverflow.com/questions/27476075/guava-immutablelist-what-exactly-is-backing-it
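The same encapsulation idea can be sketched in a few lines of Kotlin. FrozenIntList is a made-up class for illustration and is not how Guava implements ImmutableList; it only shows a mutable array hidden behind a val and a read-only interface.

```kotlin
// Hypothetical class: a mutable IntArray hidden behind a read-only List.
class FrozenIntList(source: List<Int>) : AbstractList<Int>() {
    // Private defensive copy: nothing outside this class can reach the array.
    private val elements: IntArray = source.toIntArray()

    override val size: Int
        get() = elements.size

    override fun get(index: Int): Int = elements[index]
}

fun main() {
    val source = mutableListOf(1, 2, 3)
    val frozen = FrozenIntList(source)
    source.add(4)   // later changes to the source are not visible
    println(frozen) // prints [1, 2, 3]
}
```

Because the array is copied in the constructor, held in a val and never exposed, the only remaining ways to change it are the ones discussed in this thread, such as reflection.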