Why kotlin does not provide LinkedList implementation?

This isn’t the question though. The standard library is supposed to provide interfaces and implementations that are useful in most if not all cases and not just in some special scenario. As explained here LinkedList is only useful in special cases and therefor shouldn’t be part of the standard library.

You realize that LinkedList in the general sense of the word does not refere to just the persistent version. The oposite is the case, if you speak about linked lists most people (especially a comunity that mostly uses the jvm) will think of a mutable list (just like the java implementation. Also your argument has nothing to do with LinkedList vs ArrayList and more with persistent vs mutable.
That said koltin already provides a special library for immutable data structures (see link above).

I’m not really sure what your benchmark is supposed to show. You say use use a different version of scan than the one you supply so I can’t really confirm your results or whether or not your implementation is efficent for array based lists. In either case I would think that this again is a problem of immutable vs mutable and therefor not really relevant here.

Can someone explain why LinkedLists are useless? And more interestingly, can someone explain why they are harmful?

The simple answer is that linked lists are always slower than array-based lists on modern hardware. With array-based lists, elements are stored adjacent to each other in memory. With linked lists, they are not. Adjacent elements in memory can be loaded together into the CPU cache. Elements of linked lists cannot. If data is not in the CPU cache, the CPU has to stop working and wait for the data to be loaded from RAM. The CPU can sit idle for hundreds of cycles waiting for RAM. RAM is painfully slow compared to CPU cache memory. So you lose all the theoretical performance gains of a linked list and then some.

A long time ago, when CPUs were less sophisticated, it’s true that linked lists could be more performant for some types of insertion and deletion operations. However, it has been decades since that has been true.

This topic comes up quite often because a lot of CS books still (incorrectly) teach that linked lists are faster for insertion and deletion, so if you Google “linked list vs array performance” you can find lots of benchmarks across lots of languages, and quite a few good explanations (most better than mine here) about why linked lists are slow.

To be thorough, I should note that there is a difference in the performance of adding to the END of a linked list vs. adding to the END of an array: Both operations are O(1) in amortized time, but adding to an array might occasionally take longer due to the resize operation. It will still be faster in practice in nearly all cases due to CPU cache, but I suppose that theoretically if you were working with lists of millions of elements and needed to add to the end in constant time, a linked list might be the right choice. However, when you go to read back the data from the list (or do anything, really, besides adding to the end) performance is going to be painfully slow. In practice, in 25+ years of software development, I have never seen a case where performance improved through use of a linked list, even for “append-only” uses.

So, in a nutshell—you should really never use linked lists.

5 Likes

This is not completely fair to the linked list. For one, even an array list is just an array of pointers, to iterate it you still have to dereference them. So the difference is between one and two levels of indirection. Regardless of cache friendliness, data-dependent loads resulting from pointer dereference are always more expensive than loads by dead reckoning (pointer + index * object_size), which the prefetcher can predict.

Secondly, objects aren’t typically scattered all over the heap. The most typical pattern is create object - add to list - create next object - add to list - …. Since the JVM always allocates memory sequentially from a large contiguous block, we end up with a tight memory layout. Array list is again at a slight advantage because it doesn’t allocate an extra object every time.

Consider that, if you expand the array by doubling it and finish adding when the array is exactly full, every element will be copied just once (in amortized terms). In the worst case, if you finish adding just after the expansion, there will be two copies per element. The cost of copying one element in a bulk copy operation is extremely low and gets lower as the array grows: I clocked System.arraycopy of Object[] at 1-2 billion elements per second so the cost of one copy is about 1 nanosecond. Therefore the insertion overhead of array list is less than 5 nanoseconds. The linked list must allocate and initialize a brand new object every time. This pattern doesn’t receive the throughput benefits of a bulk operation and its cost is probably comparable to the above, could be even worse.

3 Likes

True, but insertions and deletions don’t require dereferencing those pointers. With a LinkedList, they do. This is where I was talking about CPU cache making a difference.

Also true, but I think with the caveat that generally if you’re adding to a list, you’re often also allocating the objects that will be added, and those allocations are likely to be interspersed with the LinkedList allocations (and if they are large, will push the LinkedList allocations apart).

Great analysis!

I think the lesson here is: Theoretical performance of an algorithm on a whiteboard or a CS textbook is often quite different from practical performance in a particular environment (in this case JVM on a modern CPU).

I for one am quite happy with LinkedList being unavailable from Kotlin. It’s possible to come up with contrived examples where it’s performance is similar to ArrayList, but not really any practical case where it’s better than ArrayList. I’ve had several times in my career where I’ve been able to achieve significant performance improvements by refactoring uses of LinkedList to ArrayList in Java code. With Kotlin, ArrayList is essentially enforced from the start, and I think that’s a good thing.

2 Likes

Not in stdlib now, but candidate: https://github.com/Kotlin/kotlinx.collections.immutable/blob/master/core/commonMain/src/implementations/immutableList/PersistentVector.kt

Persistent vector made of a trie of leaf buffers entirely filled with [MAX_BUFFER_SIZE] elements and a tail having from 1 to [MAX_BUFFER_SIZE] elements.

Not only that, the way Java Iterators work means that actually moving to the location still requires pointer chasing. Of course it also doesn’t play nice with caches on modern(ish) CPUs

1 Like

My benchmark shows that if you need sublists that can be modified independently from the original (or the other way around), array lists are a very bad solution.
The other version uses a very standard implementation of an immutable persistent linked list (too long to show here). The problem with array lists is that they are extremely slow to duplicate. When a method receives a list as its argument, there is no guarantee that the list won’t be modified after being received. Making a defensive copy is very slow and not even a good protection unless it is protected against concurrent access during the copy. Immutable lists solve this problem. And concurrent access is not the only problem. Bugs may also occur when a sub list is modified long after having been created.
The problem in most discussions about linked list is that the comparison is made between linked lists and array lists in array lists use cases. I am certainly not saying that linked lists are better than array lists. Immutable array lists solve a problem that array lists can’t solve. I understand they can be dangerous, but not more than array lists. I have encountered much more bugs due to concurrent access to array lists or unaware access to modified lists than to wrong use of linked lists.
Some use cases requires immutable lists and immutable linked lists are often much more efficient than immutable array lists.
But the main problem is the attitude of those who want to ban linked lists, either because they don’t know how to use them or because they think others don’t know. If they don’t know, they should learn, and if they think others are bad programmers who would make bad use of them, I would suggest to remove all dangerous constructs, such as loops, mutable references, exceptions, and more. If I chose to never use these constructs for safety reasons, would I be entitled to ask that they be removed from Kotlin?

Again. The original question wasn’t talking about an immutable version of LinkedList, but a mutable implementation like in java. Your benchmark doesn’t help there, because you are using an immutable version, which solves a different problem.
Also as explained above, kotlin actually has immutable collections (see linke above). They are currently in an experimental state and available as a separate library. AFAIK there have been discussions about merging this with the standard library once they are finished, but there are also arguments for leaving this as a separate library.

I haven’t yet seen an argument for a mutable linked list and I’m sure that most people here know that immutable constructs have performance benefits. No one here is arguing agsinst immutable linked lists. And the argument is not to remove all dangerouse constructs, but those that have extremely limited usecases and can also easily be implemented outside of the standard library. Mutable linked lists are a good example of such a dangerous construct, that shouldn’t be part of stdlib.

1 Like

Afaik, Garbage collection uses circular doubly linked lists. Arrays would be a problem to get for due to limited contiguous memory for this amount of data and would require to copy big amounts of data.
Personally, I would favor for (Single+Doubly) Linked Lists, they are traditional and have use cases in certain aspects.
Further, user defined extensions both of them can reuse the implementation in the standard library.

Most users won’t use LinkedLists anyway because listOf calls use the standard implementation, which are ArrayLists.

1 Like

Very interesting discussion!

So, my understanding is that:

  • All actual implementations are still mutable behind the scene for now.
  • Immutable is just a wrapper of default mutable implementations.
  • We want to have real immutable implementations for useful data structures in the future. This is still in progress here
1 Like

One of the common use cases is to implement ordered HashMap that preserves insertion order of the keys with doubly-linked list LinkedHashMap - Kotlin Programming Language

mutableMapOf() already does this.

2 Likes

Linked lists are not “useless” as a concept in general. In specific use-cases, they provide better theoretical time complexities:

  • Insertion in the middle is O(1)…
    • …but with Java List interface, it’s practically impossible to actually utilize this potential. You’d need a C+±like API with C++ -like iterators. Java List doesn’t offer that.
  • Deletion in the middle is O(1)
    • …but, again, it’s tricky to utilize this potential in Java. The iterator-based deletions are the only option I can think of. Otherwise, you’d need a C++ -like API, again.
  • Appending is non-amortized O(1)
    • Array-based lists offer amortized O(1) appends, meaning that some appends will need C * N steps because of the buffer reallocation. In some rare but possible situations, this might be unacceptable. You might need guaranteed O(1).
    • …but it can be argued that in these rare contexts the uncertainties related to garbage collection already disqualify technologies like Java / Kotlin.

As always, theoretical time complexities require a sufficiently large N (here, list size) to be noticeable. For smaller Ns, array-based lists may take less real time to execute even with “worse” big-O complexities.

So, summing up, linked lists aren’t useless in general. With a proper API, there are use cases where they can save your day. This is still a minority of all use cases.

Without an API that can utilize the potential of linked lists, linked lists are indeed close to useless. Java’s LinkedList is an API that can’t utilize the potential of linked lists.