Remove repeating values

samtsao · October 15, 2022, 5:02pm

Hi! My question is not about removing duplicates from a collection. It’s about removing duplicates only when they occur in series in a collection.

For example:
[1,2,2,3,2,1] → [1,2,3,2,1]
["A","A","B","A","C","C","A","C"] → ["A","B","A","C","A","C"]

I want to do this in a recursive / functional way and have implemented this as so:

fun removeRepeats(list: List<Any>): List<Any>{
    val last = if(list.size>2) removeRepeats(list.drop(1)) else list.takeLast(1)
    val out = if(last.first() == list.first()) last else listOf(list.take(1),last) .flatten()
    return out
}

I’m wondering if this was the best way to write this function recursively. I don’t know if the if statements are really necessary.

Thanks!

broot · October 15, 2022, 8:51pm

I’m not sure if doing this with recursion is a good idea. Please be aware Kotlin doesn’t support linked lists that you can split into head and tail. In your implementation you make a full copy of the list twice with each iteration. I suggest using functions like filter, map, reduce, etc. instead. Example implementation:

list.take(1) + list.zipWithNext()
    .mapNotNull { (prev, curr) -> curr.takeIf { curr != prev } }

This implementation copies the data at the last step, to include the first item. This can be avoided by making the code less readable. We can either use sequences, so replace both list with list.asSequence() and then add toList() at the end. Or we can use fold with mutable list as an accumulator, but we then have to somehow access the previous item.

Some time ago I suggested a change to Kotlin stdlib to include something like runningReduceNotNull(). With such a function, this would be really easy to implement: https://youtrack.jetbrains.com/issue/KT-51251/runningFoldNotNull-and-runningReduceNotNull

tlin47 · October 15, 2022, 8:53pm

Well, “best” is quite subjective - or better said dependent on the criteria you strive for.

Under the given condition that it has to be recursively (which I would normally not recommend for this kind of function), I have three improvement suggestions:

The function should handle the empty list correctly. An early return is helpful for that and is great for readability.
The function should return a list with more specific element type so that the element type does not get lost when calling the function. Generics can handle that.
I would rename the variable “last” to “tail”.

Here is one solution that takes these three improvements:

fun <T> removeRepeats(list: List<T>): List<T> {
    if (list.size < 2) {
        return list
    }
    val head = list.first()
    val tail = removeRepeats(list.drop(1))
    return if (head == tail.first()) tail else (listOf(head) + tail)
}

Edit: broot was faster.

tlin47 · October 15, 2022, 9:07pm

Another additional simplification could be to define the function as extension function:

fun <T> List<T>.removeRepeats(): List<T> {
    if (size < 2) {
        return this
    }
    val tail = drop(1).removeRepeats()
    return if (first() == tail.first()) tail else (listOf(first()) + tail)
}

tlin47 · October 16, 2022, 12:35am

For academic purposes, here is another solution that prevents creating a lot of copies and is still recursive:

fun <T> MutableList<T>.removeRepeats(reversedIndex: Int = 0) {
    if (size < reversedIndex + 2) {
        return
    }
    val index = size - 1 - reversedIndex
    if (get(index) == get(index - 1)) {
        removeAt(index)
        removeRepeats(reversedIndex)
    } else {
        removeRepeats(reversedIndex + 1)
    }
}

The solution does the changes in-place, especially it also changes the original (mutable) list. If it has to work without changing the original list (that probably isn’t mutable), it can be wrapped to copy exactly once:

fun <T> removeRepeats(list: List<T>): List<T> {
    val result = list.toMutableList()
    result.removeRepeats()
    return result
}

However, this whole solution is technically still recursive, but it really is a loop that uses function call stack for the iterations - so it is actually more of an indexed loop than a classical recursion.
Therefore, it probably is only of theoretical interest as a solution that is recursive without creating a lot of copies of the list - but it is still inferior to a not-recursive solution that directly uses an actual loop or for clarity and readability uses predefined functions (that internally do the loop) like the one that @broot showed.

arkanovicz · October 16, 2022, 3:02am

@tlin47 For academic purposes, it should be noted that removing items in the middle of a list may perform a lot of copies…

tlin47 · October 16, 2022, 3:36am

Indeed, depending on the list implementation, any change on the list may perform copying. And removing in the middle of the list is one of the operations where it is quite likely - although for most common list implementations it only requires moving around some elements instead of copying the full list (creating a copy and refilling). This results e.g. in a difference in space vs. time complexity.

A solution that guarantees a minimal number of shifts would have to be tailored to the underlying list implementation - which would also take away the generality.
But like said, the given solution was not one to be actually used, more to prove a point about that it can be done recursively without having to copy in every step (it moves around elements only when needed) - but thereby getting a less-readable solution.
In any case preferable would be a solution that focuses on readability.

Topic		Replies	Views
Idea: kotlin.collections.repeat function Language Design	7	2731	December 29, 2017
Kotlin distinct() from 2 Dimensional array Support	3	1084	December 2, 2020
Using sequences to avoid copying large collection - Performance Support	3	2509	April 2, 2022
Variables updating in lists when they shouldn't Support	1	736	April 22, 2021
Reductions / cumulative sum	17	6723	July 5, 2018

Remove repeating values

Related topics