Should max()/maxBy()/maxWith() return the last largest element?

Currently, if a collection has more than one instance of the largest element, max(), maxBy() and, maxWith() returns the first one found (like min(), minBy() and, minWith()). I would argue that the right behavior should be to return the last largest element instead of the first one (but Alex Stepanov does a better job explaining why :slight_smile:).

Even one of the example problems provided on “Try Kotlin” seems to support this. On this problem, the code:

fun indexOfMax(a: IntArray): Int? {
    return a.withIndex().maxBy { it.value }?.index
}

won’t pass all the tests, failing on these two cases:

indexOfMax(Integer.MAX_VALUE, Integer.MAX_VALUE, Integer.MAX_VALUE) is 0, not 2.
indexOfMax(Integer.MIN_VALUE, Integer.MIN_VALUE, Integer.MIN_VALUE) is 0, not 2.

This can be easily fixed replacing < with <= on the corresponding templates for max, maxBy and maxWith defined in kotlin/libraries/tools/kotlin-stdlib-gen/src/templates/Aggregates.kt.

Is that an oversight or is on purpose?
Should I create a PR for this?

2 Likes

Could you summarize the reasoning from that video?

Ok, I’ll try!

Imagine you have a sorted collection with different elements. Intuitively, the min element should be the first one and the max element should be the last one. min() and max() return the correct object in this case.

Now, imagine that the sorted collection have all instances that are equal. By the same reasoning than before, the min element should still be the first one and the max element, the last one. min() and max() both returns the first one.

In a more theoretical sense, min() and max are different algorithms. You could get the current behavior of max() by using minWith with this greater than comparator:

val greaterThan = object : Comparator<Int> {
    override fun compare(x : Int, y: Int) = y - x
}
val maxElem = a.minWith(greaterThan)

But if you are interested in the last smallest/largest value, you could use max with the changes I proposed (and the appropriated comparator to get the smallest value) .

Usually nobody cares which min or max value do they get… but if you do, I think the expected behavior is not what max currently does (as I mentioned previously with the example from the tutorial).

Alex explanation, although is for C++, is probably clearer and surely better but I hope this briefly covers the main points.

To make it even shorter. What I read what you want is that min and max behave as if you sorted the list with a stable sort algorithm, and then took the first or last elements. (But obviously faster as min/max even in this case can be done in a single pass).

Exactly.

And you really need it, you could always choose to use minWith() or maxWith() with an oposite comparator to get the first largest element or the last smallest element respectively (if you know there could be more than one and the default behavior is not the one you want).

All searching functions (find, etc) return the first match they find, so this is the expected behaviour for min and max. However good your argument might be, it hardly reflects common thinking.

If you want the last maximum to be returned, this should be stated clearly in the code:

listOf(1, 2, 2).asReversed().max()

Reversing a sequence that is not a list can be expensive in that the reverse needs to be stored in memory. Of course the easier solution is to define the comparator such that it defines a full order instead of a partial order (the left is smaller than the right) - just be aware that the validity of that depends on implementation details (in what order are elements compared).

Such a left is smaller than right comparator is not valid for all sorting algorithms or all containers (if they sort) as it does not have the expected reflexivity property.

Most robust is probably to write your own min and max implementations that have the desired behaviour.

1 Like

list.asReversed() does not take any additional memory or resources. Please see the implementation.

A sequence that is not a list (and therefore does not have random access) needs to basically be stored in a list to be able to reverse it. A key aspect of min and max is that it works on sequences requiring only one memory element (the current min or max element).

1 Like

Apparently this kind of thinking is more common that you would expect.
As I mentioned in the first post, see this problem and
the tests that it needs to pass (especially testIndexOfMaximum6() and testIndexOfMaximum7()) :slight_smile:

Anyway, as @pdvrieze mentions, asReversed() could be potentially expensive on sequences that can only be iterated using an Iterator object.

This symmetry between min/max and first/last on sorted list looks very convincing.

Could you provide a use case, when it’s crucial to return the last maximum instead of the first?

1 Like

A better real-world use case than the one mentioned in this PR comment would be this:

Suppose I had a list of a data class: Album(noOfSongs: Int, noOfPlays: Int) as

[
    (15, 20),
    (9, 15),
    (10, 20)
]

And I wanted to get the song with the most amount of plays but also sorted by the number of songs on the album so if two albums been played the same amount of time, I would then use the number of songs on it to decide which one to return since the same amount of plays on a smaller album might indicate more preference. So I sort by -noOfSongs first giving me:

[
    (15, 20),
    (10, 20),
    (9, 15),
]

Running maxBy on this list would give me the album with 15 songs and 20 plays (15, 20) even though my desired output for my use case would be the album with 10 songs and the same number of plays (10, 20).

I don’t think this is a good example. Firstly I’m sure I can easily come up with a scenario where you want the oposite behaviour. In any case you wouldn’t want to implement the search as you described. The problem with your method is that sorting a list is quite slow compared to just looking for the best element. It would be much easier to just define a proper comparator that first compares the number of plays and then the number of songs instead of sorting the list first.

Sorry, I should have clarified, this scenario is a use case for an alternative lastMaxBy being used side by side maxBy and not a replacement. So a scenario with the opposite behavior would still use maxBy.

Also, the part about sorting the list is just to show that the list might have previously been in a predefined order. I could be fetching the list of albums from an API or library that have already been sorted by the number of songs and I intend to keep that order when I have more than 1 album with the same number of plays.

Looking at this a bit over 3 years later I would say that to have minBy return the first match and maxBy return the last element is a sensible design decision. This is based upon the idea that at some level the decision between the two options (although in case of 3 it may also be possible to return the 2nd) needs to be made. The alternative options have no real justification, these ones have some (in that for a sorted list they work “as expected” and there is no real implementation cost). There is however a compatibility issue that may mean that changing the choice is no longer possible (it could break software that assumes current behaviour).

Yeah, you’re right. It might break behavior for implementations that already expect the current behavior and depends on them for proper functionality, which was why I proposed an alternative implementation.

If the alternative I proposed has no real justification though, do we conclude that the current implementation should just be left as-is?

If you want to get an album with the maximum number of plays and when there’s a tie, with the minimum number of songs, it would be much more easy, efficient, and reliable to construct a custom comparator:

data class Album(val noOfSongs: Int, val noOfPlays: Int)

fun main() {
//sampleStart
    val albums = listOf(Album(15, 20), Album(9, 15), Album(10, 20)).shuffled()
    val comparator = compareBy<Album> { it.noOfPlays }.thenByDescending { it.noOfSongs }
    
    println(albums.maxWith(comparator))   
//sampleEnd
}
2 Likes

Like I said here before, the list could also be being fetched from an external source you have little control over. In this case, maxWith() would have to use one comparator and would have the same result as maxBy() .

Maybe you could use withIndex() and also sort on that?

Why is that the case? Are you saying that you can’t construct a comparator with two criteria as in the sample above?