LongRange.count() returns Int and causes an exception

LongRange exposes a count() method through the Iterable<T> extension:

package kotlin.collections

// file: _Collections.kt

public fun <T> Iterable<T>.count(): Int {
    if (this is Collection) return size
    var count = 0
    for (element in this) checkCountOverflow(++count)
    return count
}

But calling this method may cause an exception, as the return type is Int.

Simplest example to cause the exception:

 (0..Long.MAX_VALUE).count()

Exception:

Exception in thread "main" java.lang.ArithmeticException: Count overflow has happened.
	at kotlin.collections.CollectionsKt__CollectionsKt.throwCountOverflow(Collections.kt:485)
	at kotlin.collections.CollectionsKt___CollectionsKt.count(_Collections.kt:1772)
	at aoc.day5.IngredientsInventory.countAllFresh(Day5.kt:29)
	at aoc.day5.Day5Kt.main(Day5.kt:91)
	at aoc.day5.Day5Kt.main(Day5.kt)

Should it be considered a bug? :slight_smile:

The general rule in Kotlin/Java, and in many other languages, is that the size of arrays and collections is limited to 32 bits. Going beyond that is rarely useful because of memory and CPU costs.

Ranges are a kind of special case, and the only question is if it makes sense to provide a specific size/count operation for them. As you noted, currently they don’t even provide a special implementation, so count() is not based on math, it literally goes item after item and counts them. It is not feasible to count items like this over the 32 bit limit.

BTW, (0..Long.MAX_VALUE).count() would overflow even a long.
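
For illustration, a count that is "based on math" could be sketched roughly as below. `countExact` is a hypothetical helper, not something the standard library provides, and it assumes the JVM because it relies on `Math.subtractExact` and `Math.addExact`. With exact arithmetic, the `0..Long.MAX_VALUE` case fails loudly instead of silently wrapping around.

```kotlin
// Hypothetical helper (not in the stdlib): size of a LongRange computed arithmetically.
// Assumes the JVM, since it uses java.lang.Math's overflow-checked arithmetic.
fun LongRange.countExact(): Long =
    if (isEmpty()) 0L
    else Math.addExact(Math.subtractExact(last, first), 1L)

fun main() {
    println((0L..9L).countExact())                  // 10
    println((0L until Long.MAX_VALUE).countExact()) // 9223372036854775807
    // 0..Long.MAX_VALUE has 2^63 elements, one more than Long.MAX_VALUE,
    // so the checked addition throws ArithmeticException.
    println(runCatching { (0L..Long.MAX_VALUE).countExact() }.isFailure) // true
}
```

Even a Long result therefore cannot cover every LongRange; the sketch only moves the failure from 2^31 elements to 2^63.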

It’s clear why this happens, but the question remains unanswered: should this be part of the public API?

So this is only my personal opinion, but I think not. First of all, we shouldn’t think in terms of: “because it is a range of longs, count should be a long as well”. The type of the range and the type of its size are two separate things; we don’t expect ('a'..'z').count() to return a char, after all.

Second, while I see nice benefits of this, e.g. (0 .. 1000000 step 7).count(), I don’t think it is worth the added complexity or breaking the convention that sizes are Ints, and it definitely doesn’t pass the minus 100 points rule. But again, this is only my opinion.
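
To make the trade-off concrete, here is a minimal sketch of what an arithmetic count for a stepped progression might look like. `countByMath` is a hypothetical name, not a standard library function, and returning Long is an assumption made here so that a progression such as Int.MIN_VALUE..Int.MAX_VALUE (2^32 elements) still fits.

```kotlin
// Hypothetical helper (not in the stdlib): element count of an IntProgression
// computed from first/last/step instead of by iterating.
fun IntProgression.countByMath(): Long =
    if (isEmpty()) 0L
    // `last` is already aligned to the step, so the division is exact.
    else (last.toLong() - first.toLong()) / step + 1

fun main() {
    val p = 0..1_000_000 step 7
    println(p.count())       // 142858, found by visiting every element
    println(p.countByMath()) // 142858, found with one subtraction and one division
}
```

Both calls agree on the result; the difference is only in how much work is done and in the return type.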

I’m not exactly sure what you’re asking here. First you asked if the demonstrated behaviour should be considered a bug, now you’re asking if it should be part of the public API. If WHAT should be part of the public API?

Well,

If this is a bug (the method throws an exception for valid input), then I believe it should be fixed - and I’d be happy to report it. If it’s not considered a bug, then I’m unsure whether this behavior should be part of the public API, because valid code can end up throwing an exception.

I fully agree with @broot that count() usually returns an Int, and that its implementation for ranges is inefficient, since it effectively iterates through the entire range. But sometimes it is actually useful (and works as expected).

However, the current situation - where it’s easy to write code that unexpectedly throws - doesn’t seem ideal to me. I’d compare it to Guava’s immutable collections: you can call collection.add(item), but you’ll get an exception, which feels counterintuitive.

So what am I asking? I’d simply like to understand the reasoning behind this public API design before deciding whether to report a bug or submit a feature request. :slight_smile:

@przemek.materna

count’s documentation should reflect this behavior, and in that sense it’s a bug.

Another problem with count’s behavior is that for collections containing more than Int.MAX_VALUE elements (which is not that practical, although I can imagine some reasonable Set implementations containing that many elements) the function will return Collection.size’s value, which is unspecified for such “huge” collections. :neutral_face:
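
As a rough illustration of that second problem, consider a set that conceptually contains every Int value, i.e. 2^32 elements. `AllInts` below is a hypothetical class, and saturating `size` at Int.MAX_VALUE is an assumption borrowed from the contract of java.util.Collection.size(); Kotlin's own Collection.size does not promise this.

```kotlin
// Hypothetical set containing every Int value: 2^32 elements, more than Int.MAX_VALUE.
object AllInts : AbstractSet<Int>() {
    // Assumption: saturate at Int.MAX_VALUE, as java.util.Collection.size() documents.
    override val size: Int get() = Int.MAX_VALUE

    override fun contains(element: Int): Boolean = true

    override fun iterator(): Iterator<Int> = (Int.MIN_VALUE..Int.MAX_VALUE).iterator()
}

fun main() {
    // For a Collection, count() simply returns `size`; nothing is iterated,
    // so no overflow check runs and no exception is thrown.
    println(AllInts.count()) // 2147483647
}
```

Under these assumptions, count() quietly reports a number that is too small, while an Iterable with the same elements would throw.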

In my personal opinion, this is actually backwards. Java has this artificial limit of 31-bit (signed) sizes for arrays and collections, while most languages have evolved to use a separate “size integer” type (C/C++ size_t, Rust/Zig usize) or default to a 64-bit int (Go, Swift, Julia). I could have agreed some 15 years ago, but the 2GB limit for a ByteArray is not practical anymore and sometimes even hinders performance. Also, 2^32 is not as big a number as some people make it out to be: a 4.2GHz processor goes through roughly that many clock cycles in a single second.

On a related note, I can’t even tell why in Java FileChannel.map takes long size but has an explicit requirement that the value:

must be non-negative and no greater than Integer.MAX_VALUE

Sadly, it’s way too late to correct the behavior on the JVM. I believe the exception thrown here is correct, since count did indeed overflow, but the documentation should reflect this behavior (if the number of elements is greater than Integer.MAX_VALUE, the function throws), and the behavior should be consistent for large collections as well.