Coroutines cannot sum large number ranges right

mogrifier · August 9, 2019, 7:35pm

I am running some very simple code based on the kotlin coroutine examples at https://kotlinlang.org/docs/tutorials/coroutines/coroutines-basic-jvm.html

Kotlin coroutine sum numbers properly up to a point and then fail. It works on summing 1…10,000 but fails on 1…100,000 and above. The sample code page says for summing 1…1,000,000: " Now it prints something sensible: 1784293664 , because all coroutines complete."

But that is very wrong.

1 to 1,000,000 is:
499,999 * 1,000,000 + 500,000 + 1,000,000 = 500,000,500,000

My environment is Intellij on a Mac running openjdk (jdk-12.0.2.jdk) and kotlin plugin 1.3.41-release-IJ2019.2-1 with the kotlinx-coroutines-core-1.1.1.jar. I am producing java 12 bytecode and using the kotlin 1.3 compiler.

Code below outputs:
Sum 1…1,000,000 (this kotlin coroutine answer is wrong): 1784293664
simple loop method 1…1,000,000 (correct answer)= 500000500000

So what is going wrong? I am new to kotlin and so wonder if I have made an error.

Here is the code:

package unleashbts

import kotlinx.coroutines.*
import java.util.concurrent.atomic.AtomicLong

fun main() {

val deferred = (1..1_000_000).map { n ->
    GlobalScope.async {
        n
    }
}

runBlocking {
    val sum = deferred.sumBy { it.await() }
    println("Sum 1..1,000,000 (this kotlin coroutine answer is wrong): $sum")
}

//alternative for loop
val d = AtomicLong()
for (j in 1..1_000_000L)
{
    d.addAndGet(j)
}
println("simple loop method 1..1,000,000 (correct answer)=  ${d.get()}")

}

Eliote · August 9, 2019, 8:18pm

sumBy works on Int values. You need to use sumByDouble.

mogrifier · August 9, 2019, 9:35pm

I see now it is out of the integer range, but just trying sumByDouble causes other problems with the example code. Must be some other way to do it right. Just a bad example, I guess.

Eliote · August 9, 2019, 9:37pm

val sum = deferred.sumByDouble { it.await().toDouble() }.toLong()

nbness1337 · August 10, 2019, 5:37pm

I have also found problems with sumBy and sumByDouble because of their small ranges and double’s awkward granularity once you get in the higher numbers.

A function similar to this should help you out for integral types. Not sure why a function like this isn’t already in the standard library though.

fun Collection<Int>.sumByLong(): Long {
    var sum = 0L
    forEach {
        sum += it
    }
    return sum
}

HughG · August 10, 2019, 6:42pm

Or something like

fun Iterable<T>.sumByLong(): Long =
    fold(0L, Long::plus)

?

darksnake · August 11, 2019, 6:46am

Because it is rarely needed and when needed, it could be easily implemented by extension. We are not in Python world, where everything outside numpy does not work (fast).

Also you should remember that there is a lot of ways to do sums. For example, in extremely large collections, you probably want to segment the list and do it in parallel.

darksnake · August 11, 2019, 6:48am

It will do, but it will be slower (significantly) due to boxing. It needs to be tested though because initial values are also boxed. And it is hard to predict, how good is JVM inlining.

Varia · August 11, 2019, 7:45am

If you actually need this particular sum, then do not forget that 1+2+…+n = n*(n+1)/2.

fvasco · August 11, 2019, 10:35am

forEach and fold get the same performances.

darksnake · August 11, 2019, 10:54am

Interesting. I think that forEach on LongArray should be significantly faster, but for boxed values they could give the same result.

fvasco · August 11, 2019, 11:04am

Sorry @darksnake,
I have to explain better.

Both forEach and fold are inline functions, so both performs well like a plain for (n in list) sum += n
There is no boxed Long, take a look to bytecode

darksnake · August 11, 2019, 11:15am

No time for that now, but I believe you. The behavior of function references is not always obvious though.

Topic		Replies	Views
Summing Arrays faster when split across coroutines	2	1583	June 11, 2020
Sum of Array	12	12272	March 1, 2018
BigInteger, Range, Stream, and iterators Libraries	11	6208	February 24, 2016
Performance Problem	8	3842	January 17, 2018
Arrays, loops Support	2	623	January 24, 2023

Coroutines cannot sum large number ranges right

Related topics