Zippy parallel vector additions in Kotlin (seems like coroutines + locking = slower)


#1

Disclaimer: this may not have a good answer.

I’m working with BufferedImages, which means slooooow code, but code that screams “you have more CPU cores, please run me in parallel!”

Sometimes I’m turning a BufferedImage’s pixels into a per-pixel hue IntArray, which means calling a function for each pixel:

(raster.dataBuffer!! as DataBufferInt).data.asIterable().map { pixel ->
	getHue(
		red = pixel shr 16 and 0xFF,
		green = pixel shr 8 and 0xFF,
		blue = pixel and 0xFF
	)
}

Other times I’m performing some “vector math” on the RGB values to get running sums between frames:

(colorImage.raster.dataBuffer!! as DataBufferByte)
	.data.asIterable().chunked(if (colorImage.alphaRaster == null) 3 else 4)
	.forEachIndexed { pixelLocation, channels ->
	    // ignore alpha channel 3 if it exists
	    red[pixelLocation] += channels[2].toInt() and 0xFF
	    green[pixelLocation] += channels[1].toInt() and 0xFF
	    blue[pixelLocation] += channels[0].toInt() and 0xFF
	}

Kotlin is fantastic for this: nice chunked operators, simple map syntax. I’m very happy.

But images are big, so it is slow. I’ve got a large collection of stuff to iterate over, and I’ve tried making it go faster using parallel maps with coroutines, but that makes it way slower. I’m guessing this is due to locking, or memory bottlenecks, or "you can’t beat a for (i in 0 until size) { ... }" loop.

Is there a way in Kotlin to burn up all my cores in such a way that would result in faster completion time? I’m fine with using 4x more electricity if it means the app goes 2x as fast.


#2

You may try:

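// requires kotlinx.coroutines (runBlocking, launch, Dispatchers)
runBlocking {
	(colorImage.raster.dataBuffer!! as DataBufferByte)
		.data.asIterable()
		.chunked(if (colorImage.alphaRaster == null) 3 else 4)
		// index and chunk are parameters of mapIndexed's lambda, not of launch's
		.mapIndexed { pixelLocation, channels ->
			// a bare launch would stay on runBlocking's single thread;
			// Dispatchers.Default spreads the jobs over a shared thread pool
			launch(Dispatchers.Default) {
				// ignore alpha channel 3 if it exists
				red[pixelLocation] += channels[2].toInt() and 0xFF
				green[pixelLocation] += channels[1].toInt() and 0xFF
				blue[pixelLocation] += channels[0].toInt() and 0xFF
			}
		}
		.forEach { it.join() }
}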
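
Launching one coroutine per pixel may well end up slower than the plain loop: each job does only three additions, and asIterable().chunked() boxes every byte into a small list first. Here is a minimal sketch of a coarser split, with one contiguous slice of pixels per core and plain index arithmetic on the ByteArray. It assumes kotlinx-coroutines is on the classpath and that the bytes are laid out blue, green, red at offsets 0, 1, 2 of each pixel, as in the snippets above. accumulateFrame and its parameters are hypothetical names for illustration, not something from the original code:

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.coroutineScope
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking

// Hypothetical helper: adds one frame's interleaved channel bytes to per-pixel running sums.
// Assumption: bytes 0, 1, 2 of each pixel are blue, green, red; any further byte is ignored.
suspend fun accumulateFrame(
    data: ByteArray,
    bytesPerPixel: Int,
    red: IntArray,
    green: IntArray,
    blue: IntArray,
    workers: Int = Runtime.getRuntime().availableProcessors()
) {
    val pixelCount = data.size / bytesPerPixel
    // Ceiling division so every pixel lands in a slice; at least 1 to keep the step valid.
    val sliceSize = maxOf(1, (pixelCount + workers - 1) / workers)

    // coroutineScope returns only after every launched slice has finished.
    coroutineScope {
        for (start in 0 until pixelCount step sliceSize) {
            val end = minOf(start + sliceSize, pixelCount)
            // One coroutine per slice, on the shared CPU-bound thread pool.
            launch(Dispatchers.Default) {
                for (p in start until end) {
                    val base = p * bytesPerPixel
                    blue[p] += data[base].toInt() and 0xFF
                    green[p] += data[base + 1].toInt() and 0xFF
                    red[p] += data[base + 2].toInt() and 0xFF
                }
            }
        }
    }
}
```

You would call it once per frame from a coroutine, e.g. runBlocking { accumulateFrame(bytes, 3, red, green, blue) }. The slices write to disjoint index ranges, so no locking is needed. Whether this beats the single-threaded loop depends on image size and memory bandwidth, so it is worth measuring on your own frames.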