kotlin
kotlin copied to clipboard
KT-77646 - Optimize copyOf(newSize) function for primitive arrays
See https://youtrack.jetbrains.com/issue/KT-77646
cc @JSMonk
@lppedd nice job! The new implementation outperforms the old one almost always. The only exception is when the newSize is relatively small (empirically, < 32), when the old implementation is significantly faster. Could you please check if something could be done for shorter copies?
@fzhinkin thanks for looking into it. Did you notice the slowdown only for the set branch, or for both set and slice?
Admittedly I didn't consider very small arrays, but I guess the only thing we could do here is pick an arbitrary size boundary (e.g. the 32 you mentioned) and add another branch.
I've run a bunch of new benchmarks https://gist.github.com/lppedd/57d99f5f73acd7f37b96c13f08251361
Basically the TL;DR on my PC is that if the loop is about 20 or less iterations, it will perform slightly better than slice or set.
So what we might want to do is something like
public actual fun ByteArray.copyOf(newSize: Int): ByteArray {
require(newSize >= 0) { "Invalid new array size: $newSize." }
val size = this.size
return when {
// 20 or whatever other hardcoded limit we want to set
newSize < 20 || size < 20 -> fillFrom(this, ByteArray(newSize))
newSize > size -> ByteArray(newSize).also { copy ->
copy.asDynamic().set(this, 0)
}
else -> this.asDynamic().slice(0, newSize)
}
}
Tho I'm not a big fan of complicating functions like this, but I don't have a better suggestion.
@lppedd
Did you notice the slowdown only for the
setbranch, or for bothsetandslice?
It seems like either of branches could be slower, the main factor here is the number of copied elements.
@fzhinkin yup I've noticed. I've also updated the code sample in my latest comment about the perf issue to use a when expression. But yeah I don't have other ideas.
Thank you!
In the meantime I've experimented with a slightly modified version to reduce the conditions to evaluate by using Math.min.
public actual fun ByteArray.copyOf(newSize: Int): ByteArray {
require(newSize >= 0) { "Invalid new array size: $newSize." }
val size = this.size
val limit = min(size, newSize)
return when {
// 20 or whatever other hardcoded limit we want to set
limit < 20 -> fillFrom(this, ByteArray(newSize), limit)
newSize > size -> ByteArray(newSize).also { copy ->
copy.asDynamic().set(this, 0)
}
else -> this.asDynamic().slice(0, newSize)
}
}
private fun fillFrom(src: dynamic, dst: dynamic, limit: Int): dynamic {
var index = 0
val arr = dst.unsafeCast<Array<Any?>>()
while (index < limit) arr[index] = src[index++]
return dst
}
But there is close to no difference as far as I can tell, and the Math.min version might be worse with 20+ elements.
Double checked results w/ Bun being used instead of NodeJS, and also checked a plain JS code copying arrays in browser (https://jsperf.app/maxije): for short copied, loop is always better :(
@fzhinkin damn, well at least the behavior is consistent between implementations!
Are we ok in going ahead with the reworked solution?
Also worth adding a comment to explain why the additional branch is there.
@lppedd, yep, the reworked solution looks good. I think we can stick to relatively small threshold value, say 16. In future, if it ever be a problem, we can always reconsider it.
I think we can stick to relatively small threshold value, say 16
Adjusted and force pushed. Thanks!