kotlinx.collections.immutable
Iterable.intersect is very slow with PersistentList
Iterable<T>.intersect(other: Iterable<T>) takes a very long time to complete when called with a PersistentList as the argument. The same function is much faster with other iterables such as List and Set: it takes minutes with PersistentList and milliseconds with List.
I couldn't find the exact cause, but it seems that Collection.retainAll does something with the persistent list that takes ages to complete.
Here are some examples:
(0..147853).toList().intersect((0..147853).toList()) // takes milliseconds
(0..147853).toList().intersect((0..147853).toPersistentList()) // takes minutes
(0..147853).toList().intersect((0..147853).toPersistentList().toSet()) // takes milliseconds
(0..147853).toMutableList().retainAll((0..147853).toPersistentList()) // takes minutes
(0..147853).toMutableList().retainAll((0..147853).toPersistentList().toList()) // takes milliseconds
Hello,
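retainAll calls Collection.contains() on its argument once for each element of the receiver. The complexity of contains() is O(1) or O(log n) for sets and O(n) for lists, so retainAll against a list costs O(n²) overall when both sides hold n elements.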
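To make the difference concrete, here is a minimal sketch that counts equals() calls while looking up every element in an ArrayList and in a HashSet. CountingKey and countComparisons are hypothetical names introduced only for this illustration, and the exact counts may vary by platform; only the gap between the two matters.

```kotlin
// Hypothetical helper for illustration: a key that counts equals() calls.
class CountingKey(val id: Int) {
    override fun equals(other: Any?): Boolean {
        comparisons++
        return other is CountingKey && other.id == id
    }

    override fun hashCode(): Int = id

    companion object {
        var comparisons = 0L
    }
}

// Looks up every probe in `target` and returns how many equals() calls that took.
fun countComparisons(target: Collection<CountingKey>, probes: List<CountingKey>): Long {
    CountingKey.comparisons = 0L
    for (probe in probes) {
        check(probe in target) // every probe is expected to be present
    }
    return CountingKey.comparisons
}

fun main() {
    val n = 1_000
    val stored = List(n) { CountingKey(it) }
    val probes = List(n) { CountingKey(it) } // equal to `stored`, but distinct instances

    val listCost = countComparisons(ArrayList(stored), probes)
    val setCost = countComparisons(HashSet(stored), probes)

    // The list cost grows quadratically with n, the hash set cost roughly linearly.
    println("list: $listCost equals() calls, hash set: $setCost equals() calls")
}
```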
So, to be honest:
- I was surprised that (0..147853).toList().intersect((0..147853).toList()) takes only milliseconds.
- I was not surprised that (0..147853).toList().intersect((0..147853).toPersistentList()) takes minutes.
But the implementation of MutableCollection.retainAll(elements: Iterable) tries to be smart: in some cases, 'elements' is converted to a set and retainAll is applied using this set. This explains why the test case with two lists is so fast.
This behavior is handled by the following code from Iterables.kt:
/** Returns true when it's safe to convert this collection to a set without changing contains method behavior. */
private fun <T> Collection<T>.safeToConvertToSet() = size > 2 && this is ArrayList
/** Converts this collection to a set, when it's worth so and it doesn't change contains method behavior. */
internal fun <T> Iterable<T>.convertToSetForSetOperationWith(source: Iterable<T>): Collection<T> =
    when (this) {
        is Set -> this
        is Collection ->
            when {
                source is Collection && source.size < 2 -> this
                else -> if (this.safeToConvertToSet()) toHashSet() else this
            }
        else -> toHashSet()
    }
When 'this' is a persistent list, it is a collection but not an array list, so safeToConvertToSet() returns false and we don't do the conversion to hash set.
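The decision can be reproduced outside the stdlib with a small sketch. wouldConvertToHashSet is a hypothetical name that only mirrors the logic quoted above, and java.util.LinkedList is used as a stand-in for a persistent list, on the assumption that both are Collections that are not ArrayLists.

```kotlin
import java.util.LinkedList

// Standalone copy of the decision quoted above, reduced to a yes/no answer.
// `wouldConvertToHashSet` is a hypothetical name, not a stdlib function.
fun <T> wouldConvertToHashSet(elements: Iterable<T>, source: Iterable<T>): Boolean =
    when (elements) {
        is Set<*> -> false // already a set, used as is
        is Collection<*> ->
            when {
                source is Collection<*> && source.size < 2 -> false
                else -> elements.size > 2 && elements is ArrayList<*>
            }
        else -> true // plain Iterable: always copied into a hash set
    }

fun main() {
    val source = ArrayList((0..9).toList())

    // An ArrayList with more than two elements is converted.
    println(wouldConvertToHashSet(ArrayList((0..9).toList()), source)) // true

    // A Collection that is not an ArrayList (stand-in for a persistent list) is not.
    println(wouldConvertToHashSet(LinkedList((0..9).toList()), source)) // false

    // An Iterable that is not a Collection is always converted.
    println(wouldConvertToHashSet((0..9).asSequence().asIterable(), source)) // true
}
```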
This is only an analysis; I don't have a solution for now.
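As a caller-side workaround rather than a fix, the argument can be copied into a hash set before the set operation, which is what the .toSet() example in the report does. The sketch below is only an illustration: intersectViaHashSet is a hypothetical helper, java.util.LinkedList stands in for a persistent list, and it assumes the argument follows the regular equals()/hashCode() contract, so that the copy does not change what contains() reports.

```kotlin
import java.util.LinkedList

// Hypothetical helper, not part of the stdlib or kotlinx.collections.immutable.
// Assumes `other` relies on the regular equals()/hashCode() contract, so copying it
// into a HashSet does not change what contains() reports.
fun <T> Iterable<T>.intersectViaHashSet(other: Iterable<T>): Set<T> {
    // Reuse an existing set, otherwise pay one O(m) copy to get O(1) lookups.
    val lookup: Set<T> = if (other is Set<T>) other else other.toHashSet()
    val result = this.toMutableSet() // keeps the receiver's iteration order
    result.retainAll(lookup)
    return result
}

fun main() {
    val left = (0..147853).toList()
    val right = LinkedList((0..147853).toList()) // stand-in for a persistent list

    println(left.intersectViaHashSet(right).size) // 147854
}
```

The extra copy costs memory proportional to the size of the argument, which is usually a reasonable trade for avoiding a linear scan per element.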