Add subtract to IterableExtensions
Considering views this stackoverflow link has had, https://stackoverflow.com/q/3428536 (637k times atm) I think we can agree list subtract is a popular need for developers. It can have bad implementations like list(set(x) - set(y)) which results to removal of duplicated items or O(m * n) implementations like this https://stackoverflow.com/a/20259489 so as it's easy to write an incorrect implementation I believe we can implement a correct and definite implementation here. I went through mistakes of having incorrect implementations and ruining some app releases with the mistakes and now I like to share my implementation here so others don't go through issues I had.
As this output an iterable and can work as an iterable extension, I move moved it to IterableExtension, or maybe I should revert that, or I will apply any other combination you would suggest.
Considering views this stackoverflow link has had, https://stackoverflow.com/q/3428536 (637k times atm) I think we can agree list subtract is a popular need for developers.
I don't always agree that developers know what they need. What they want isn't always what they need.
This operation is inherently two operations:
- Count occurrences in the second operand
- filter-with-count on the first operand.
I'd prefer to provide both operations, separately
extension <E> on Iterable<E> {
/// Counts the elements of this iterable, and updates a map with the result.
///
/// Creates a new map, or starts with [accumulator] if provided.
/// For each element of this iterable, the current value of the element in the map, with a default of zero if
/// the element doesn't have an entyr in the map, is increased by one.
/// Returns the updated map.
///
/// If no map is provided, the default map is a default Dart hash-based map using [Object.==] for equality.
/// That means that distinct-but-equal objects are only represented by one of their elements in the resulting
/// map.
/// Provide a custom map if a different equality is needed.
Map<E, int> countOccurrences([Map<E, int>? accumulator]) {
accumulator ??= {};
for (var element in this) accumulator.update(e, _inc, ifAbsent: _one);
return accumulator;
}
static int _inc(int v) => v + 1;
static int _one() => 1;
/// Elements of this iterable, skipping elements while they have positive [removeCount].
///
/// Iterating the returned iterable will iterate this iterable and emit the same elements, except that
/// for elements that have a positive value in the [removeCount] map, the element is skipped and the
/// value decremented in the map. Non positive values are removed from the map.
///
/// After iteration, the updated map may still contain counts for elements where there weren't
/// as many elements in this iterable as could have been skipped.
///
/// The elements of this iterable are looked up in the provided map using normal map operations,
/// and therefore uses the same equality as the map.
Iterable<E> removeCounted(Map<E, int> removeCount) sync* {
for (var element in this) {
var count = removeCount[element];
if (count != null) {
count -= 1;
if (count > 0) {
removeCount[element] = count;
} else {
removeCount.remove(element);
}
if (count >= 0) continue;
}
yield element;
}
}
}
Then we can also add an
Iterable<E> removeCountedOccurrences(Iterable<E> remove, [Map<E, int>? accumulator]) =>
removeCounted(remove.countOccurrences(accumulator));
which does both things for you, but it's understandable as the combination of those two separate and indpendently useful operations, it doesn't hide that it creates a map and uses it.
It feels wrong to me to have an operation that treats its Iterable input as a multiset, and always converts it to one, and doesn't just ask for a multiset as argument. An iterable is-not a multiset, it has horrible performance if you treat it as one directly, which is also why the first step here is to convert it to one.
Being able to see the remove-count being decremented, and see how much was not used, might actually be useful in some cases. Or controlling the equality used may also be useful, which is only possible if you admit that there is a Map involved.
This operation is inherently two operations:
- Count occurrences in the second operand
- filter-with-count on the first operand.
[..] it doesn't hide that it creates a map and uses it.
I agree with you, I just am wondering maybe count occurrence part can be shaped around ideas of python's Counter interface (a dedicated class instead of Map<T, int>) if that is not something we want here I think I can continue with your already provided code if you don't want to push it yourself.
Closing as the dart-lang/collection repository is merged into the dart-lang/core monorepo. Please re-open this PR there!