sdset
sdset copied to clipboard
[Feature] Streaming set operations via .from_iters() and .to_iter() functions
Needs
I'd like to create from_iters() and to_iter() versions of each operation. This would lazily evaluate/produce each member of the sequences. This would not be as performant as the current exponential-search version when it comes to working slice-to-slice, but would allow for a few additional use cases:
- Reading elements from an external source (perhaps a database query known to be sorted via UNIQUE + ORDER BY) using an asynchronous row iterator instead of needing to construct a Vec first.
- Checking if the result of an operation (or series of operations) has any members (via the Iterator::any() method), without needing to materialize the entire result. For this operation, this would be a big performance improvement.
- Perform a map() or filter() operation on each element of a produced result, and then insert the result into a HashMap or other data structure, without needing an intermediate Vec.
Proposed implementation details:
They should be compatible with the existing slices -> slice API. This means introducing 3 new dataflow paths: slices -> iter, iters -> slice, and iter -> iter.
Proposed Syntax:
let a: SetBuf<i32> = SetBuf::new_unchecked((0..1_000_000).collect());
let b: SetBuf<i32> = SetBuf::new_unchecked((0..1_000_000).collect());
// slices -> slice (current syntax)
let inter: SetBuf<i32> = sdset::duo::OpBuilder::new(&a, &b).intersection().into_set_buf();
// slices -> iter
let inter: SetBuf<i32> = sdset::duo::OpBuilder::new(&a, &b).intersection().into_iter();
// iters -> slice
let inter: SetBuf<i32> = sdset::duo::OpBuilder::from_iters(&a.iter(), &b.iter()).intersection().into_set_buf();
// iters -> iter
let inter: SetBuf<i32> = sdset::duo::OpBuilder::from_iters(&a.iter(), &b.iter()).intersection().into_iter();
To implement this, the existing Union/Intersection/Difference/SymmetricDifference types would need to be extended with an into_iter() function that would lazily consume the input slice. This would then cover the slices -> slice and slices -> iter case.
New types UnionOfIters/IntersectionOfIters/DifferenceOfIters/SymmetricDifferenceOfIters would need to be created whose a
and b
input fields are &'a mut dyn Iterator<T>
. These would then have the into_set_buf() and into_iter() functions implemented, covering the iters -> slice and iters -> iter case.