deno_std icon indicating copy to clipboard operation
deno_std copied to clipboard

tracking: support iterables in `@std/collections` APIs

Open iuioiua opened this issue 1 year ago • 6 comments

Following on from #4671, it seems there are APIs within @std/collections that could support Iterables rather than just arrays. From a quick scan, these symbols might be able to be tweaked to support iterables:

  • [x] chunk()
  • [ ] dropLastWhile()
  • [ ] dropWhile()
  • [ ] intersect()
  • [x] sample()
  • [ ] slidingWindows()
  • [ ] sortBy()
  • [ ] takeLastWhile()
  • [x] takeWhile()
  • [x] withoutAll()

See #3390 for an example of how with was done for groupBy().

iuioiua avatar Jul 17 '24 22:07 iuioiua

A larger change to consider. Should collections use generators? This would be breaking change.

Generators give lazy evaluation, memory efficiency, and sometimes better performance

Example:

function* takeWhile<T>(
  iterable: Iterable<T>,
  predicate: (el: T) => boolean,
): Generator<T> {
  for (const element of iterable) {
    console.log(`takeWhile(): ${element}`);
    if (!predicate(element)) {
      break;
    }
    yield element;
  }
}

function* map<T, U>(
  iterable: Iterable<T>,
  transform: (el: T) => U,
): Generator<U> {
  for (const element of iterable) {
    console.log(`map(): ${element}`);
    yield transform(element);
  }
}

const arr = [1, 2, 3, 4, 5, 6, 7, 8, 9];
const result = takeWhile(map(arr, (n) => n * 100), n => n < 500)
console.log([...result])

Liam-Tait avatar Sep 07 '24 02:09 Liam-Tait

I think we should stick to the current approach unless the performance benefits are sufficient. We'd have to measurements before coming to that decision.

iuioiua avatar Sep 09 '24 03:09 iuioiua

The move to iterables means that generators that never return are valid arguments.

  function* counter() {
    let i = 0;
    while (true) {
      yield i++;
    }
  }

Many of these functions would hang

Examples:

takeWhile(counter(), v === 0) // predicate will never be true
chunk(counter(), 10)
dropLastWhile(counter, v => v > 30)
dropWhile(counter, v => v === 0) // predicate will never be true

Here are some options:

  1. Don't handle the case, onus is on the user to know the generator is infinite. Likely case is throwing an Uncaught RangeError: Invalid array length when the result array is too long
  2. Catch the error and rethrow?
  3. Have a max iterations option available

Liam-Tait avatar Sep 09 '24 10:09 Liam-Tait

@iuioiua Would you like to assign this to me, I am happy to commit myself to completing all of these changes

Liam-Tait avatar Sep 22 '24 09:09 Liam-Tait

Went with Option 1 Don't handle the case where generators are infinite, onus is on the user to know this

Liam-Tait avatar Sep 23 '24 13:09 Liam-Tait

I think that is the correct option.

timreichen avatar Sep 23 '24 13:09 timreichen

I believe this is now completed (unstable)

Liam-Tait avatar Oct 16 '24 05:10 Liam-Tait

@Liam-Tait Thank you for your contributions! Closing as completed

kt3k avatar Oct 16 '24 05:10 kt3k