Iterator Support for large globs
Hi
Consider a method where the return object could be an Iterator or AsyncIterator so that large file globs (as in huge numbers of files) are supported.
Is this to avoid blocking the main thread? And what would such an implementation look like?
There are a few things involved here @terkelg
There are better implementations than what I did; this just happened to be fine for my use case.
In my local implementations, I used fast-glob and adapted a stream into an async iterator using the stream-to-async-iterator library.
Concerns:
- async iteration isn't standardised by TC39 yet
- libuv doesn't quite support it yet, and thus node readdir doesn't either.
- do we need a sync iterator? Is it beneficial or even possible? (sync iterators are already standardised)
Notes:
- From Node 10 onward, anywhere streams are supported we can support async iteration
- streams can be adapted into async iterators
Usage:
import fglob from 'fast-glob';
import StreamIteratorAdapter from 'stream-to-async-iterator';

async function processGlob(dirglob, globOptions) {
  const stream = fglob.stream(dirglob, globOptions);
  const iterator = new StreamIteratorAdapter(stream);
  for await (const stat of iterator) {
    // Process items individually, allowing us to handle massive glob lists without hitting resource limits.
  }
}
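On Node versions where Readable streams implement `Symbol.asyncIterator` themselves (experimental in Node 10, stabilised in later releases), the adapter library may not be needed at all. A minimal sketch of that idea, using a hand-built object-mode stream as a stand-in for `fglob.stream(...)`; `fakeGlobStream` and `collect` are hypothetical names used only for illustration:

```javascript
const { Readable } = require('stream');

// Stand-in for a glob stream: emits a fixed list of paths in object mode.
function fakeGlobStream(paths) {
  const queue = paths.slice();
  return new Readable({
    objectMode: true,
    read() {
      // Push one entry per read() call; null signals the end of the stream.
      this.push(queue.length > 0 ? queue.shift() : null);
    },
  });
}

// Consume the stream one entry at a time with for await, no adapter needed.
async function collect(stream) {
  const seen = [];
  for await (const entry of stream) {
    seen.push(entry);
  }
  return seen;
}
```

Something like `collect(fakeGlobStream(['a.js', 'b.js'])).then(console.log)` would exercise it; in real use the loop body would handle each entry instead of collecting them all.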
Thanks for elaborating. This seems a bit complex. Is it possible to add this as an extension/wrapper around tiny-glob?
If you can provide a stream option it will allow this use case, and with time it will get more elegant. We can't do any higher level iteration if we don't have some async way of processing entries with backpressure.
Usually, from the implementations I've seen, this comes back to readdir. There are some packages that provide this (like fast-glob), and digging through their code it looks like they use readdir-enhanced, which via some magic (I didn't look into that code) manages to provide a stream.
But to make it easier for you, I'd wrap the stream and just expose an async iterator via generator functions; otherwise you have to do a bunch of stream handling, and that's error-prone and painful.
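To make the generator-function idea concrete, here is a rough sketch of a directory walk exposed as an async iterator. It is not tiny-glob's code: `walk` is a hypothetical helper, pattern matching is left out entirely, and it assumes a Node version with `fs.promises` and the `withFileTypes` option.

```javascript
const fs = require('fs');
const path = require('path');

// Lazily walk a directory tree, yielding one file path at a time.
// The consumer's for await loop drives the walk, so a slow consumer
// delays further readdir calls instead of being flooded with entries.
async function* walk(dir) {
  const entries = await fs.promises.readdir(dir, { withFileTypes: true });
  for (const entry of entries) {
    const full = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      // Delegate to the nested generator for subdirectories.
      yield* walk(full);
    } else {
      yield full;
    }
  }
}
```

A caller would write `for await (const file of walk('src')) { ... }`. Note that `readdir` still returns a whole directory's listing at once, so this only spreads the work across directories; a single enormous directory is still read in one go.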