streamz
Filter DaskStream
Would it be possible to have a filter DaskStream? I'm happy to put in a PR but I don't know where to start.
Does it mean that we filter out futures or that we produce a stream of futures that might be null? It seems ambiguous to me.
On Wed, Aug 8, 2018 at 5:06 PM Christopher J. Wright wrote:
Would it be possible to have a filter DaskStream? I'm happy to put in a PR but I don't know where to start.
View it on GitHub: https://github.com/mrocklin/streamz/issues/195
I don't know if we can filter out futures without interrogating them (and thus needing to be greedy). I was thinking about a stream which contained null/no-op values.
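To make the "greedy" concern concrete, here is a minimal sketch. It uses the standard library's concurrent.futures as a stand-in for Dask futures, and the names greedy_filter and is_even are hypothetical, not part of streamz. The point is only that dropping a future requires calling .result() on it, which blocks until the work is done.

```python
from concurrent.futures import ThreadPoolExecutor


def greedy_filter(predicate, futures):
    """Yield only the futures whose results satisfy predicate.

    Calling .result() blocks until the future has finished, so the
    filter cannot decide anything without waiting on the computation.
    """
    for future in futures:
        if predicate(future.result()):
            yield future


def is_even(x):
    return x % 2 == 0


with ThreadPoolExecutor(max_workers=2) as pool:
    # Stand-in for a stream of futures: squares of 0..4.
    futures = [pool.submit(pow, n, 2) for n in range(5)]
    kept = [f.result() for f in greedy_filter(is_even, futures)]
```

The stream that comes out really is shorter, but only because every element was waited on first.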
Should that be called filter or are there other things (like producing a reduced stream of futures after interrogating them) that would have as much of a right to that name? If it's ambiguous then I'm not sure how best to proceed.
We could call it something else, I'm not too attached to the name. I'm mostly interested in functionality similar to what we have in the core streamz module, where values filtered out are not computed against for downstream nodes. I don't think we can take the approach of core streamz, since we'd need to read the result of the future to decide if it should be emitted or not. But maybe a no-op would have a similar effect, where the data is emitted but produces no output.
I wonder if this is just map, but with custom user code. You would have to make decisions about how to represent N/A data and such.
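A rough sketch of what that could look like in user code, written as plain Python functions that one would then hand to map. Everything here (Skipped, SKIPPED, keep_if, skip_null) is a hypothetical name chosen for illustration, and the marker class is just one possible way to represent N/A data. The marker is checked by type rather than identity, on the assumption that a copy of it should still be recognized.

```python
class Skipped:
    """Hypothetical marker for a value that was filtered out."""


SKIPPED = Skipped()


def keep_if(predicate):
    """Wrap a predicate so it returns the value itself or the marker."""
    def inner(x):
        if isinstance(x, Skipped):
            return x
        return x if predicate(x) else SKIPPED
    return inner


def skip_null(func):
    """Wrap a function so that marked values pass through untouched."""
    def inner(x):
        return x if isinstance(x, Skipped) else func(x)
    return inner


def is_even(x):
    return x % 2 == 0


def add_one(x):
    return x + 1


# Two mapped steps: the "filter", then a downstream computation.
steps = [keep_if(is_even), skip_null(add_one)]


def run(x):
    for step in steps:
        x = step(x)
    return x


results = [run(x) for x in range(5)]
# Only at the very end are the markers dropped.
emitted = [r for r in results if not isinstance(r, Skipped)]
```

Nothing is removed from the stream along the way. The filtered elements still flow through every node, but each wrapped step does no work on them, which is the no-op behaviour described above.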
Maybe, although it seems like an undue burden on the user code to handle DaskStream specific filter logic. Maybe there is a middle layer between the two.
My experience is that when faced with ambiguity one should resist the urge to choose and defer to the user, keeping core scope small. This applies only when making infrastructural libraries, though; for concrete applications it's a lot easier to be opinionated. I totally get where you're coming from.