
Quickstart lacks conda/environments/streamz_dev.yml

Open gdmcbain opened this issue 1 year ago • 6 comments

Having cloned this repo (master at b4f0450586), I failed to run the Quickstart; viz., running

cd ~/src/streamz
sh docker/build.sh

I get

EnvironmentFileNotFound: '/streamz/conda/environments/streamz_dev.yml' file not found.

Looking inside

https://github.com/python-streamz/streamz/blob/b4f0450586f5de40a2cd1232270db7d86fc00176/docker/build.sh#L3

and then

https://github.com/python-streamz/streamz/blob/b4f0450586f5de40a2cd1232270db7d86fc00176/Dockerfile#L19

I'm guessing that's because there's no conda/ subdirectory in the repo?

I see that this conda/ subdirectory is also referred to in the contributing guidelines for this issue-tracker.

gdmcbain avatar Sep 10 '23 23:09 gdmcbain

Thank you for having a look at this repository! It has been quite some time since I or anyone else contributed here, unfortunately. I don't remember where the environment file or conda/ dir might have gone.

The requirements are pretty minimal, however, and you should be able to get up and running. There are optional connectors to pandas, dask, hvplot and some more niche things, but you can play without those.

martindurant avatar Sep 11 '23 17:09 martindurant

O. K., thanks. I have indeed been able to get up and running. A somewhat stripped back Dockerfile &c. (minus Kafka, conda, wget, …) is in https://github.com/gdmcbain/streamz/tree/469-quickstart. With it, I was able to run the first couple of Jupyterlab notebooks I looked at (iterators_and_streamz, fibonacci). Very nice. Thank you!

gdmcbain avatar Sep 12 '23 11:09 gdmcbain

I do think that streamz is cool and could be very useful, but it doesn't fit into most people's conception of data processing. Let us know if you do something interesting with it!

martindurant avatar Sep 12 '23 14:09 martindurant

What I've got in mind (and thank you, @amotl, for introducing your application in #470, that could be very useful too) is numerical simulation of dynamical systems, something along the lines of

I've been doing this for a while, first using itertools and then itertoolz, but then, exactly as addressed in "Why not Python generator expressions?" (i.e., the raison d'être of Streamz),

this quickly becomes cumbersome, especially when building complex pipelines.

Where I got the idea of looking to Streamz was the stream interface of Scikit FiniDiff (which is itself reasonably active and, I think, does use streamz in a fairly integral way).

gdmcbain avatar Sep 13 '23 00:09 gdmcbain

Interesting, thank you. Do I understand that you don't use realtime events (from some external stimulus) at all, but push events into the stream? In that case, streamz is providing a handy visualisable branching/ pipelining solution, right?

martindurant avatar Sep 14 '23 13:09 martindurant

No realtime stimuli, no, it's all offline simulation. There are sometimes external stimuli. In the language of dynamical systems, some systems are autonomous, which means that they just evolve according to their own internal law; mathematically their differential or difference equation doesn't explicitly involve time, say f(x,dx/dt)=0. The others are nonautonomous, so say f(t,x,dx/dt)=0.

The distinction is a bit blurry because one can always take time t to be just another degree of freedom in x which evolves at constant unit rate, but if the structure of the model is to represent a real physical system with inputs or excitation, the distinction can be meaningful.
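To illustrate that last point, here is a minimal sketch in plain Python (generators only, no Streamz). It makes several assumptions that are not from the discussion above: the system is written in explicit form dx/dt = rhs(t, x) rather than the implicit f(t,x,dx/dt)=0, the forcing term cos(t) is an arbitrary example, and explicit Euler stands in for whatever integrator one would really use. The names `euler`, `forced_rhs` and `augmented_rhs` are hypothetical.

```python
import math
from itertools import islice


def euler(rhs, y0, dt):
    """Yield successive states of the autonomous system dy/dt = rhs(y).

    Explicit Euler is used purely for brevity; it is an assumption of
    this sketch, not a recommendation.
    """
    y = tuple(y0)
    while True:
        yield y
        y = tuple(yi + dt * fi for yi, fi in zip(y, rhs(y)))


def forced_rhs(t, x):
    """Hypothetical nonautonomous law: dx/dt = -x + cos(t)."""
    return -x + math.cos(t)


def augmented_rhs(y):
    """Autonomous form of the same law with state y = (t, x).

    Time becomes one more degree of freedom evolving at unit rate.
    """
    t, x = y
    return (1.0, forced_rhs(t, x))


dt = 0.25
# Start at t = 0 with x = 1 and take the first five states.
trajectory = list(islice(euler(augmented_rhs, (0.0, 1.0), dt), 5))
times = [t for t, _ in trajectory]
```

The stepper `euler` never sees time explicitly; it only advances the augmented state, whose first component happens to be t. Whether that is a useful way to structure a model with genuine external excitation is exactly the judgment call described above.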

So yes, pythonic generators are a pretty good fit but what I'm thinking might be even better is, as you say:

a handy visualisable branching/ pipelining solution

gdmcbain avatar Sep 14 '23 21:09 gdmcbain