BACKEND: ClusterMQ as a new backend

Open alexvorobiev opened this issue 6 years ago • 9 comments

I recently discovered ClusterMQ, which can run R code in SLURM/LSF/etc. jobs. Its biggest advantage over batchtools is that it uses ZMQ to transfer data directly to the distributed jobs. In my experience, the most serious bottleneck in batchtools is its use of a shared file system (NFS) for data transfer, especially when the data is large.

alexvorobiev, Mar 10 '18 04:03

Yes, @mschubert's ClusterMQ is a great candidate for a future backend. I don't have the resources to work on that myself right now. Having said that, and without having worked with ClusterMQ myself, I don't think it should be too much work to wrap it all up in a ClusterMQFuture, since a future backend is mostly a thin layer on top of an existing API.

Related: I'm working on setting up a conformance test suite (e.g. a future.tests package) that all future backend packages can use to verify that they are implemented correctly. That is my number one priority before working on new backends.

HenrikBengtsson, Mar 10 '18 21:03

I fully support this, but unfortunately my time is also quite limited these days.

mschubert, Mar 10 '18 22:03

Given that clustermq::Q() is synchronous, I am wondering what it would take to make an asynchronous ClusterMQFuture. Do we need local background processes to collect the results?

wlandau, Jun 28 '18 16:06

Will future.clustermq somehow allow for heterogeneous transient workers? Some drake users such as @jennysjaarda prefer transient future-based workers over persistent clustermq-based workers, e.g. https://github.com/ropensci/drake/issues/1083#issuecomment-564941327, but there is still the snag that batchtools is slower than clustermq.

wlandau, Dec 12 '19 14:12

Yes, it would be great if clustermq could somehow allow for transient workers!

jennysjaarda, Dec 17 '19 07:12

May I ask what the status of the backend is? Is it still planned to include clustermq as a backend to future, or is there already a way to get that functionality via some workaround?

As pointed out in this thread, clustermq is quite a bit more efficient than batchtools, which makes it very interesting for cluster usage.

wds15, Aug 12 '20 15:08

It's still on my wishlist, so yes, it's certainly on the to-do list. Resources and time are the limiting factor. Indirectly, a big step forward has been made, since automatic validation of new backends is now in place, cf. future.tests.

PS. I invite anyone to have a look at the very rudimentary first prototype future.clustermq and see if they can give it a push forward (PRs welcome).

HenrikBengtsson, Aug 12 '20 17:08

@HenrikBengtsson When I wanted to check out the future.clustermq link I got a 404.

Is this still set to private?

mschubert, Oct 14 '20 14:10

It would be awesome to get this going. I just made it public, but please note that it's very rudimentary/prototypical and I have not touched it for a very long time.

HenrikBengtsson, Oct 16 '20 20:10