👌 Allow copying between different computers with same `hostname`

Open • mbercx opened this issue 1 year ago • 1 comment

Is your feature request related to a problem? Please describe

In some cases, I want to use two different schedulers on the same remote machine, e.g. HyperQueue for small jobs that run on partial nodes, and Slurm for bigger jobs that need multiple nodes and a solid chunk of walltime. Currently, this means I have to set up two computers with different schedulers. However, if one calculation needs to copy or symlink files from a previous one that ran on a different scheduler (i.e. computer), this fails with a `NotImplementedError`, since the execmanager compares the computer UUIDs:

https://github.com/aiidateam/aiida-core/blob/8a2fece02411c982eb16e8fed8991ffaf75fa76f/aiida/engine/daemon/execmanager.py#L250-L272

Describe the solution you'd like

One solution that I've been running with locally is to compare the `hostname` of the computers instead, which seemed sensible at first glance. Are there cases where this breaks, however?
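To make the difference between the two checks concrete, here is a minimal sketch. It does not use the real AiiDA classes: `ComputerStub`, `same_computer` and `same_host` are hypothetical names made up for illustration, and the hostname and scheduler strings are placeholders.

```python
from dataclasses import dataclass
from uuid import uuid4


@dataclass(frozen=True)
class ComputerStub:
    """Hypothetical stand-in for a configured computer (not the AiiDA class)."""

    label: str
    hostname: str
    scheduler_type: str
    uuid: str


def same_computer(source: ComputerStub, target: ComputerStub) -> bool:
    """Strict check: both nodes must live on the very same computer entry."""
    return source.uuid == target.uuid


def same_host(source: ComputerStub, target: ComputerStub) -> bool:
    """Relaxed check proposed here: the two entries point at the same hostname."""
    return source.hostname == target.hostname


# Two computer entries for one machine, differing only in scheduler.
slurm = ComputerStub("cluster-slurm", "cluster.example.org", "slurm", str(uuid4()))
hq = ComputerStub("cluster-hq", "cluster.example.org", "hyperqueue", str(uuid4()))

print(same_computer(slurm, hq))  # False: distinct UUIDs, so a remote copy is refused
print(same_host(slurm, hq))      # True: same hostname, so a remote copy would be allowed
```

One case where a plain hostname comparison could plausibly be too permissive: two entries that share a hostname but connect as different users, who may not be able to read each other's files.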

Describe alternatives you've considered

It's clear that a computer can be used with multiple schedulers. Besides the HyperQueue case, you might want to run e.g. an aiida-shell job directly on the login node. Instead of setting up multiple computers, maybe a computer could be configured with multiple schedulers, one of which is the default, while the others can be selected by setting an option?
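As a rough sketch of what this alternative could look like, assuming nothing about how it would actually be implemented: `MultiSchedulerComputer` and `resolve_scheduler` are hypothetical names, and the scheduler strings are illustrative labels rather than real plugin entry points.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class MultiSchedulerComputer:
    """Hypothetical computer entry that knows about several schedulers."""

    hostname: str
    default_scheduler: str
    extra_schedulers: Tuple[str, ...] = ()

    def resolve_scheduler(self, requested: Optional[str] = None) -> str:
        """Return the scheduler to use: the default unless an option overrides it."""
        if requested is None:
            return self.default_scheduler
        if requested == self.default_scheduler or requested in self.extra_schedulers:
            return requested
        raise ValueError(f"scheduler {requested!r} is not configured for {self.hostname}")


cluster = MultiSchedulerComputer(
    hostname="cluster.example.org",
    default_scheduler="slurm",
    extra_schedulers=("hyperqueue", "direct"),
)

print(cluster.resolve_scheduler())              # slurm
print(cluster.resolve_scheduler("hyperqueue"))  # hyperqueue
```

With a single computer entry, both calculations would share one UUID, so the existing UUID comparison would no longer block the remote copy.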

Additional context

Related to https://github.com/aiidateam/aiida-core/issues/5084

mbercx, Sep 24 '23 21:09

Note: This is also important if you e.g. share a work chain with another user and this work chain has files stashed on the remote that need to be copied for a next step.

mbercx, Nov 29 '23 22:11