aiida-core
👌 Allow copying between different computers with same `hostname`
Is your feature request related to a problem? Please describe
In some cases, I want to use two different schedulers on the same remote, e.g. `hyperqueue` for small jobs that I want to run on partial nodes, but Slurm for bigger jobs where I need multiple nodes and a solid chunk of walltime. Currently, this means I have to set up two computers with different schedulers. However, if one calculation needs to copy/symlink files from a previous one run on a different scheduler (i.e. computer), this currently fails with a `NotImplementedError` since the `execmanager` compares the computer UUIDs:
https://github.com/aiidateam/aiida-core/blob/8a2fece02411c982eb16e8fed8991ffaf75fa76f/aiida/engine/daemon/execmanager.py#L250-L272
Describe the solution you'd like
One solution that I've been running with locally is to compare the `hostname` of the computers instead, which seemed sensible at first glance. There may be certain cases where this breaks, however?
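To make the proposal concrete, here is a minimal sketch of the two checks side by side. It does not use aiida-core itself: `ComputerStub`, `check_can_copy` and the `by_hostname` flag are hypothetical names invented for illustration, and the hostname `cluster.example.org` is a placeholder. The real check lives in the `execmanager` code linked above and may differ in detail.

```python
from dataclasses import dataclass
from uuid import uuid4


@dataclass(frozen=True)
class ComputerStub:
    """Hypothetical stand-in for a computer; holds only the compared fields."""

    label: str
    hostname: str
    uuid: str


def check_can_copy(source: ComputerStub, target: ComputerStub, by_hostname: bool = True) -> None:
    """Raise NotImplementedError if a remote copy between the two is refused.

    by_hostname=False mimics the current behaviour described in this issue
    (UUID comparison); by_hostname=True is the proposed relaxation.
    """
    if by_hostname:
        same_remote = source.hostname == target.hostname
    else:
        same_remote = source.uuid == target.uuid
    if not same_remote:
        raise NotImplementedError(
            f"Remote copy between '{source.label}' and '{target.label}' is not supported"
        )


# Two computer entries for the same machine, one per scheduler.
slurm = ComputerStub("cluster-slurm", "cluster.example.org", str(uuid4()))
hq = ComputerStub("cluster-hq", "cluster.example.org", str(uuid4()))

check_can_copy(hq, slurm)  # accepted: hostnames match

try:
    check_can_copy(hq, slurm, by_hostname=False)  # refused: UUIDs differ
except NotImplementedError as exc:
    print(exc)
```

One case where a pure hostname comparison might be too permissive is two computers that both use a generic hostname such as `localhost` but do not actually share a filesystem, so a stricter variant could also compare other settings.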
Describe alternatives you've considered
It's clear that a computer can be used with multiple schedulers. Besides the `hyperqueue` case, you might want to run e.g. an `aiida-shell` job directly on the login node. Instead of setting up multiple computers, maybe a computer could be configured with multiple schedulers, with one as the default and the others selectable by setting an option?
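As a rough illustration of this alternative, the sketch below models a single computer that knows several schedulers and falls back to a default when no option is given. Everything here is hypothetical: `MultiSchedulerComputer`, `resolve_scheduler` and the scheduler labels are made up for the example and do not correspond to existing aiida-core classes, options or entry point names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MultiSchedulerComputer:
    """Hypothetical computer configured with a default and extra schedulers."""

    hostname: str
    default_scheduler: str
    extra_schedulers: frozenset = frozenset()

    def resolve_scheduler(self, requested: Optional[str] = None) -> str:
        """Return the scheduler to use for a job.

        `requested` plays the role of a per-calculation option; when it is
        omitted, the default scheduler is used.
        """
        if requested is None:
            return self.default_scheduler
        if requested == self.default_scheduler or requested in self.extra_schedulers:
            return requested
        raise ValueError(
            f"Scheduler '{requested}' is not configured for '{self.hostname}'"
        )


cluster = MultiSchedulerComputer(
    hostname="cluster.example.org",  # placeholder hostname
    default_scheduler="slurm",
    extra_schedulers=frozenset({"hyperqueue", "direct"}),
)

print(cluster.resolve_scheduler())              # default scheduler
print(cluster.resolve_scheduler("hyperqueue"))  # selected via an option
```

With a single computer entry, all calculations would share one computer UUID, so the copy/symlink check described above would no longer be an obstacle.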
Additional context
Related to https://github.com/aiidateam/aiida-core/issues/5084
Note: This is also important if you, for example, share a work chain with another user and that work chain has files stashed on the remote that need to be copied for the next step.