aiida-core icon indicating copy to clipboard operation
aiida-core copied to clipboard

Added a `remotecat` command for printing remote file of running calcjobs

Open zhubonan opened this issue 3 years ago • 11 comments

This is a command that I found useful - perhaps it is good to commit it to the core. It can be useful for monitoring the calcjob while they are running.

I have not yet able to add tests for it - the existing test framework for calcjob commands does not include remote contents (the nodes are extracted from an archive). Not sure how to make the test work, perhaps the only way is to mock the getfile method of the RemoteData node?

zhubonan avatar Apr 17 '21 15:04 zhubonan

Hey @zhubonan , thanks for the contribution! We are currently doing/discussing some changes in how we are handling the connections for external resources (in particular, ssh connections), so we need to consider how this feature fits into all that. Given that it is a command line feature, it might still be useful to have something like it but perhaps the underlying workings might need to be different.

I'm pinging @sphuber and @giovannipizzi if they can take a look. Anyways, I'll try to bring the subject up on the next AiiDA meeting (it is this wednesday at 15:00 CEST, these are open for participation so let me know if you want to be present).

ramirezfranciscof avatar Apr 19 '21 09:04 ramirezfranciscof

I see. This is just a small feature that I think might be useful. It does not touch the ssh interface though (only RemoteData.getfile is used).

Feel free to come back to this later.

zhubonan avatar Apr 19 '21 09:04 zhubonan

+1 I'd definitely use this feature. I'm wondering if it can also be extended to monitor the calculation. I typically do this with the tail command, after doing verdi calcjob gotocomputer:

$ tail -f aiida.out 

It would be nice to just be able to do e.g.:

verdi calcjob remotecat --monitor <PK>

mbercx avatar Apr 19 '21 09:04 mbercx

Hi @zhubonan , thanks for this contribution!

have not yet able to add tests for it - the existing test framework for calcjob commands does not include remote contents (the nodes are extracted from an archive).

You could add a test to the "system tests", some of which run calculations on a slurm docker container on github actions. See e.g. https://github.com/aiidateam/aiida-core/blob/fa8b24379f5731cef43ca1648d18142d0c55de97/.github/system_tests/pytest/test_memory_leaks.py#L48-L62

I might not be the right person to review this PR but we'll discuss it at the meeting

ltalirz avatar Apr 19 '21 10:04 ltalirz

It does not touch the ssh interface though (only RemoteData.getfile is used).

Right, so this is one of the things. The RemoteData methods open and close the ssh connection with the external resource every time you call them. We are considering changing this so that you need to first open the transport yourself and then pass the open connection to the RemoteData method you want to use (this is slightly more complicated but would allow users to systematically copy files in a more efficient way that may currently be prohibiting for some use cases). This might require some adaptation of this feature; I just want to make sure we are considering all angles before going through since this is related to a part of the code that is being re-designed.

ramirezfranciscof avatar Apr 19 '21 10:04 ramirezfranciscof

I think that in principle it would be ok to add this command even though we are not yet sure of how the transports may change, since if and when we change it, it should not affect the interface and so the user shouldn't be affected.

That being said: in a sense this command already exists in verdi data remote cat. Now I realize that this command provides some shortcuts as in it automatically gets the PK of the remote_folder and it can get the default output name if the plugin specifies it. So there might be an argument to add this anyway, but then i would consider having the implementation be as close to the other as possible, or at least use the implementation under the hood.

sphuber avatar Apr 19 '21 12:04 sphuber

That being said: in a sense this command already exists in verdi data remote cat. @sphuber I see. I will reflector the code to use the implement that verdi data remote cat uses.

+1 I'd definitely use this feature. I'm wondering if it can also be extended to monitor the calculation. I typically do this with the tail command, after doing verdi calcjob gotocomputer:

@mbercx I have coded this up as well actually based on gotocomputer. The RemoteData interface does not allow tracking a file easily. I will see if I can add it to the PR as well.

zhubonan avatar Apr 19 '21 13:04 zhubonan

Codecov Report

Merging #4861 (9082559) into develop (68df47e) will increase coverage by 6.60%. The diff coverage is 64.29%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #4861      +/-   ##
===========================================
+ Coverage    73.50%   80.10%   +6.60%     
===========================================
  Files          515      515              
  Lines        36665    36742      +77     
===========================================
+ Hits         26947    29427    +2480     
+ Misses        9718     7315    -2403     
Flag Coverage Δ
django 74.58% <64.29%> (?)
sqlalchemy 73.50% <64.29%> (+0.01%) :arrow_up:

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
aiida/cmdline/commands/cmd_calcjob.py 66.50% <60.00%> (-2.30%) :arrow_down:
aiida/transports/plugins/local.py 81.41% <100.00%> (ø)
aiida/transports/plugins/ssh.py 80.14% <100.00%> (ø)
aiida/transports/transport.py 66.22% <100.00%> (ø)
aiida/orm/utils/serialize.py 100.00% <0.00%> (ø)
aiida/tools/importexport/dbimport/backends/sqla.py 93.64% <0.00%> (+0.03%) :arrow_up:
...ida/tools/importexport/dbimport/backends/common.py 97.34% <0.00%> (+0.20%) :arrow_up:
aiida/manage/tests/pytest_fixtures.py 93.34% <0.00%> (+0.28%) :arrow_up:
aiida/schedulers/plugins/sge.py 70.30% <0.00%> (+0.42%) :arrow_up:
aiida/manage/tests/__init__.py 88.54% <0.00%> (+0.92%) :arrow_up:
... and 80 more

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 68df47e...9082559. Read the comment docs.

codecov[bot] avatar Apr 19 '21 13:04 codecov[bot]

@ltalirz @sphuber I have updated the code and added a test. This PR is ready to be merged now.

zhubonan avatar Jun 15 '21 22:06 zhubonan

Hey thanks @zhubonan, I'm seeing a lot of lines not covered by tests 😬 would it be possible to look into increasing the test coverage for these scenarios?

image

chrisjsewell avatar Jun 16 '21 06:06 chrisjsewell

@chrisjsewell I tried to improve the coverages but it turns out additional changes are needed to support the --monitor option for multiple transport plugins. I have tweaked the gotocomputer_command method to allow additional arguments to be passed to facilitate running tail -f <filename>. However, this command cannot be tested directly as the os.system will never return I think, just like that for the standard gotocomputercommand.

zhubonan avatar Jun 22 '21 13:06 zhubonan