Add jupyter notebooks describing the use of PyAutoFEP functions to the docs

Open tbwxmu opened this issue 3 years ago • 15 comments

Thanks for sharing your nice work. Could you create a Jupyter notebook describing how to use your scripts? It would be a great help for newcomers to PyAutoFEP.

tbwxmu avatar Aug 03 '21 07:08 tbwxmu

PyAutoFEP was originally designed to be used as standalone scripts, not as an API, although one can certainly import the modules and call their functions. I believe that, for regular use, the manual (https://github.com/luancarvalhomartins/PyAutoFEP/blob/master/docs/Manual.pdf) and the recently uploaded tutorial (https://github.com/luancarvalhomartins/PyAutoFEP/tree/master/docs/tutorial01) are better starting points than a notebook would be.

What exactly are you trying to do? I guess I can try writing a Jupyter NB using the PyAutoFEP functions for specific cases.

luancarvalhomartins avatar Aug 05 '21 21:08 luancarvalhomartins

Thanks for the detailed tutorial. However, I encountered a GROMACS error while running prepare_dual_topology.py:

================ Working on pairs ================
Perturbation Pose Coordinates
============ Working on FXR_12-FXR_74 ============
FXR_12 -> FXR_74
================ Building system =================
[ERROR] Failed to run gmx grompp. Error code 1.
(.......)
[ERROR] Fatal error:
[ERROR] number of coordinates in coordinate file
[ERROR] (/tmp/tmpgovsu8on/tutorial/FXR_12-FXR_74/protein/build_system_180336_07082021/systemsolvated_step4_180336_07082021.gro, 44742) does not match topology
[ERROR] (/tmp/tmpgovsu8on/tutorial/FXR_12-FXR_74/protein/build_system_180336_07082021/systemsolv_step4_180336_07082021.top, 3930)

Environment:

  • pyautofep: latest commit (ce9096)
  • openbabel: (not used as I keep original input file in workdir.tgz)
  • gromacs: 2021.2

Please help with the problem and any suggestions are welcome. Thanks!
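For anyone debugging the same message, the coordinate side of the mismatch can be checked by hand: in the .gro format the second line holds the declared atom count, and every line between it and the final box line is one atom. The sketch below is only an illustration with a made-up two-atom file; the helper names are hypothetical and are not part of PyAutoFEP or GROMACS.

```python
import os
import tempfile


def declared_atom_count(gro_path):
    """Return the atom count stored on the second line of a .gro file."""
    with open(gro_path) as handle:
        handle.readline()              # line 1: free-form title
        return int(handle.readline())  # line 2: number of atoms


def listed_atom_count(gro_path):
    """Count atom records: every line except title, count, and box."""
    with open(gro_path) as handle:
        return len(handle.read().splitlines()) - 3


# Tiny made-up two-atom file, used only to exercise the helpers.
SAMPLE = (
    "Two-atom demo\n"
    "    2\n"
    "    1SOL     OW    1   0.126   1.624   1.679\n"
    "    1SOL    HW1    2   0.190   1.661   1.747\n"
    "   1.86206   1.86206   1.86206\n"
)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.gro")
    with open(path, "w") as handle:
        handle.write(SAMPLE)
    declared = declared_atom_count(path)
    listed = listed_atom_count(path)

print(declared, listed)
```

If both numbers agree with the count grompp reports for the .gro file, the coordinate file is internally consistent and the topology file is the one to inspect.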

jackzzs avatar Aug 07 '21 10:08 jackzzs

Thanks for the comment and the report.

I have been told about this bug before, but I could not reproduce or solve it, as it seems to be very system-specific. It appears to be related to a filesystem delay or buffering when editing files in /tmp. Is your /tmp a tmpfs RAM disk? Are you running prepare_dual_topology.py on a diskless node?

Would you mind retrying with --output_hidden_temp_dir=False? This will force the use of $PWD instead of /tmp. If this solves the issue for you, maybe it should be the default.
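Whether a run works in /tmp or in the working directory comes down to where the scratch directory is created. As a rough sketch of that choice (this is not PyAutoFEP's actual code; make_scratch_dir and its parameters are hypothetical), Python's tempfile.mkdtemp accepts a dir argument that selects the parent directory:

```python
import os
import tempfile


def make_scratch_dir(use_system_tmp, base_dir=None):
    """Create a scratch directory in the system temp dir or under base_dir."""
    if use_system_tmp:
        # Parent is chosen by tempfile (usually /tmp on Linux).
        return tempfile.mkdtemp(prefix="fep_")
    # Parent is the given directory, or the current working directory.
    return tempfile.mkdtemp(prefix=".fep_", dir=base_dir or os.getcwd())


# A throwaway directory stands in for the project directory in this demo.
with tempfile.TemporaryDirectory() as project_dir:
    scratch = make_scratch_dir(use_system_tmp=False, base_dir=project_dir)
    stays_local = os.path.dirname(scratch) == project_dir

print(stays_local)
```

Keeping the scratch directory next to the output avoids depending on how /tmp is mounted on a given machine, at the cost of leaving intermediate files on the project disk while the script runs.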

luancarvalhomartins avatar Aug 07 '21 16:08 luancarvalhomartins

I retried running prepare_dual_topology.py with --output_hidden_temp_dir=False and got no error.

I ran the script on a workstation, and /tmp is mounted on a hard disk to allow large temp files, which may have caused the problem you described.

Thanks!

jackzzs avatar Aug 09 '21 07:08 jackzzs

@jackzzs Thanks for testing and confirming. I am filing a separate bug report for the "number of coordinates in coordinate file does not match topology" error.

@tbwxmu Should you have any specific suggestion for a use of a Jupyter NB, please let me know. I agree that having some example notebooks would be useful to demonstrate how to use PyAutoFEP as an API, but mainly for particular use cases. I am keeping this issue open to keep track of this idea.

luancarvalhomartins avatar Aug 09 '21 11:08 luancarvalhomartins

I would like to ask a question regarding the above problem. I ran tutorial01 the other day. If I leave all of the following lines commented out, as in the default file, it works fine, but if I change #output_scripttype = slurm to output_scripttype = bash, it stops working. The error message is the same as in the comment above (number of coordinates in coordinate file does not match topology).

# Sets the path to GROMACS executable in the run node, uncomment and modify if needed.
gmx_bin_run = /home/gromacs-2020.6/bin

# Options controlling the output, see manual for more info. Uncomment as needed
# FEP legs are to be submitted to a slurm scheduler.
# output_scripttype = slurm
# Run these commands at the beginnig of the jog (useful to load modules, importing libs).
# output_runbefore = module load python3; module load cuda
# Fine-tune job resources
# output_resources = all_cpus:24; all_gpu:2; all_times:24
# Use a python file instead of a binary during the collect step
# output_collecttype = python

This should be the same as the default setting, but it doesn't seem to work. Is there any difference? I would appreciate your comments.

baba-hashimoto avatar Sep 02 '21 05:09 baba-hashimoto

I guess that's unrelated to output_scripttype and is rather an intermittent error. The code that prepares the water and complex legs is the same regardless of which output_scripttype is being used. Does using --output_hidden_temp_dir=False solve your issue? Also, did PyAutoFEP print a message suggesting that you do that? It should have; I added that in 24369b3d3bf01a03cc3c7c0196480a851e2a0298

luancarvalhomartins avatar Sep 02 '21 23:09 luancarvalhomartins

Thank you for your quick reply. Here is the command:

python ../../prepare_dual_topology.py --config_file=step2.ini --output_hidden_temp_dir=False

Here are the errors we got.

[ERROR] This is likely caused by a failing to edit the intermediate topology file /work113/test/PyAutoFEP/docs/tutorial01/tutorial
[ERROR]
[ERROR]
[ERROR] output_scripttype = bash/FXR_12-FXR_74/protein/build_system_101302_03092021/systemsolv_step4_101302_03092021.top. Rerunning with output_hidden_temp_dir=False may solve this issue.
=================== STACK INFO ===================
  File "../../prepare_dual_topology.py", line 3290, in <module>
    solvate_data=solvate_data, verbosity=arguments.verbose)
  File "../../prepare_dual_topology.py", line 480, in prepare_complex_system
    msg_verbosity=os_util.verbosity_level.error, current_verbosity=verbosity)
  File "/work113/test/PyAutoFEP/os_util.py", line 292, in local_print
    formatted_string += '\n{:=^50}\n{}{:=^50}'.format(' STACK INFO ', ''.join(traceback.format_stack()),
=================== STACK INFO ===================

I added the option "--output_hidden_temp_dir=False", but it did not work. The support message is as follows.

[ERROR] output_scripttype = bash/FXR_12-FXR_74/protein/build_system_101302_03092021/systemsolv_step4_101302_03092021.top. Rerunning with output_hidden_temp_dir=False may solve this issue.

By the way, when I ran the original file with "output_scripttype = bash" commented out, it worked without any problems. I would like to use the job scheduler if possible, so it would be nice if output_scripttype is available.

If you have any solutions, I would appreciate it if you could let me know.

baba-hashimoto avatar Sep 03 '21 01:09 baba-hashimoto

Is the following line part of the error message?

[ERROR] output_scripttype = bash/FXR_12-FXR_74/protein/build_system_101302_03092021/systemsolv_step4_101302_03092021.top.

If it is, would you mind double-checking your input file, including the newline characters? If it looks fine, would you mind opening an issue with the full log? (In case it may contain sensitive data, feel free to contact me by email.)

luancarvalhomartins avatar Sep 03 '21 23:09 luancarvalhomartins

I'm sorry for the late response. I have saved the results of running the program in the following link, please check them. The same data is used in both cases. The only difference is that the output_scripttype is commented out or not.

https://github.com/baba-hashimoto/PyAutoFEP/tree/master/docs

The file names are tutorial01_sample1 and tutorial01_sample2.tar.gz. tutorial01_sample1 contains only the results of FXR_12-FXR_74 (due to the limitation of the number and size of the files). Sample1 worked fine, sample2 stopped in the middle. If you need files other than FXR_12-FXR_74, please inform me.

baba-hashimoto avatar Sep 06 '21 02:09 baba-hashimoto

@baba-hashimoto Sorry for the late reply.

Ok, two things. First, your perturbations_dir contains /, likely because you want to specify a full output path. Although this is not something I anticipated when coding, it seems to work fine, so I am assuming that this is not the problem. Make sure the destination dir exists.

The actual problem is much simpler than I thought. In .ini files, as read by configparser (see: https://docs.python.org/3/library/configparser.html), a line that starts with whitespace is treated as a continuation of the previous option's value (a multiline value):

[Multiline Values]
chorus = I'm a lumberjack, and I'm okay
    I sleep all night and I work all day

So your step2.ini should read:

# Options controlling the output, see manual for more info. Uncomment as needed
# FEP legs are to be submitted to a slurm scheduler
output_scripttype = bash # <<<< Note the absence of leading space
# Run these commands at the beginnig of the jog (useful to load modules, importing libs)
# output_runbefore = module load python3; module load cuda
# Fine-tune job resources
# output_resources = all_cpus:24; all_gpus:2; all_time: 24
# Use a python file instead of a binary during the collect step
# output_collecttype = python
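The effect can be reproduced with the standard library alone. The sketch below uses a shortened, made-up config (the path is a placeholder) and a default ConfigParser; how PyAutoFEP itself configures its parser is not assumed here. With the leading space, the indented line is appended to the previous option, which matches the mangled path in the error message reported above.

```python
import configparser

# Shortened stand-in for step2.ini; note the space before output_scripttype.
BROKEN = """\
[prepare_dual_topology]
perturbations_dir = /work113/tutorial

# FEP legs are to be submitted to a slurm scheduler
 output_scripttype = bash
"""

# Same text with the leading space removed.
FIXED = BROKEN.replace("\n output_scripttype", "\noutput_scripttype")


def load(text):
    """Parse the text and return the prepare_dual_topology section."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return parser["prepare_dual_topology"]


broken = load(BROKEN)
fixed = load(FIXED)

# The indented line became part of perturbations_dir instead of a new option.
print(repr(broken["perturbations_dir"]))
print("output_scripttype" in broken)

# Without the leading space, both options are parsed separately.
print(repr(fixed["perturbations_dir"]), repr(fixed["output_scripttype"]))
```

A default ConfigParser also keeps text after an inline # as part of the value unless it is created with inline_comment_prefixes, so the "# <<<<" annotation in the listing above is best left out of a real file.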

luancarvalhomartins avatar Sep 11 '21 12:09 luancarvalhomartins

The actual problem is much simpler than I thought. In .ini files, as read by configparser (see: https://docs.python.org/3/library/configparser.html), a line that starts with whitespace is treated as a continuation of the previous option's value (a multiline value):

The point above was correct. I'm not very familiar with Python, so I didn't notice it. After removing the leading space, it works fine. I apologize for the inconvenience.

baba-hashimoto avatar Sep 13 '21 01:09 baba-hashimoto

Thank you for your help last time. This time, when I set output_scripttype to pbs, I got the following error. When output_scripttype is slurm, it worked fine. I am writing the input file after the error message, so please let me know if there is a problem.

================= Perturbations ==================
State A State B
FXR_12 FXR_74
FXR_12 FXR_76
FXR_12 FXR_84
FXR_12 FXR_85
FXR_12 FXR_88
==================================================
================ Working on pairs ================
Perturbation Pose Coordinates
Traceback (most recent call last):
  File "../../prepare_dual_topology.py", line 3157, in <module>
    verbosity=arguments.verbose)
  File "../../prepare_dual_topology.py", line 660, in prepare_output_scripts_data
    temp_str = template_section['header']
  File "/home1/tbaba/ubuntu20/anaconda3/envs/PyAutoFEP/lib/python3.6/configparser.py", line 1233, in __getitem__
    raise KeyError(key)
KeyError: 'header'

The input is as follows.

# This is the configuration section for prepare_dual_topology.py (which is the only section in this file, but the [ section ] is mandatory)
[prepare_dual_topology]

# Read ligands and topologies from this folder
input_ligands = lig_data

# Read the macromolecule structure from this file
structure = receptor_data/5q17_processed.pdb

# This is the force field directory, it will be copied to each perturbation dir and used to prepare the MD systems
extradirs = oplsaam.ff

# Options controlling the core-constrained superimposition
# First select the use of it instead of reading all ligand poses
pose_loader = superimpose

# Use this pose as the reference for the superimposition
poses_reference_pose_superimpose = receptor_data/9mv.pdb

# Name of the output. This will be a self-extracting bash file
perturbations_dir = /work113/PyAutoFEP/docs/tutorial01_sample4/tutorial

# Sets the path to GROMACS executable in the run node, uncomment and modify if needed.
# gmx_bin_run = /usr/local/bin/gromacs

# Options controlling the output, see manual for more info. Uncomment as needed
# FEP legs are to be submitted to a slurm scheduler
output_scripttype = pbs
# Run these commands at the beginnig of the jog (useful to load modules, importing libs)
#output_runbefore = module load python3; module load cuda
# Fine-tune job resources
output_resources = all_nodes:1; all_cpus:10; all_gpus:1
# Use a python file instead of a binary during the collect step
output_collecttype = python

baba-hashimoto avatar Sep 15 '21 10:09 baba-hashimoto

@baba-hashimoto This is another bug (#7), sorry. I just fixed it in trunk. Would you mind updating and retrying?
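For readers who hit a similar traceback: a KeyError raised from configparser's __getitem__ means the section exists but has no option with that name. The sketch below only illustrates that failure mode and one way to report it more clearly; the template text, section names, and get_header helper are made up and do not reflect PyAutoFEP's real templates or the actual fix.

```python
import configparser

# Made-up templates file: the pbs section lacks a 'header' option.
TEMPLATES = """\
[slurm]
header = echo slurm job

[pbs]
shebang = /bin/bash
"""

parser = configparser.ConfigParser()
parser.read_string(TEMPLATES)


def get_header(scripttype):
    """Return the header template, failing with a descriptive message."""
    section = parser[scripttype]
    if "header" not in section:
        raise ValueError(f"template for {scripttype!r} has no 'header' entry")
    return section["header"]


print(get_header("slurm"))

try:
    parser["pbs"]["header"]  # plain lookup: raises KeyError('header')
except KeyError as error:
    print("KeyError:", error)
```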

Thanks

luancarvalhomartins avatar Sep 15 '21 13:09 luancarvalhomartins

Thanks for your quick support. It works fine.

baba-hashimoto avatar Sep 16 '21 00:09 baba-hashimoto

After the tutorial, it seems that this issue is no longer relevant. I hope I will be able to add another tutorial soon. Jupyter is not really the suggested way of using PyAutoFEP, but should anything interesting using its API pop up, I'll add a doc. Hopefully, someone else can also write tutorials and documentation.

luancarvalhomartins avatar Jun 20 '23 08:06 luancarvalhomartins