Running BGCFlow in DTU LSF HPC

Open matinnuhamunada opened this issue 4 months ago • 0 comments

DTU students and staff who want to run BGCFlow on the LSF HPC facility can follow these steps:

  • Access the HPC by following the guide at https://www.hpc.dtu.dk/?page_id=2501
# log in to the login node
ssh [email protected]
  • Install Miniforge on the login node: https://github.com/conda-forge/miniforge#mambaforge

  • Get extra space by requesting a scratch dir: email HPC support at [email protected] (see https://www.hpc.dtu.dk/?page_id=927)

# once you have been assigned a scratch dir, store its path in a variable
SCRATCH_DIR="/work3/<user_id>/" # change the user ID accordingly

# create a symlink named "drive" in your home dir that points to the scratch dir
ln -s "$SCRATCH_DIR" ~/drive
  • Install BGCFlow on the login node: https://github.com/NBChub/bgcflow?tab=readme-ov-file#quick-start
  • Create symlinks to the BGCFlow config and workflow directories on the scratch dir
cd ~/drive/
mkdir bgcflow
cd bgcflow
# we keep the config and workflow in the home directory because it is backed up
ln -s ~/bgcflow/workflow/ workflow
ln -s ~/bgcflow/config/ config

This results in something like this:

gbarlogin1(matinnu) $ tree
.
├── config -> /zhome/b2/0/153431/bgcflow/config/
└── workflow -> /zhome/b2/0/153431/bgcflow/workflow/

2 directories, 0 files
(base) ~/drive/bgcflow
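
The symlink steps can also be wrapped in a small shell function, which makes them easier to repeat and fails early if the BGCFlow clone is not where you expect. This is only a sketch: `link_bgcflow_dirs` is a hypothetical helper name, not part of BGCFlow, and the paths in the usage line assume the layout described in this issue (`~/bgcflow` for the clone, `~/drive` for the scratch symlink).

```shell
#!/usr/bin/env bash
# Sketch: link the BGCFlow config and workflow directories into a working
# directory on scratch. `link_bgcflow_dirs` is a hypothetical helper name.
link_bgcflow_dirs() {
    local src="$1"   # BGCFlow clone, e.g. ~/bgcflow (assumed location)
    local dest="$2"  # working directory on scratch, e.g. ~/drive/bgcflow
    local name

    mkdir -p "$dest" || return 1
    for name in config workflow; do
        # stop early if the clone does not contain the expected directory
        if [ ! -d "$src/$name" ]; then
            echo "missing directory: $src/$name" >&2
            return 1
        fi
        # -s: symbolic link, -f: replace an existing link,
        # -n: do not descend into an existing link that points to a directory
        ln -sfn "$src/$name" "$dest/$name" || return 1
    done
}
```

Usage, under the same assumptions: `link_bgcflow_dirs ~/bgcflow ~/drive/bgcflow`, followed by `tree ~/drive/bgcflow` to confirm the two links.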
  • Install the LSF plugin
linuxsh # go to one of the worker nodes

conda run -n bgcflow mamba install bioconda::snakemake-executor-plugin-lsf -y
  • Execute the workflow
# IMPORTANT: run this from a worker node (via linuxsh)
cd ~/drive/bgcflow/
conda run -n bgcflow snakemake --executor lsf --default-resources lsf_project=<project name> lsf_queue=<hpc> --use-conda --jobs <njobs>

Alternatively, use a profile such as:

jobs: 1
executor: lsf
default-resources:
  - 'lsf_project=matinnu'
  - 'lsf_queue=hpc'

To find the right queue, use: bqueues -u <user id>
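
A profile like the one above has to live in a file named `config.yaml` inside a profile directory. As a rough sketch, the helper below writes such a file from a project name and queue. `write_lsf_profile` is a hypothetical name, and the directory used in the usage note is an assumption rather than a BGCFlow convention.

```shell
#!/usr/bin/env bash
# Sketch: write a Snakemake profile for the LSF executor.
# `write_lsf_profile` is a hypothetical helper name.
write_lsf_profile() {
    local profile_dir="$1"  # directory that will hold config.yaml
    local project="$2"      # value for lsf_project
    local queue="$3"        # value for lsf_queue
    local jobs="${4:-1}"    # max concurrent jobs; defaults to 1

    mkdir -p "$profile_dir" || return 1
    # unquoted heredoc delimiter so the variables are expanded
    cat > "$profile_dir/config.yaml" <<EOF
jobs: $jobs
executor: lsf
default-resources:
  - 'lsf_project=$project'
  - 'lsf_queue=$queue'
EOF
}
```

For example, `write_lsf_profile ~/drive/bgcflow/profiles/lsf matinnu hpc 4` writes the file (substitute your own project name and queue). The workflow can then be started from `~/drive/bgcflow/` with `conda run -n bgcflow snakemake --profile profiles/lsf --use-conda`, since Snakemake's `--profile` option accepts a path to a profile directory.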

  • The final structure will look like this:
gbarlogin1(matinnu) $ (cd ~ && tree bgcflow/ drive/ -L 2)
bgcflow/
├── CITATION.cff
├── config
│   ├── config.yaml
│   ├── Lactobacillus_delbrueckii
│   └── lanthipeptide_lactobacillus
├── Dockerfile
├── envs.yaml
├── LICENSE
├── profiles
│   └── config.yaml
├── README.md
├── resources
│   └── README.md
└── workflow
    ├── Alleleome
    ├── BGC
    ├── bgcflow
    ├── Database
    ├── envs
    ├── lsabgc
    ├── Metabase
    ├── misc
    ├── notebook
    ├── ppanggolin
    ├── report
    ├── Report
    ├── rules
    ├── rules_bgc.yaml
    ├── rules_ppanggolin.yaml
    ├── rules.yaml
    ├── schemas
    ├── scripts
    └── Snakefile
drive/
└── bgcflow
    ├── config -> /zhome/b2/0/153431/bgcflow/config/
    └── workflow -> /zhome/b2/0/153431/bgcflow/workflow/

17 directories, 19 files
(base) ~

matinnuhamunada, Oct 10 '24 11:10