ebi-metagenomics-cwl
containers
- [ ] seqprep
- [ ] trimmomatic
  - [ ] evaluate https://github.com/common-workflow-language/workflows/blob/master/tools/trimmomatic-Dockerfile
- [ ] "biopython"
- [ ] hmmer
  - [ ] evaluate https://hub.docker.com/r/comics/hmmer/~/dockerfile/
- [ ] FragGeneScan
- [x] InterProScan: the image would be about 5 GiB unless the reference data is kept separate
- [ ] "QIIME"
- [ ] metaspades: 3.9.0 is at https://quay.io/repository/biocontainers/spades (@mr-c updated the Debian package to the latest release: https://lists.debian.org/debian-med/2017/04/msg00022.html )
- [x] infernal
- [ ] esl-*
- [ ] MAPSeq
Recommended: search the BioContainers registry for existing containers.
If a container is missing, the ideal method is to create a bioconda package. Once the recipe is merged into the bioconda project, it is automatically turned into Docker and Singularity containers.
Note: for packages with large reference data sets (I'm looking at you, InterProScan), the data shouldn't be part of the conda package; it should be downloaded via a post-link script (and deleted via a pre-unlink script).
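The post-link / pre-unlink idea above could look roughly like the sketch below. It relies on the `PREFIX` and `PKG_NAME` environment variables that conda sets when running link scripts. The `share/<package>/data` layout and the function names are assumptions made for illustration, not what any existing recipe does.

```shell
#!/bin/sh
# Sketch of helpers for a conda post-link.sh / pre-unlink.sh pair.
# PREFIX and PKG_NAME are provided by conda; the share/ layout and the
# function names below are hypothetical.

data_dir() {
    # Fail loudly if conda did not provide the variables.
    : "${PREFIX:?PREFIX is not set}" "${PKG_NAME:?PKG_NAME is not set}"
    printf '%s/share/%s/data\n' "$PREFIX" "$PKG_NAME"
}

fetch_reference_data() {
    # post-link.sh: download the archive at $1 and unpack it into data_dir.
    url="$1"
    dest="$(data_dir)" || return 1
    mkdir -p "$dest" || return 1
    archive="$dest/$(basename "$url")"
    wget -q -O "$archive" "$url" || return 1
    tar -zxf "$archive" -C "$dest" || return 1
    rm -f "$archive"
}

remove_reference_data() {
    # pre-unlink.sh: delete everything fetch_reference_data created.
    dest="$(data_dir)" || return 1
    rm -rf "$dest"
}
```

In this sketch, `post-link.sh` would call `fetch_reference_data` with the URL of the data archive, and `pre-unlink.sh` would call `remove_reference_data`.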
@mr-c FYI: We use the HMMER container listed at https://bioconda.github.io/recipes/hmmer/README.html . I fixed the Easel dependencies in the bioconda recipes in https://github.com/bioconda/bioconda-recipes/pull/9772 , so you can now take advantage of https://quay.io/repository/biocontainers/hmmer?tab=tags
InterProScan is also available via BioContainers.
The member databases are available via:
$ wget ftp://ftp.ebi.ac.uk/pub/databases/metagenomics/iprscan/5/5.30-69.0/alt/interproscan-data-5.30-69.0.tar.gz
$ wget ftp://ftp.ebi.ac.uk/pub/databases/metagenomics/iprscan/5/5.30-69.0/alt/interproscan-data-5.30-69.0.tar.gz.md5
$ md5sum -c interproscan-data-5.30-69.0.tar.gz.md5
$ tar -zxf interproscan-data-5.30-69.0.tar.gz
example:
$ docker run --rm --name interproscan -v $PWD:/data -v $PWD/interproscan-5.30-69.0/data:/opt/interproscan/data biocontainers/interproscan:v5.30-69.0_cv1 ./interproscan.sh -dp -f tsv -o /data/test_out.ipr -i test_proteins.fasta
Unable to find image 'biocontainers/interproscan:v5.30-69.0_cv1' locally
v5.30-69.0_cv1: Pulling from biocontainers/interproscan
8ee29e426c26: Pull complete
6e83b260b73b: Pull complete
e26b65fd1143: Pull complete
40dca07f8222: Pull complete
b420ae9e10b3: Pull complete
57ac0ea5f4fb: Pull complete
049277243025: Pull complete
88da8f102c18: Pull complete
Digest: sha256:71f9a395ce344328e4f3be1d60478a57643adecf1b2c7165a3da022a7973ac39
Status: Downloaded newer image for biocontainers/interproscan:v5.30-69.0_cv1
30/07/2018 10:03:38:832 Welcome to InterProScan-5.30-69.0
30/07/2018 10:03:44:658 Running InterProScan v5 in STANDALONE mode... on Linux
30/07/2018 10:03:58:448 Loading file /opt/interproscan/test_proteins.fasta
30/07/2018 10:03:58:450 Running the following analyses:
[CDD-3.16,Coils-2.2.1,Gene3D-4.2.0,Hamap-2018_03,MobiDBLite-1.5,Pfam-31.0,PIRSF-3.02,PRINTS-42.0,ProDom-2006.1,ProSitePatterns-2018_02,ProSiteProfiles-2018_02,SFLD-3,SMART-7.1,SUPERFAMILY-1.75,TIGRFAM-15.0]
Pre-calculated match lookup service DISABLED. Please wait for match calculations to complete...
30/07/2018 10:04:08:561 25% completed
30/07/2018 10:04:23:429 50% completed
30/07/2018 10:04:35:577 75% completed
30/07/2018 10:04:54:557 90% completed
30/07/2018 10:05:14:991 100% done: InterProScan analyses completed
I'd like to find Dockerfiles that include these miscellaneous Python scripts: go_summary_pipeline-1.0.py, krona_setup.py, rnaMaskingStep.py, oneLineFasta.py, extract_sig_coords.py
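One way to run that search over a local checkout is sketched below. It uses only `find`, `grep` and `sort`. The function name is hypothetical, and the `*Dockerfile*` file-name pattern is an assumption chosen so that names like `trimmomatic-Dockerfile` are also picked up.

```shell
#!/bin/sh
# List Dockerfiles under a directory that mention any of the given script names.
# The function name and the '*Dockerfile*' pattern are assumptions.
find_dockerfiles_with_scripts() {
    root="$1"
    shift
    for script in "$@"; do
        # grep -F matches the name as a fixed string; -l prints only file names.
        # grep exits non-zero when nothing matches, which is not an error here.
        find "$root" -type f -name '*Dockerfile*' \
            -exec grep -lF -- "$script" {} + || :
    done | sort -u
}
```

For example, from the root of a cloned repository:

```shell
find_dockerfiles_with_scripts . go_summary_pipeline-1.0.py krona_setup.py \
    rnaMaskingStep.py oneLineFasta.py extract_sig_coords.py
```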