bio-cwl-tools
The structure of the repository
Here we need to decide on the structure of this repository. As agreed, every tool should be placed in a separate directory. The questions are:
- should we keep Dockerfile in the same folder?
- where do we put job files for testing?
- where do we keep input data for testing jobs?
So, I added some files to get started with. Let me know if you don't like the folder structure or naming.
Concerning containers, I propose the following guidelines:
- Try to use containers from bioconda/biocontainers
- Otherwise:
  - Use only containers for which the Dockerfile is publicly available on Docker Hub.
  - If you are the author, you should consider maintaining the container directly from this repository and keeping the Dockerfile next to the CWL tool. (But this is optional and should not be enforced.)
What do you think?
@KerstenBreuer I think that is a great start on the containers.
I refactored the repository so that all the tools are within the folder `tools`. The next level is the name of the software, such as `samtools`, `bedtools`, etc. Each folder on this level can include multiple CWL files, for example `samtools_sort.cwl`, `samtools_view.cwl`, etc. Additionally, every folder includes `tests` and `metadata` subdirectories. In `metadata` I put a YAML file to be included in every CWL file. This file describes the software used in the CWL files (basically, the program from the Dockerfile). In the `tests` folder I'm planning to collect all job files for testing. I don't know yet where to keep input data for these tests, so the tests currently don't work. In our repository I keep all input data in a separate GitHub repository, which is not good, because eventually the amount of input data will grow very fast. I think later we can put it on some FTP server.
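To make the proposed layout concrete, here is a minimal Python sketch that scaffolds one software folder in a scratch directory and lists what it created. The helper names `scaffold_tool_dir` and `list_layout` and the metadata file name `<software>.yaml` are assumptions for illustration only; the actual file names in the repository may differ.

```python
from pathlib import Path
from typing import Iterable, List
import tempfile


def scaffold_tool_dir(root: Path, software: str, cwl_names: Iterable[str]) -> Path:
    """Create tools/<software>/ with CWL placeholders plus tests/ and metadata/."""
    tool_dir = root / "tools" / software
    (tool_dir / "tests").mkdir(parents=True, exist_ok=True)
    (tool_dir / "metadata").mkdir(exist_ok=True)
    for name in cwl_names:
        # Empty placeholder files; real CWL tool descriptions would go here.
        (tool_dir / name).touch()
    # Hypothetical name for the shared YAML metadata file (an assumption).
    (tool_dir / "metadata" / f"{software}.yaml").touch()
    return tool_dir


def list_layout(root: Path) -> List[str]:
    """Return every path under root, relative to root, in sorted order."""
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*"))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        scaffold_tool_dir(Path(tmp), "samtools", ["samtools_sort.cwl", "samtools_view.cwl"])
        for entry in list_layout(Path(tmp)):
            print(entry)
```

Running the script prints the relative paths of the scaffolded tree, which can be compared against the structure described above.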
Also, I added some other tools. If a tool with the same name already exists, I add the suffix `_1...n` to its filename.
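The suffix rule could be sketched as follows. This is only an illustration: the helper name `next_free_name` is hypothetical, and placing the suffix before the `.cwl` extension is an assumption, since the exact position was not specified.

```python
from pathlib import Path


def next_free_name(tool_dir: Path, filename: str) -> str:
    """Return filename unchanged if it is free, else the first free <stem>_<n><ext>."""
    name = Path(filename)
    if not (tool_dir / name).exists():
        return filename
    n = 1
    # Try _1, _2, ... until a name that does not exist yet is found.
    while (tool_dir / f"{name.stem}_{n}{name.suffix}").exists():
        n += 1
    return f"{name.stem}_{n}{name.suffix}"
```

With this rule, a second contribution of `samtools_sort.cwl` would land next to the first one under a numbered name rather than overwrite it.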
I haven't added any Dockerfiles yet. I guess we can keep them in a separate folder `dockerfiles` next to the folder `tools`. I don't think we can put the `Dockerfile` next to the CWL file, because several CWL files can use the same `Dockerfile`.
@michael-kotliar Great to see more tools!
We don't need the top-level `tools` directory; that is already in the repository name.
I like the 1..n suffix for tools that may be uploaded by multiple contributors. I think keeping Docker repos from biocontainers should be fine for the time being. Tests should be included for each tool group folder. I am working on getting some standard test data we can use to run unit tests on all the PRs. We can find a place to store them centrally.
https://github.com/common-workflow-library/bio-cwl-tools/issues/105#issuecomment-707997728 --> Nathan brings up an important point about having version folders.