
other inputs

Open smb20200615 opened this issue 2 years ago • 9 comments

Is there a way to input MAGs that we have generated using other pipelines into the tool? If so, at what point?

smb20200615 avatar Oct 26 '21 16:10 smb20200615

Hi @smb20200615,

It really depends on what you want to do, but I see two steps where it seems reasonable to integrate MAGs from other pipelines:

  1. At the bin_refinement step: here you would need 3 bin sets generated from a single assembly (i.e. same contigs/headers but binned differently). Maybe you have generated MAGs with a new binner or alternative MAG pipeline, so you could refine + reassemble them to get a final consensus set.

  2. At the metabolic model reconstruction step: here you would use ORF-annotated protein bins to build metabolic models with CarveMe. Then you can simulate them within communities using SMETANA to predict metabolic interactions.
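The requirement in step 1 (all bin sets derived from the same contigs/headers) can be checked before attempting refinement. Below is a minimal sketch, assuming the external bins are uncompressed FASTA files with a `.fa` extension in one folder; the function name `check_bin_headers` and the paths in the usage note are hypothetical and not part of metaGEM.

```shell
# Hypothetical helper, not part of metaGEM: report contig headers that occur
# in a bin set but are absent from the assembly used for the other bin sets.
# Usage: check_bin_headers <assembly.fa> <bin_dir>
check_bin_headers() {
    assembly="$1"
    bin_dir="$2"
    known=$(mktemp)
    # Sequence IDs in the assembly: text after '>' up to the first whitespace.
    grep '^>' "$assembly" | sed 's/^>//; s/[[:space:]].*//' | sort -u > "$known"
    # IDs that appear in the bins but not in the assembly.
    missing=$(cat "$bin_dir"/*.fa | grep '^>' | sed 's/^>//; s/[[:space:]].*//' \
        | sort -u | grep -vxFf "$known" || true)
    rm -f "$known"
    if [ -n "$missing" ]; then
        printf 'Headers not found in assembly:\n%s\n' "$missing"
        return 1
    fi
    echo "All bin headers match the assembly."
}
```

For example, `check_bin_headers assemblies/sample1/contigs.fa external_bins/sample1` (both paths are placeholders) returns a non-zero status if the external bins were built from a different assembly, in which case they are probably not suitable as a third bin set for refinement.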

Please let me know if you had any other ideas in mind, or if you have further questions.

Best wishes, Francisco

franciscozorrilla avatar Oct 26 '21 16:10 franciscozorrilla

Hi Francisco,

Thank you for your thorough comment. I am struggling to generate MAGs with your pipeline because I don't know the account name for my cluster account. When I delete the account line, I get errors during job submission. Perhaps the better question is how to run without the account name line:

"__default__" : {
        "account" : "your-account-name",
        "time" : "0-06:00:00",
        "n" : 48,
        "tasks" : 1,
        "mem" : "180G",
        "name"      : "DL.{rule}",
        "output"    : "logs/{wildcards}.%N.{rule}.out.log",
},

smb20200615 avatar Oct 27 '21 10:10 smb20200615

Hi @smb20200615,

Indeed, that is quite strange. Have you previously submitted any jobs successfully on your cluster without an account name? I do not think that this should be possible, unless it is your own cluster/workstation. On my institution's SLURM-based cluster, one can use the mybalance command to view one's accounts. If this does not work, then I would suggest contacting your cluster support team or having a look at any documentation provided for your institution's cluster.
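Since mybalance is specific to one institution, a more general way to look up account names is SLURM's own accounting tool. This is only a sketch, assuming `sacctmgr` is installed and that your site allows users to query their own associations; the wrapper name `list_slurm_accounts` is hypothetical.

```shell
# Hypothetical helper: list the SLURM accounts associated with the current user.
list_slurm_accounts() {
    if ! command -v sacctmgr >/dev/null 2>&1; then
        echo "sacctmgr not found; please ask your cluster support team" >&2
        return 1
    fi
    # -n drops the header line, -P prints plain parsable output.
    sacctmgr -nP show associations user="${USER:-$(id -un)}" format=account | sort -u
}
```

Any name printed by `list_slurm_accounts` would be a candidate value for the "account" field of the cluster config.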

Best wishes, Francisco

franciscozorrilla avatar Oct 27 '21 11:10 franciscozorrilla

Also, these commands sometimes fail (it seems no bins were produced):

bash metaGEM.sh -t metabat -j 2 -c 24 -m 80 -h 10
bash metaGEM.sh -t maxbin -j 2 -c 24 -m 80 -h 10

Should we just proceed with downstream steps in that case?

smb20200615 avatar Oct 29 '21 15:10 smb20200615

Indeed, you can simply proceed to bin refinement and reassembly. Let me know how it goes!

franciscozorrilla avatar Oct 29 '21 16:10 franciscozorrilla

Sorry for the follow-up question. When I run refinement, it still tries to rerun the samples that failed. I have been following the steps of the tutorial.

smb20200615 avatar Nov 01 '21 17:11 smb20200615

No problem, happy to help with your questions. OK, that makes sense, since the binRefine rule takes in 3 inputs and you are missing one of them:

https://github.com/franciscozorrilla/metaGEM/blob/eb0860945fd0d8efa8495aeb441b55969c4e97b1/Snakefile#L879-L883

I would recommend trying to "trick" snakemake into thinking it already created the files by creating a dummy folder for those samples that failed to generate any bins with maxbin, e.g.

mkdir maxbin/sample/sample.maxbin-bins

I believe that this is what I have done in the past, since metaWRAP will accept an empty folder and just proceed with the remaining draft bin sets.

Please let me know if this works for you. If so, then I will try to modify the maxbin rule so that it creates this dummy directory if the binning fails to generate any MAGs.
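For several failed samples, the dummy-folder workaround can be scripted. The following is a minimal sketch, assuming the folder layout from the mkdir example (`<binner>/<sample>/<sample>.<binner>-bins`); that the same naming applies to binners other than maxbin is an assumption, and the helper name `make_dummy_bins` is hypothetical.

```shell
# Hypothetical helper, not part of metaGEM: create an empty bin folder for each
# listed sample that has none, so Snakemake finds the expected output folder.
# Usage: make_dummy_bins <binner> <sample> [<sample> ...]
make_dummy_bins() {
    binner="$1"
    shift
    for sample in "$@"; do
        target="${binner}/${sample}/${sample}.${binner}-bins"
        # Leave existing folders (and any real bins in them) untouched.
        if [ ! -d "$target" ]; then
            mkdir -p "$target"
            echo "created $target"
        fi
    done
}
```

Run from the project root, `make_dummy_bins maxbin sampleA sampleB` (sample names are placeholders) would create `maxbin/sampleA/sampleA.maxbin-bins` and `maxbin/sampleB/sampleB.maxbin-bins` only if they do not already exist.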

franciscozorrilla avatar Nov 01 '21 17:11 franciscozorrilla

Thank you so much! That fixed it. I was wondering if the metabolic models were done just for the prokaryotic MAGs. Also, do you have guidance on how to adapt the media config parameter for our biome of interest?

smb20200615 avatar Nov 07 '21 16:11 smb20200615

Glad to hear it worked!

Indeed, CarveMe only reconstructs models for prokaryotic MAGs. Regarding media, this is something that you would have to adapt/design based on literature/domain knowledge, using metabolite IDs from the BiGG database. What biome are your samples from?
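As a rough illustration of what a custom medium might look like, here is a sketch that writes a small media table. It assumes the four-column, tab-separated layout (medium, description, compound, name) used by CarveMe's bundled media database, which is worth verifying against the CarveMe documentation. The medium name `EXAMPLE` and the choice of compounds are purely illustrative and do not represent any real biome; the helper name `write_example_media` is hypothetical.

```shell
# Hypothetical helper: write an illustrative media table using BiGG metabolite IDs.
# Assumed columns: medium, description, compound, name (tab-separated).
write_example_media() {
    out="$1"
    {
        printf 'medium\tdescription\tcompound\tname\n'
        printf 'EXAMPLE\tIllustrative medium\tglc__D\tD-Glucose\n'
        printf 'EXAMPLE\tIllustrative medium\tnh4\tAmmonium\n'
        printf 'EXAMPLE\tIllustrative medium\tpi\tPhosphate\n'
        printf 'EXAMPLE\tIllustrative medium\to2\tO2\n'
        printf 'EXAMPLE\tIllustrative medium\th2o\tH2O\n'
    } > "$out"
}
```

Calling `write_example_media my_media.tsv` produces a file that could serve as a starting point; the compound list would then be replaced with metabolites known from the literature to be available in the biome of interest.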

franciscozorrilla avatar Nov 08 '21 11:11 franciscozorrilla