[FEATURE] Tumor mutational burden (TMB) scoring
nf-core/sarek feature request
Is your feature request related to a problem? Please describe
Tumor mutational burden (TMB) is a key metric in many oncology projects, so many users of the pipeline are likely to request it at some point. It could be computed similarly to MSI status, which the pipeline already does.
Describe the solution you'd like
A method for computing TMB and reporting it to the user, ideally in the main MultiQC table.
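For context, TMB is usually expressed as the number of retained somatic mutations per megabase of callable sequence. The snippet below is only a minimal sketch of that arithmetic, not the implementation of any tool mentioned in this issue; the function name and the example numbers are hypothetical, and which variants count as "retained" depends on the filtering scheme.

```python
def tmb_per_megabase(n_variants, effective_genome_size_bp):
    """Return the number of variants per megabase of callable sequence."""
    if effective_genome_size_bp <= 0:
        raise ValueError("effective genome size must be positive")
    return n_variants / (effective_genome_size_bp / 1_000_000)


# Hypothetical example: 330 retained somatic variants over a 33 Mb callable exome.
score = tmb_per_megabase(330, 33_000_000)
print(f"TMB = {score:.2f} mutations/Mb")
```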
Describe alternatives you've considered

An easy-to-use option would be this tool: https://github.com/bioinfo-pf-curie/TMB
A good explanation of how to approach the task, which evaluates several panels plus WES and could be of interest: https://www.nature.com/articles/s41598-020-68394-4
Potentially worth looking at too: https://github.com/bcgsc/TMBur
Apparently also a good tool for estimating TMB: https://github.com/bioinform/ecTMB
Strong tendency towards using the curie approach because:
- conda recipe --> biocontainer in place ✔️
- good examples / documentation in place ✔️
- straightforward output for multiqc (tsv tables are produced automatically --> could just report this in separate table in MultiQC report in Sarek) ✔️
All these points above are sometimes hard to achieve if you have to do them yourself --> good idea to rely on this instead 👍🏻
MultiQC module possibility even?
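Until a dedicated MultiQC module exists, one option is MultiQC's custom content feature, which picks up files ending in `_mqc.tsv` and reads a commented header describing the section. The following is a rough sketch of how per-sample scores could be turned into such a file; the section id, the column names, and the `to_multiqc_table` helper are assumptions made for illustration and are not produced by the curie tool itself.

```python
import csv
import io


def to_multiqc_table(rows, section_name="TMB"):
    """Format (sample, tmb) pairs as MultiQC custom-content TSV text."""
    header = [
        "# id: 'tmb_scores'",  # hypothetical section id
        f"# section_name: '{section_name}'",
        "# plot_type: 'table'",
    ]
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(["Sample", "TMB"])  # first column holds the sample name
    for sample, tmb in rows:
        writer.writerow([sample, f"{tmb:.2f}"])
    return "\n".join(header) + "\n" + buf.getvalue()


# Hypothetical scores; in practice these would come from the tool's TSV output.
text = to_multiqc_table([("tumor_A", 10.0), ("tumor_B", 3.456)])
print(text)
```

Writing `text` to a file such as `tmb_mqc.tsv` in the directory MultiQC scans should then add a separate table to the report.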
Hey! I started looking into this. I actually can't find the curie tool on anaconda.org. From the README, I think they provide an environment file to set up, in which the Python scripts can then be run. In that case no biocontainer would exist either. Did you find anything else, @apeltzer?
No, nothing else - I suspect that they have their own conda channel where they provide everything this uses / needs.
Hi there, I've just been in touch with @FriederikeHanssen! Indeed, there is no conda repo, simply because I'm not familiar with making such a package :) But we'll be happy to do it, of course!
Hey everyone, I've just made a bioconda recipe for the curie tmb tool --> see here: https://github.com/bioconda/bioconda-recipes/pull/35393
@nservant you could also link to that recipe in the future, as it would allow people to install the tool directly without having to set up a separate environment, e.g.:
mamba install -c bioconda -c conda-forge tmb=1.3.0
Running the tools then works with:
pyTMB.py <parameters> (the genome size script works the same way)
❤️ thank you for adding this! As soon as biocontainers is there, I'll add the module.
@apeltzer I've started working on it; the module is in the making. To add it to sarek, we need config files for each caller (except Mutect2), and then we need to discuss how best to make them available. It will take me a bit longer than I hoped.
Hello @FriederikeHanssen @apeltzer,
I am the co-developer of pyTMB; I'm working with Nicolas Servant.
Thanks a lot for your help with bioconda! I was struggling with the build.sh part.
Let me know if I can help with the config files or with testing.
Hi @tomgutman !
This PR has been a long time in the making because I haven't had time, but also because creating the yml files seems very daunting. Do you already have some for VEP or any of the variant callers that we use, such as Strelka and FreeBayes?
Hello @apeltzer @FriederikeHanssen
This is Raghavendra. I am currently using pyTMB.py to compute tumor mutational burden from a VEP-annotated VCF file, but I am unable to run it because there is no config.yml file specific to VEP. Could you please help me create one?
Hi @Raghu9721 ! Sorry just saw this, I am not the pyTMB tool developer. I would ask here to get help from them for this: https://github.com/bioinfo-pf-curie/TMB
Hi @Raghu9721, @FriederikeHanssen , Happy to help on that question, but I never used VEP before. The only thing I would need is a VCF file annotated with VEP to see how the different fields can be parsed.
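As a starting point for that discussion: VEP writes its annotations into a single `CSQ` INFO entry, with pipe-separated values whose field names are listed after `Format:` in the corresponding VCF header line. The sketch below shows one way those fields could be pulled apart in plain Python. It is not pyTMB code, the two helper functions are hypothetical, and the header and record shown are shortened, made-up examples; real VEP output carries many more fields.

```python
def csq_fields_from_header(header_line):
    """Extract the CSQ field names listed after 'Format: ' in the VCF header."""
    marker = "Format: "
    start = header_line.index(marker) + len(marker)
    end = header_line.index('"', start)  # closing quote of the Description
    return header_line[start:end].split("|")


def parse_csq(info, fields):
    """Return one dict per VEP annotation found in a VCF INFO string."""
    for entry in info.split(";"):
        if entry.startswith("CSQ="):
            # Annotations for different transcripts/alleles are comma-separated.
            return [dict(zip(fields, ann.split("|")))
                    for ann in entry[len("CSQ="):].split(",")]
    return []


# Shortened, made-up header and INFO column for illustration only.
header = ('##INFO=<ID=CSQ,Number=.,Type=String,Description="Consequence '
          'annotations from Ensembl VEP. Format: Allele|Consequence|IMPACT|SYMBOL">')
info = "DP=50;CSQ=T|missense_variant|MODERATE|TP53,T|synonymous_variant|LOW|TP53"

fields = csq_fields_from_header(header)
for annotation in parse_csq(info, fields):
    print(annotation["SYMBOL"], annotation["Consequence"], annotation["IMPACT"])
```

A VEP-specific config would presumably need to map keys like `Consequence` onto the tool's variant categories, which is where a real annotated VCF would help.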
Hi, I am using Sarek v.3.4.1 to extract VEP annotations from my samples. My question is whether there is any consensus method for calculating TMB. Is the TMB analysis tool from Institut Curie the most widely used? What's your recommendation? Thanks a lot!