Setting comet search parameters
I'm trying out this project for processing DDA TMT data, but I'm having some trouble specifying Comet settings such as fragment_bin_tolerance. The docs state: "Caution: for Comet we are estimating the fragment_bin_tolerance parameter based on this automatically."
Is there no way to pass this in to comet explicitly? Perhaps related to #146
First of all, thanks for using the pipeline; you are one of our first "external users", so please be patient with us. Here, https://github.com/bigbio/quantms/blob/eb8ee5d95a866164899503039aa1172261c602aa/modules/local/openms/thirdparty/searchenginecomet/main.nf#L22 the pipeline guesses fragment_bin_tolerance from the fragment tolerance. We mainly do this because most search engines specify fragment mass tolerances in ppm or Da rather than using the bin idea.
After testing with multiple datasets using the guess system, which was originally suggested in a Comet thread, we think the pipeline works perfectly well. If you really want to configure and pass that parameter to Comet, we can define the parameter. Let us know what you think.
Thank you for the reply! You have my patience - no worries on that front, and thank you for putting this project out there!
I guess in an ideal world one would be able to pass in any additional parameter to any of the (sub-)workflows - I believe that this is what @jpfeuffer suggested in #146. I'm new to nextflow so I don't know how difficult this is to implement or if it's considered a good practice or not.
Back to the specific issue here - I'm sure that your defaults work well, but say I want to re-process a dataset that was searched with comet with fragment_bin_offset of 0.4 and fragment_bin_tol of 1.0005 - these are the suggested defaults for some configurations:
For ion trap data with a fragment_bin_tol of 1.0005, it is recommended to set fragment_bin_offset to 0.4.
So in that sense it's likely that this particular combination of parameters is common, but it cannot be reproduced exactly with quantms.
Setting fragment_tolerance above 50 ppm will use exactly the parameters you described. In theory, you can override parameters that are not exposed on the command line by providing a config file containing:
process {
    withName: <module> {
        ext.args = [ // Assign either a string or a closure that returns a string
            '--flag',
            '--param1 abc'
        ].join(' ') // join converts the list to a single string
    }
}
But I have not tried this yet. For SEARCHENGINECOMET it should be fine.
@jpfeuffer @daichengxin I think we could add to the extra parameters the possibility to pass fragment_bin_offset directly, for advanced users such as @radusuciu who want to work with the search engine's original parameters.
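For the Comet case in this thread, such a config might look like the sketch below. It is untested. It assumes that the SEARCHENGINECOMET module appends `ext.args` to the underlying OpenMS CometAdapter call, and that the adapter accepts a `-fragment_bin_offset` option. Both should be checked against the module source and `CometAdapter --help`. The file name `comet_override.config` is only an example.

```groovy
// comet_override.config (hypothetical file name)
process {
    // Selector taken from the module name mentioned in this thread
    withName: 'SEARCHENGINECOMET' {
        // Assumption: option name as exposed by OpenMS CometAdapter; verify before use
        ext.args = '-fragment_bin_offset 0.4'
    }
}
```

It would then be passed with Nextflow's `-c` option, e.g. `nextflow run bigbio/quantms -c comet_override.config` plus the usual pipeline parameters. If the module already sets the same option from its own guess, this sketch does not guarantee which value takes effect.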
I don't know. Where do we stop then? I think if users know about these special flags, they can also pass a config file. Let's wait and see if it works.
Personally I'm totally fine with the config file approach - it also gives you built-in future compatibility when, for instance, Comet adds or changes a parameter.
We haven't tested that approach; if it works for you, please let us know.
No one has complained about this since, so I guess the config approach is fine for 99% of use cases.