Settings to accelerate execution time

Open veitveit opened this issue 4 years ago • 2 comments

We are using moFF on a larger dataset (27 label-free runs) and it takes a very long time (about 40 h), even with 16 threads and 150 GB of RAM.

Are there any parameter settings that can shorten the execution time? I must be missing something here, because the paper says that this is a fast method applicable to large datasets.

And as a motivator: MaxQuant takes less than half the time.

Data set: http://proteomecentral.proteomexchange.org/cgi/GetDataset?ID=PXD001819

All parameters are left at their default values (i.e. not provided on the command line) except the mass tolerance, which is set to 5.

The input files are PeptideShaker output and raw files.

moff_all is run from a Docker container that has moff=2.0.3 installed as a conda package.

veitveit avatar Jul 05 '20 07:07 veitveit

Hi,

Sorry for the late reply. At the moment I am working outside academia and only partially follow moFF in my free time.

A couple of questions:

Did you run the matching between runs across all 27 runs? If so, I can imagine that the number of matched peptides in each run is really large, which would be one possible explanation.

How many PSMs do you have in each run on average after FDR calculation?

Do you use the "--match_filter" option or not? If so, this could add computation time.

You could also try setting "--xic_length" to 2 or 2.2 minutes to see if it gains some speed.

Maux82 avatar Jul 10 '20 14:07 Maux82
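To check whether a shorter "--xic_length" actually helps, the suggested values can be timed side by side. The following is only a minimal sketch, not part of moFF: the helper names (`build_moff_command`, `time_command`, `compare_xic_lengths`) are hypothetical, the executable name `moff_all` is an assumption that depends on how moFF is installed in your container, and `input_args` stands for whatever input and output options you already pass. The only flag taken from this thread is `--xic_length`.

```python
import subprocess
import time


def build_moff_command(xic_length, input_args, executable="moff_all"):
    """Return the argument list for one run with a given --xic_length.

    `executable` and `input_args` are assumptions: adjust them to match
    the command you already use to start moFF.
    """
    if xic_length <= 0:
        raise ValueError("xic_length must be a positive number of minutes")
    return [executable, *input_args, "--xic_length", str(xic_length)]


def time_command(command):
    """Run a command and return its wall-clock duration in seconds."""
    start = time.perf_counter()
    subprocess.run(command, check=True)  # raises if the run fails
    return time.perf_counter() - start


def compare_xic_lengths(xic_lengths, input_args, executable="moff_all"):
    """Time one run per --xic_length value and return {value: seconds}."""
    return {
        value: time_command(build_moff_command(value, input_args, executable))
        for value in xic_lengths
    }


# Example call (not executed here; fill in your own input options):
# timings = compare_xic_lengths([2, 2.2], input_args=[...])
```

Timing a small subset of the runs first is probably more practical than repeating the full 27-run job for each value.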

Hi @Maux82,

Thanks a lot for the help!

I am running the full set of 27 runs, and I am not using the --match_filter option. Does enabling the filter speed things up, or does it do the opposite?

There are around 10,000 PSMs per run.

I will try decreasing the xic_length and see how it performs.

veitveit avatar Jul 12 '20 17:07 veitveit