usegalaxy-playbook
Increase memory and create a dynamic rule for seqtk_sample
From @bwlang on Gitter:
@natefoo: is it reasonable to increase the allowable RAM for seqtk sample to, say, 10G on usegalaxy.org? With a fixed number of reads, it needs RAM proportional to the targeted number of reads. I’m going to send a tool patch to enable 2-pass mode, which is less memory intensive, as well...
TODO:
- [x] Increase static memory allocation to 16GB
- [ ] Allocate memory dynamically based on input size
@bwlang could you comment with some details on the proportion needed?
16GB allocated in 4deaf5539c03563d96142d8a1cec6c0548c6d3d8.
I did a quick experiment...
for reads in 100 1000 10000 100000 1000000; do /usr/bin/time -v -a -o log seqtk sample 9.2.fastq.gz $reads > /dev/null; done
I fit that data (forcing the intercept to 0):
Kbytes = 0.475991 * NumReads
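
For anyone who wants to redo the fit on their own data, here is a minimal sketch of one way to do it. It assumes the `log` file holds the appended GNU `time -v` output from the loop above, and that the read count is the last argument of each timed command. The helper names `parse_time_log` and `fit_through_origin` are made up for illustration, and the original fit may well have been done with a different tool.

```python
import re

# The two GNU time -v lines needed from each appended run.
_CMD_RE = re.compile(r'Command being timed: "(.*)"')
_RSS_RE = re.compile(r"Maximum resident set size \(kbytes\): (\d+)")


def parse_time_log(text):
    """Return (num_reads, max_rss_kb) pairs, one per run found in the log."""
    pairs = []
    reads = None
    for line in text.splitlines():
        cmd = _CMD_RE.search(line)
        if cmd:
            # Assumption: the read count is the last argument of the command.
            reads = int(cmd.group(1).split()[-1])
            continue
        rss = _RSS_RE.search(line)
        if rss and reads is not None:
            pairs.append((reads, int(rss.group(1))))
            reads = None
    return pairs


def fit_through_origin(pairs):
    """Least-squares slope k for y = k * x, with the intercept forced to 0."""
    sxy = sum(x * y for x, y in pairs)
    sxx = sum(x * x for x, _ in pairs)
    if sxx == 0:
        raise ValueError("need at least one run with a non-zero read count")
    return sxy / sxx
```

Passing the contents of `log` through `parse_time_log` and then `fit_through_origin` gives the slope in Kbytes per read.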

Note that this is not a function of input data size; it's a function of the number of output reads specified. E.g., sampling 100 reads from 1000000000000 reads still requires only about 2kb.
However, sampling 100000 reads from 1000000000000 needs 475kb.
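
For the dynamic-allocation TODO, here is a rough sketch of how the fitted slope could be turned into a memory request. This is not Galaxy's dynamic rule API, just a plain helper that a rule could call. The name `estimate_memory_gb`, the 1.5x safety factor and the 1 GB floor are all assumptions. The 16 GB cap simply mirrors the static allocation mentioned above.

```python
import math

# Slope from the fit above, in Kbytes per requested output read.
KB_PER_READ = 0.475991


def estimate_memory_gb(num_reads, kb_per_read=KB_PER_READ,
                       safety_factor=1.5, floor_gb=1, cap_gb=16):
    """Estimate memory to request (whole GB) for sampling num_reads reads.

    safety_factor, floor_gb and cap_gb are illustrative assumptions,
    not measured values.
    """
    if num_reads < 0:
        raise ValueError("num_reads must be non-negative")
    needed_kb = num_reads * kb_per_read * safety_factor
    # Convert Kbytes to GB (1 GB = 1024 * 1024 Kbytes) and round up.
    needed_gb = math.ceil(needed_kb / (1024 * 1024))
    # Clamp to the allowed range.
    return max(floor_gb, min(cap_gb, needed_gb))
```

The estimate depends only on the requested number of output reads, not on the size of the input file, which matches the observation above.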
Perfect, thanks!