
fc_ovlp_filter: add support for las files rather than only preads_lasfiles.fofn

Open mictadlo opened this issue 6 years ago • 2 comments

Hi, could you please add support to fc_ovlp_filter for las files, rather than only preads_lasfiles.fofn? At the moment I work around this with the following bash script.

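#!/bin/bash

# Start from an empty command list; -f avoids an error when the file does not exist yet.
rm -f fc_ovlp_filter_cmds.txt
for filename in $(find . -type f -name "assemblyDB.*.las"); do
   fn=$(basename "$filename")
   echo "$fn"
   # Extract the block number from assemblyDB.<no>.las
   no=$(echo "$fn" | sed -e 's/^assemblyDB\.//' -e 's/\.las$//')
   echo "$no"
   # Write a one-line fofn pointing at this las file (path as reported by find)
   echo "$filename" > "${fn}.fofn"

   echo "fc_ovlp_filter --db assemblyDB --max_diff 100 --max_cov 100 --min_cov 1 --bestn 10 --n_core 1 --fofn ${fn}.fofn --min_len 6973 --out-fn preads.${no}.ovl" >> fc_ovlp_filter_cmds.txt
done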

The above solution allows me to distribute the task across multiple nodes.

Can I just do cat preads.*.ovl > preads.ovl or do I have to sort the file?

Thank you in advance.

Michal

mictadlo avatar Feb 26 '18 01:02 mictadlo

Your solution is fine. The order of preads.ovl does not matter, aside from reproducibility of results.
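
If reproducibility matters, one option is to concatenate the per-block files in a fixed order instead of relying on glob order. The following is only a minimal sketch: it assumes the files follow the preads.<no>.ovl naming from the script above, that at least one such file exists in the current directory, and that file names contain no whitespace. `merge_ovl` is a hypothetical helper name, not part of FALCON.

```shell
#!/bin/bash

# Hypothetical helper: concatenate preads.<no>.ovl files in numeric block order.
# Assumes the files live in the current directory and that at least one exists.
merge_ovl() {
    local out=$1
    # Split names on '.', sort numerically on the second field (the block
    # number), then concatenate the files in that order into the output file.
    printf '%s\n' preads.*.ovl | sort -t. -k2,2n | xargs cat > "$out"
}
```

It could then be called as `merge_ovl preads.ovl`. The numeric sort places block 2 before block 10, which a plain glob would not.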

But are you sure that step needs to be split and parallelized? What is your resource constraint? We would split that ourselves if we were aware of a problem.

pb-cdunn avatar Mar 30 '18 17:03 pb-cdunn

I think it used a lot of memory and took a while to run.

mictadlo avatar May 18 '18 02:05 mictadlo