
error extracting reads for scaffolding

Open gitcruz opened this issue 3 years ago • 2 comments

Dear Aleksey,

I ran the latest version on the grid (using SLURM) to create the megareads. After the create_megarreads job array finished, I simply reran ./assemble.sh and obtained a Flye assembly (megareads coverage 21x). However, the post-assembly scaffolding step failed with this error:

what(): basic_ios::clear
/home/devel/fcruz/bin/programs/MaSuRCA-4.0.4/bin/masurca_scaffold.sh: line 112: 13973 Aborted $MYPATH/ufasta extract -f <(awk '{print $NF}' $REFN.$QRYN.coords) $QRY > $REFN.$QRYN.reads.fa.tmp

I think the cause is that the input raw reads are in FASTQ format rather than FASTA.

Usage: masurca_scaffold.sh -r -q -t -m <minimum matching length, default:5000> -o <maximum overhang, default:1000>

Are you considering adjusting this script to also accept fastq and fastq.gz?

For now, I will convert the original raw reads to FASTA and rerun the scaffolding script. I guess this is the right thing to do, since using the megareads for scaffolding does not seem right.

Thanks, Fernando

gitcruz, Jul 21 '21 16:07

The input files for the MaSuRCA scaffolder must be FASTA. Good point, I will add automatic conversion of FASTQ to FASTA in a future version.

alekseyzimin, Aug 19 '21 19:08
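
Until automatic conversion is available, the workaround discussed above (converting the reads to FASTA before scaffolding) could look roughly like the sketch below. It is not part of MaSuRCA: `fastq_to_fasta` and the file names are hypothetical, and it assumes standard four-line FASTQ records with unwrapped sequence lines.

```shell
# Hypothetical helper (not shipped with MaSuRCA): write FASTA to stdout
# from a plain or gzip-compressed FASTQ file.
# Assumes each record is exactly 4 lines: @header, sequence, +, qualities.
fastq_to_fasta() {
  case "$1" in
    *.gz) gzip -dc "$1" ;;   # decompress .gz input on the fly
    *)    cat "$1" ;;        # plain FASTQ
  esac | awk '
    NR % 4 == 1 { print ">" substr($0, 2) }  # header: replace leading "@" with ">"
    NR % 4 == 2 { print }                    # sequence line, copied unchanged
  '
}

# Example use (placeholder file names):
#   fastq_to_fasta raw_reads.fastq.gz > raw_reads.fa
# The resulting raw_reads.fa can then be passed to masurca_scaffold.sh via -q.
```

Selecting lines by their position within each record, rather than matching on a leading `@`, avoids misreading quality strings that happen to begin with `@`.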

Hi Aleksey,

Adding it would be nice because it will make the script more flexible.

By the way, any hints on how to solve the problem with find_repeats.pl? It was raised in issue #242, "masurca_scaffold.sh error".

Thanks, Fernando

gitcruz, Sep 01 '21 07:09