
Segmentation fault while searching SFS between two assemblies

Open LYC-vio opened this issue 1 year ago • 4 comments

Hi,

Thank you for developing this amazing tool. Recently, I tried to use SVDSS to find SFSs between two assemblies (or between an assembly and the reference genome). However, SVDSS failed in the search step with a segmentation fault.

The reference I used was hg19-2.1.0, and the assembly was downloaded from here, HG02080.paternal.f1_assembly_v2.fa.gz.

The commands I used were:

${SVDSS} index --fasta ${ref} --index ${idx_file} -b --threads 10
${SVDSS} search --index ${idx_file} --fastq ${HG02080_assembly} --workdir ./work_dir --threads 10

What could be the reason for this issue? By the way, I'm using the v1.0.5 binary.

Thank you

LYC-vio avatar Jul 23 '23 12:07 LYC-vio

After removing -b, SVDSS successfully extracted SFSs. It is a bit strange that it doesn't work with -b, though.

LYC-vio avatar Jul 23 '23 14:07 LYC-vio

Hi, yeah, you found the issue! With -b,--binary, the index is stored in binary format, but this type of index is not queryable (actually, I don't remember whether this is required by the ropebwt2 implementation we are based on). However, the binary format is useful when you need to build the index incrementally (e.g., you want to index more fasta/q files without concatenating them).

ldenti avatar Jul 24 '23 07:07 ldenti

Hi @ldenti ,

Thank you! What should I do if I want to search on a binary output?

like:

${SVDSS} index --fasta ${ref} --index ${idx_file} -b --threads 10

for i in {asm1} {asm2} {asm3} {asm4}
do
    ${SVDSS} index --fasta ${i} --append ${idx_file} --threads 10
done 

${SVDSS} search --index ${idx_file} --fastq ${HG02080_assembly} --workdir ./work_dir --threads 10

Is this the right way to do it?

Thanks again

LYC-vio avatar Jul 24 '23 16:07 LYC-vio

You need to create the binary index at each iteration except the last one, where you store the index in FMD format:

${SVDSS} index --fasta ${ref} --index ${idx_file} --binary
for i in {asm1} {asm2} {asm3} 
do
    ${SVDSS} index --fasta ${i} --append ${idx_file} --binary --index ${idx_file}.tmp
    mv ${idx_file}.tmp ${idx_file}
done
${SVDSS} index --fasta {asm4} --append ${idx_file} --index ${idx_file}.fmd

Then you can search against the index stored in FMD format (and not the binary one):

${SVDSS} search --index ${idx_file}.fmd --fastq ${HG02080_assembly} --workdir ./work_dir --threads 10
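If the list of assemblies changes often, the incremental build above could be wrapped in a small bash function. This is only a sketch: the function name `build_fmd_index` and its argument order are made up for illustration and are not part of SVDSS. It uses only the flags shown in this thread (`--fasta`, `--index`, `--append`, `--binary`) and assumes they behave as described above, i.e. every step except the last writes a binary index and the last one writes the searchable FMD index.

```shell
# Hypothetical helper (not part of SVDSS): build one searchable index
# from a reference plus any number of assemblies.
# Usage: build_fmd_index <svdss_binary> <index_path> <reference.fa> [asm.fa ...]
# The final, searchable index is written to <index_path>.fmd.
build_fmd_index() {
    local svdss="$1" idx="$2" ref="$3"
    shift 3
    local -a asms=("$@")
    local n=${#asms[@]}
    local i

    if (( n == 0 )); then
        # Nothing to append: index the reference directly, without --binary.
        "$svdss" index --fasta "$ref" --index "${idx}.fmd" || return 1
        return 0
    fi

    # First step: binary index of the reference.
    "$svdss" index --fasta "$ref" --index "$idx" --binary || return 1

    # Intermediate steps: append one assembly at a time, keeping the binary format.
    for (( i = 0; i < n - 1; i++ )); do
        "$svdss" index --fasta "${asms[i]}" --append "$idx" --binary --index "${idx}.tmp" || return 1
        mv "${idx}.tmp" "$idx" || return 1
    done

    # Last step: append without --binary so the result is stored in FMD format.
    "$svdss" index --fasta "${asms[n-1]}" --append "$idx" --index "${idx}.fmd" || return 1
}

# Example call (file names are placeholders):
# build_fmd_index "${SVDSS}" "${idx_file}" "${ref}" asm1.fa asm2.fa asm3.fa asm4.fa
```

After the call, search against `${idx_file}.fmd` as shown above. The function stops at the first failing step, so a half-built binary index is never passed on to the final FMD step.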

Let me know if this works

ldenti avatar Jul 24 '23 16:07 ldenti