empty alignments_sorted.txt file while generating .hic file

Open gitcruz opened this issue 4 years ago • 3 comments

Dear Marbl Team,

I ran SALSA2 "successfully" on a genome assembly, and the scaffold N50 increased from 12 Mb to 16 Mb. In principle everything looks good to me, apart from a few messages that I took to be warnings.

However, when I run the converter script to obtain the .hic file, I get this error: "alignments_sorted.txt does not exist or does not contain any reads". I realized that the expected input file for converter.sh (i.e. alignment_iteration_1.bed) is empty. In fact, the .bed files for all three iterations are empty.

Do you know what could be the reason for this error? I would love to fix it without re-running SALSA2...

Cheers, F

gitcruz avatar Sep 28 '20 15:09 gitcruz

Did you run it with the -m yes option? If not, you can use the bed file that you provided to SALSA as the input to the .hic file creation script.

ghuryejay avatar Sep 28 '20 21:09 ghuryejay
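Before choosing which bed file to hand to the converter, it can help to confirm which .bed files in the output directory are actually empty. The following is a minimal sketch, assuming the iteration files sit directly inside the directory passed with -o (here `out`); the helper name `empty_bed_files` is hypothetical and is not part of SALSA.

```python
from pathlib import Path


def empty_bed_files(out_dir):
    """Return the sorted names of zero-byte .bed files directly inside out_dir."""
    return sorted(
        p.name
        for p in Path(out_dir).glob("*.bed")
        if p.is_file() and p.stat().st_size == 0
    )


# Example usage (the directory name "out" is taken from the -o option in this thread):
# for name in empty_bed_files("out"):
#     print("empty:", name)
```

An empty result means every .bed file in that directory contains at least one byte; it does not check that the contents are valid BED records.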

Hi Jay,

I did use the -m option to detect misassemblies with Arima Hi-C data. However, I made a mistake and passed it twice (I typed -m 1000 where I meant -c 1000!), and the program did not complain. I think I should re-run the scaffolding and compare the results. Please let me know your opinion on this.

Below is the command line I used:

python /home/devel/fcruz/SALSA2_Scaffolder/SALSA/run_pipeline.py -s 1340000000 -m 1000 -i 3 -a base_assembly.fa -l base_assembly.fa.fai -b Arima.1_2.hicup.bed -e GATC,GANTC -o out -m yes

Thanks, F

gitcruz avatar Sep 29 '20 07:09 gitcruz
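On why a repeated flag might pass silently: if a script parses its options with Python's standard argparse module and the default "store" action, a repeated option does not raise an error and the last value simply wins. The sketch below shows that behavior on a stand-in parser. The parser is hypothetical (the long names `--clean` and `--cutoff` and both defaults are assumptions, not SALSA's actual definitions), and `repeated_flags` is an illustrative helper for spotting the duplicate before launching a long run.

```python
import argparse
from collections import Counter


def build_parser():
    # Hypothetical stand-in for illustration only; not SALSA's real option table.
    parser = argparse.ArgumentParser()
    parser.add_argument("-m", "--clean", default="no")
    parser.add_argument("-c", "--cutoff", type=int, default=1000)
    return parser


def repeated_flags(argv):
    """Return the sorted option flags that appear more than once in argv."""
    counts = Counter(
        tok for tok in argv
        if tok.startswith("-") and len(tok) > 1 and not tok[1].isdigit()
    )
    return sorted(flag for flag, n in counts.items() if n > 1)


argv = ["-m", "1000", "-m", "yes"]
args = build_parser().parse_args(argv)
print(args.clean, args.cutoff)  # the later -m value replaces the earlier one
print(repeated_flags(argv))
```

Under these assumptions the first `-m 1000` is discarded and the cutoff keeps its default, so checking for duplicates up front is the only way to notice the typo. Whether the real pipeline behaves the same way would need to be confirmed against its source.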

I'm experiencing the same issue. All my .bed files are empty, so alignments.txt is not written.

sheinasim avatar Mar 18 '21 20:03 sheinasim