finding pseudoalignments for the reads ...Segmentation fault (core dumped)

Open aungthurhahein opened this issue 8 years ago • 12 comments

When I try to generate a pseudobam file, kallisto crashes with a core dump. The machine is an Ubuntu server with x86-64 architecture.

kallisto quant -i Trinity.fasta.kallisto_idx -l 590 -s 160.95 -o out --pseudobam --single ge50.fasta > out.sam

aungthurhahein avatar Mar 25 '16 09:03 aungthurhahein

What is the error reported? We fixed an issue in v0.42.5 that could be the cause of this. Can you run this on the latest version, 0.42.5?

pmelsted avatar Apr 05 '16 12:04 pmelsted

Kallisto version is 0.42.4 and this is the error message:

finding pseudoalignments for the reads ...Segmentation fault (core dumped) 

I will download v0.42.5 and try again. I will get back to you with the outcome.

aungthurhahein avatar Apr 08 '16 04:04 aungthurhahein

I tried with kallisto ver. 0.42.5 with the following command and the error still persists.

Command:

kallisto quant -l 605 -s 136 -i Trinity.fasta.kallisto_idx -o aln_out --pseudobam --single lib.fasta

Program halted with the following error:

@SQ     SN:c10356_g10356_i1     LN:1195
@PG     ID:kallisto     PN:kallisto     VN:0.42.5
Segmentation fault (core dumped)

aungthurhahein avatar Apr 18 '16 04:04 aungthurhahein

In your case you seem to have only a single sequence in your index. Can you confirm that this is what you expected?

Can you run kallisto quant without the pseudobam? I'm just trying to isolate whether the problem is in the pseudobam or in other parts.

Can you also report what is written to stderr when you run your command as kallisto quant -l 605 -s 136 -i Trinity.fasta.kallisto_idx -o aln_out --pseudobam --single lib.fasta > lib.sam

pmelsted avatar Apr 19 '16 10:04 pmelsted

The index file has more than one sequence. I just reported the end of the stdout.
I can run kallisto quant without generating pseudobam successfully.

This is the output of both stdout and stderr:

[quant] fragment length distribution is truncated gaussian with mean = 605, sd = 136
[index] k-mer length: 31
[index] number of targets: 10,357
[index] number of k-mers: 5,947,007
[index] number of equivalence classes: 24,998
[quant] running in single-end mode
[quant] will process file 1: /colossus/home/anuphap/EST/EST_lib_IDs/pm/slect_bytissues_pm_chula/PM82_wTempLibID_04092014.txt.PmTwI.seqID.fasta
[quant] finding pseudoalignments for the reads ...@HD   VN:1.0
@SQ     SN:c0_g0_i1     LN:216
@SQ     SN:c1_g1_i1     LN:374
@SQ     SN:c2_g2_i1     LN:197
...
@SQ     SN:c10354_g10354_i1     LN:594
@SQ     SN:c10355_g10355_i1     LN:682
@SQ     SN:c10356_g10356_i1     LN:1195
@PG     ID:kallisto     PN:kallisto     VN:0.42.5
Segmentation fault (core dumped)

Also, "core.xxxx" file is written inside the working directory.

aungthurhahein avatar Apr 19 '16 11:04 aungthurhahein

The file you are aligning has the ending .fasta. Are the sequences truly FASTA entries and not FASTQ? Pseudoalignment outputs SAM records, which are required to have a quality string, so kallisto (probably) fails because a FASTA entry has no quality string.

I'll have to check for this a bit more carefully when doing pseudoalignment.
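Until such a check exists, the input format can be verified by hand from the first character of the first record (">" for FASTA, "@" for FASTQ). The following is a minimal sketch; seq_format is a hypothetical helper, not a kallisto feature, and it assumes an uncompressed text file.

```shell
# seq_format: hypothetical helper, not part of kallisto.
# Prints "fasta", "fastq", or "unknown" based on the first non-empty line.
# Assumption: the input is plain text (decompress .gz files first).
seq_format() {
  first=$(awk 'NF { print substr($0, 1, 1); exit }' "$1")
  case $first in
    '>') echo fasta ;;    # FASTA headers start with ">"
    '@') echo fastq ;;    # FASTQ headers start with "@"
    *)   echo unknown ;;
  esac
}
```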

kallisto never uses the quality values, so you can supply a dummy value, essentially converting the FASTA file to a FASTQ file.

You can try this by just converting the first few sequences of the input file to FASTQ
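One way to do that conversion is sketched below. It is only an illustration: fasta_to_fastq is a hypothetical helper (not part of kallisto), and the dummy quality character "I" is an arbitrary choice, since the quality values are not used.

```shell
# fasta_to_fastq: hypothetical helper, not part of kallisto.
# Reads FASTA on stdin and writes FASTQ on stdout, giving every base
# the same dummy quality character ("I", an arbitrary choice).
fasta_to_fastq() {
  awk '
    function flush_record(   qual) {
      if (name == "") return
      qual = seq
      gsub(/./, "I", qual)            # one dummy quality per base
      printf "@%s\n%s\n+\n%s\n", name, seq, qual
    }
    /^>/ { flush_record(); name = substr($0, 2); seq = ""; next }
    { seq = seq $0 }                  # join wrapped sequence lines
    END { flush_record() }
  '
}
```

For example, head -n 40 lib.fasta | fasta_to_fastq > lib_head.fastq (lib_head.fastq is a made-up name) would convert roughly the first few records, and that file could then be passed to the same kallisto quant command in place of lib.fasta. Whether this avoids the crash is not confirmed in this thread.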

pmelsted avatar Apr 19 '16 13:04 pmelsted

Yes, I confirmed that the .fasta file has no quality values. I didn't mention it before because I didn't expect it could be the cause of the issue.

I will test with .fastq file format and report the outcome soon.

aungthurhahein avatar Apr 20 '16 03:04 aungthurhahein

I also ran into a segfault when generating the pseudobam, but due to a slightly different problem. I was running kallisto using process substitution to deal with an interleaved paired-end file, e.g.

kallisto quant -t 8 -i kallisto.idx -o my_sample --pseudobam <(seqtk seq -1 interleaved.fq) <(seqtk seq -2 interleaved.fq)

Kallisto runs perfectly fine without the --pseudobam flag, but it crashes if I request the pseudobam.

I figured the pseudobam needs re-reading the fastq files, so I tried doing the split beforehand and then the seg fault does not happen (runs fine).
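A minimal sketch of doing the split beforehand, so that kallisto gets regular files it can read more than once. It uses awk instead of the seqtk calls above so that it is self-contained; split_interleaved and the output file names are hypothetical, and it assumes an uncompressed FASTQ with 4-line records whose mates strictly alternate (R1, R2, R1, R2, ...).

```shell
# split_interleaved: hypothetical helper, not part of kallisto or seqtk.
# Usage: split_interleaved interleaved.fq reads_1.fq reads_2.fq
# Assumption: plain-text FASTQ, 4 lines per record, mates alternating.
split_interleaved() {
  awk -v o1="$2" -v o2="$3" '
    { rec = int((NR - 1) / 4) }        # 0-based record index
    rec % 2 == 0 { print > o1; next }  # even records: first mate
    { print > o2 }                     # odd records: second mate
  ' "$1"
}

# Then run kallisto on the two regular files, for example:
# kallisto quant -t 8 -i kallisto.idx -o my_sample --pseudobam reads_1.fq reads_2.fq
```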

It would be nice to at least add this to the docs :). Support for interleaved paired-end files would also be a nice-to-have :)

maubarsom avatar Jul 10 '18 12:07 maubarsom

Hello, I face this problem: "[  bam] writing pseudoalignments to BAM format .. Segmentation fault" and I have no idea how to fix it. I have smartseq.2 single reads, dual indexed. This is how the FASTQ reads look:

@NB551291:160:H55CJBGXF:1:11101:12947:14932 1:N:0:TAAGGCGA+GCGATCTA
GGCGTGTCCCGCGCGTGTGGGGGGAACCTCCGCGTCGGTGTTCCCCCGCCGGGTCCGCCCCCCGGGCCGCGGTTTT
+
AAAA/EAAAEEEA/EEAEEEAEE/E/EEEEEEAEA/EEEEEEEEEEEEEAEEEE/E/EAEEAEEE6AE/</EA///

I run this command:

[user@vm-129-49 mouse1.fastq_gz]$ kallisto quant -i /ad/vlachou/scRNAseq.2/kallisto_analysis/gencode.vM24.transcripts.idx --output-dir /ad/vlachou/scRNAseq.2/kallisto_analysis/kallisto_quant/gencode_indexed/mouse1 --pseudobam --genomebam --gtf /vlachou/scRNAseq.2/kallisto_analysis/gencode.vM24.annotation.gtf.gz --single -l 530 -s 150 -t 16 *fastq.gz

This is the output:

[quant] fragment length distribution is truncated gaussian with mean = 530, sd = 150
[index] k-mer length: 31
[index] number of targets: 142,552
[index] number of k-mers: 120,672,054

[quant] finding pseudoalignments for the reads ... done
[quant] processed 482,819,438 reads, 208,880,499 reads pseudoaligned
[   em] quantifying the abundances ... done
[   em] the Expectation-Maximization algorithm ran for 1,273 rounds
[  bam] writing pseudoalignments to BAM format .. Segmentation fault

I tried the same with Ensembl as the reference, but I get the same problem.

If anyone could help me out, it would be great! Thanks

Evi-050 avatar Apr 14 '20 18:04 Evi-050

Any idea whether this issue has been resolved yet? I am also getting something very similar:

[  bam] writing pseudoalignments to BAM format .. /spin1/swarm/kopardevn/M0tDGHNewa/cmd.10: line 1: 12564 Segmentation fault      ( kallisto quant -i mm10_M21 -o TreatmentB_S72 --bias --plaintext
--fusion --rf-stranded -t 56 --pseudobam --genomebam --gtf genes.gtf -c mm10.genome trim/TreatmentB_S72.R1.trim.fastq.gz trim/TreatmentB_S72.R2.trim.fastq.gz )

kopardev avatar Jul 31 '20 14:07 kopardev

Personally, I went with STAR since I was not in a hurry. Someone in another post suggested going back to an older version that works, but frankly I didn't try it. Also, if I remember correctly, it worked when I removed "--pseudobam --genomebam --gtf genes.gtf" and ran, for example, "kallisto quant -i index -o output --single -l 200 -s 20 file1.fastq.gz file2.fastq.gz file3.fastq.gz".

Very good luck!

Evi-050 avatar Jul 31 '20 14:07 Evi-050

keeps happening to me too in kallisto 0.46.2:

[quant] finding pseudoalignments for the reads ...

[quant] done
[quant] processed 250,960,675 reads, 156,761,018 reads pseudoaligned
[   em] quantifying the abundances ... done
[   em] the Expectation-Maximization algorithm ran for 1,513 rounds
[  bam] writing pseudoalignments to BAM format .. [1]    2673 segmentation fault

It works when I remove the --genomebam flag, but I'd really like to get the BAM file out of this.

redst4r avatar Aug 08 '20 23:08 redst4r