Couldn't read sequence

Open er0080808 opened this issue 8 years ago • 2 comments

I do not know how to post an issue, so I am pasting the command and its output here. I hope you can help me.

>isolator analyze -o xxx.hdf5 -g mm10.fa -p 4 RefSeq_Genes_mm10.gtf xxx.bam

The BAM file above is the output of STAR and has been sorted. The following is the stdout:

 _           _       _
(_)___  ___ | | __ _| |_ ___  _ __
| / __|/ _ \| |/ _` | __/ _ \| '__|
| \__ \ (_) | | (_| | || (_) | |
|_|___/\___/|_|\__,_|\__\___/|_|

Version: 0.0.2-102-g24bafc0
Instruction set: AVX
[09:29:35] 3827 cassette exons
[09:29:38] 436 retained introns
[09:29:39] 224966 consensus exons
[09:33:02] Too few paired-end reads to estimate fragment length distribution.
[09:35:12] 3' bias: 1.698e-06
[10:08:32] Couldn't read sequence GL456210.1

Estimating fragment weights (xxx.bam):
[==========================================================> ] 98.4%    1:31 ETA
>

The line "Couldn't read sequence GL456210.1" is shown in red, and the output file xxx.hdf5 is only about 400 kB.

I don't know what I should do next.

er0080808 · Apr 19 '17 03:04

This error occurs because a transcript in your GTF file is on the "GL456210.1" sequence, an unplaced contig in the mm10 assembly, and that sequence is not present in mm10.fa.

Isolator is a little strict about that, so you either have to use a more complete reference sequence, or filter out the entries in the GTF file that are on such sequences.
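For the second option, here is a minimal sketch of one way to do the filtering. It is not part of isolator; the function names and the script layout are only illustrative. It assumes the sequence name in the FASTA file is the text after ">" up to the first whitespace, and that the first tab-separated column of each GTF record holds the sequence name.

```python
import sys


def fasta_sequence_names(fasta_lines):
    """Collect sequence names from FASTA header lines.

    Assumption: the name is the text after '>' up to the first whitespace.
    """
    names = set()
    for line in fasta_lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.add(fields[0])
    return names


def filter_gtf(gtf_lines, names):
    """Yield comment/blank lines and the GTF records whose sequence name
    (first tab-separated column) is present in `names`."""
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            yield line
            continue
        seqname = line.split("\t", 1)[0]
        if seqname in names:
            yield line


def main(fasta_path, gtf_path, out_path):
    # Read the reference once to learn which sequences it contains.
    with open(fasta_path) as fasta:
        names = fasta_sequence_names(fasta)
    # Write only the annotation records that the reference can support.
    with open(gtf_path) as gtf, open(out_path, "w") as out:
        out.writelines(filter_gtf(gtf, names))


# Runs only when exactly three paths are given on the command line:
# the FASTA file, the input GTF, and the filtered GTF to write.
if __name__ == "__main__" and len(sys.argv) == 4:
    main(*sys.argv[1:4])
```

Saved under a name of your choosing and run with mm10.fa, RefSeq_Genes_mm10.gtf, and an output path as its three arguments, this would write a GTF containing only records on sequences found in mm10.fa, which could then be passed to isolator in place of the original annotation.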

dcjones · Apr 19 '17 19:04

Thank you so much for your reply.

er0080808 · Apr 20 '17 05:04