Couldn't read sequence
I do not know how to post an issue, so I am pasting the output here. Hope you can help me.
>isolator analyze -o xxx.hdf5 -g mm10.fa -p 4 RefSeq_Genes_mm10.gtf xxx.bam
The BAM file above is the output of STAR and has been sorted.
The following is the stdout:
_ _ _
(_)___ ___ | | __ _| |_ ___ _ __
| / __|/ _ \| |/ _` | __/ _ \| '__|
| \__ \ (_) | | (_| | || (_) | |
|_|___/\___/|_|\__,_|\__\___/|_|
Version: 0.0.2-102-g24bafc0
Instruction set: AVX
[09:29:35] 3827 cassette exons
[09:29:38] 436 retained introns
[09:29:39] 224966 consensus exons
[09:33:02] Too few paired-end reads to estimate fragment length distribution.
[09:35:12] 3' bias: 1.698e-06
[10:08:32] Couldn't read sequence GL456210.1
Estimating fragment weights (xxx.bam):
[==========================================================> ] 98.4% 1:31 ETA
>
The line "Couldn't read sequence GL456210.1" is shown in red, and the output xxx.hdf5 file is about 400 kB.
So I don't know what I should do next...
This error is due to a transcript in your GTF file being on the "GL456210.1" sequence, which is an unplaced contig in the mm10 assembly, but that sequence is not present in mm10.fa.
It's a little strict about that, so you either have to use a more complete reference sequence, or filter out entries from the GTF file that are on such sequences.
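For the second option, here is a minimal sketch of the filtering step in Python. It is not part of isolator. It assumes the FASTA headers name each sequence as the first whitespace-delimited token after ">", and that the GTF is tab-separated with the sequence name in the first column. The function names and the output file name used below are made up for illustration.

```python
def fasta_sequence_names(fasta_lines):
    """Collect sequence names from FASTA header lines.

    Assumption: the name is the first whitespace-delimited token after '>'.
    """
    names = set()
    for line in fasta_lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.add(fields[0])
    return names


def filter_gtf(gtf_lines, known_names):
    """Yield GTF lines whose first column is a sequence found in the FASTA.

    Comment lines (starting with '#') and blank lines are passed through.
    """
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            yield line
            continue
        seqname = line.split("\t", 1)[0]
        if seqname in known_names:
            yield line


def filter_gtf_file(fasta_path, gtf_in_path, gtf_out_path):
    """Write a copy of the GTF restricted to sequences present in the FASTA."""
    with open(fasta_path) as fasta:
        names = fasta_sequence_names(fasta)
    with open(gtf_in_path) as src, open(gtf_out_path, "w") as dst:
        dst.writelines(filter_gtf(src, names))
```

Something like `filter_gtf_file("mm10.fa", "RefSeq_Genes_mm10.gtf", "filtered.gtf")` would then write a GTF (the name "filtered.gtf" is arbitrary) that you can pass to isolator in place of the original.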
Thank you so much for your reply.