Petr Danecek

Results 231 comments of Petr Danecek

@LeeTL1220 Agreed, and this is the right place (and time?) to discuss it. There have been a couple of [emails](http://sourceforge.net/p/vcftools/mailman/message/34104590/) exchanged about this on the vcftools-spec mailing list as well, where...

@d-cameron The draft does say that characters with special meaning (such as ';' in INFO, ':' in FORMAT, and '%' in both) can be encoded using URL encoding.
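To make the URL-encoding point concrete, here is a minimal Python sketch. The helper names `encode_value` and `decode_value` are hypothetical, and the set of encoded characters is limited to the three mentioned in the comment; the draft itself may list more, so treat the table as an assumption rather than the spec's definition.

```python
from urllib.parse import unquote

# Assumption: only the characters named in the comment are encoded here.
# The actual draft may reserve additional characters.
SPECIAL = {"%": "%25", ";": "%3B", ":": "%3A"}


def encode_value(text: str) -> str:
    """Percent-encode each special character; leave everything else untouched.

    Working character by character means a literal '%' is encoded exactly
    once and is never double-encoded.
    """
    return "".join(SPECIAL.get(ch, ch) for ch in text)


def decode_value(text: str) -> str:
    """Reverse the encoding by decoding every %XX escape."""
    return unquote(text)


raw = "50%;a:b"
encoded = encode_value(raw)
print(encoded)
print(decode_value(encoded) == raw)
```

Encoding '%' itself is what keeps the scheme reversible: a decoder can then treat every '%' it sees as the start of an escape.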

@lindenb I think you scared everybody with the examples (including me!!) :-) However, I should say that I like the general idea of having a single way of constructing structured...

Are you sure about the position? A quick check at Ensembl suggests that the transcript has 259 amino acids, so the stop codon would be the 260th, and thus annotated correctly: https://www.ensembl.org/Homo_sapiens/Transcript/Sequence_cDNA?db=core;g=ENSG00000205726;r=21:33642542-33735093;t=ENST00000440794

On the other hand, the actual sequence shows that the last amino acid, L, is the 259th. The same pattern holds for the other transcripts as well, so I think their summary table...

It's the link in my post above https://github.com/samtools/bcftools/issues/1553#issuecomment-896994364

OK, thanks for exploring this. Can this issue be considered resolved?

In the case of an incomplete CDS like this, the GFF file does not give sufficient information to determine the actual protein length. I suspect even the X shown in the resources...

I don't think a CSI index will make any difference; the problem is most likely related to these issues: https://github.com/samtools/bcftools/issues/943, https://github.com/samtools/bcftools/issues/1047. Can you try with the `--single-overlaps` option?

Sorry for the delay in responding. I looked at this again and realized what the problem is: when `annotate` transfers annotations from one file into another, it uses htslib's synchronized...