csq currently does not report the transcript id for variants found in introns if the intron line is not specified in the input GFF.

Open jel4h opened this issue 3 years ago • 3 comments

I have a GFF that includes CDS and exon lines but not intron lines. csq runs using this GFF and correctly identifies variants that fall within introns but it doesn't specify the transcript containing the intron. See example output:

10 1354437210 pol4 C GA 0.936 PASS METHOD=SEEDED;QUERY_POSITION=126092984;BCSQ=intron|Zm00001d026111||protein_coding GT:BCSQ 1|1:3

I was hoping csq would include the transcript id as it does for variants that fall within CDS regions. e.g.

10 1353466989 pol45 T C 0.936 PASS METHOD=SEEDED;QUERY_POSITION=126092764;BCSQ=synonymous|Zm00001d026111|Zm00001d026111_T001|protein_coding|-|119T|1353466989T>C GT:BCSQ 1|1:3
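To make the difference between the two records concrete, here is a minimal Python sketch that splits a BCSQ value into named fields. The field order follows the format string that csq writes into the VCF header (consequence|gene|transcript|biotype|strand|amino_acid_change|dna_change); the helper name `parse_bcsq` is hypothetical, and the sketch assumes plain comma-separated entries with no special handling of `@position` pointers.

```python
# Field order as described in the BCSQ INFO header line written by bcftools csq.
BCSQ_FIELDS = ["consequence", "gene", "transcript", "biotype",
               "strand", "amino_acid_change", "dna_change"]

def parse_bcsq(value):
    """Split one BCSQ value into a list of dicts, one per comma-separated entry.

    parse_bcsq is a hypothetical helper, not part of bcftools.
    """
    records = []
    for entry in value.split(","):
        parts = entry.split("|")
        # Entries such as "intron" carry fewer fields; pad the rest with "".
        parts += [""] * (len(BCSQ_FIELDS) - len(parts))
        records.append(dict(zip(BCSQ_FIELDS, parts)))
    return records

# The two BCSQ values from the records above.
intron = parse_bcsq("intron|Zm00001d026111||protein_coding")[0]
coding = parse_bcsq(
    "synonymous|Zm00001d026111|Zm00001d026111_T001|protein_coding|-|119T|1353466989T>C"
)[0]

print(repr(intron["transcript"]))  # empty: no transcript id for the intron hit
print(coding["transcript"])        # transcript id is present for the CDS hit
```

The intron record has the gene id but an empty third field, which is the gap this issue is about.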

Is this a feature that can be added to csq? The software obviously knows where the introns are and should know which transcripts they come from.

Thanks!

jel4h, Oct 27 '20 16:10

This would amount to listing all transcripts, wouldn't it, since everything that is not an exon is an intron? I don't see the benefit, can you elaborate?

pd3, Oct 28 '20 11:10

Actually, there are often transcript-specific introns. So, it would be useful to know which transcript the intron belongs to. As you can see in the example below, this gene has 2 transcripts, and the top transcript contains a unique first intron. So, if a variant occurred in that unique intron, it'd be very useful if the output denoted the transcript containing the intron as it does for variants that fall within exons.

[image: gene model with two transcripts; the top transcript has a unique first intron]
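As a rough illustration of the idea (not how csq works internally), introns can be derived per transcript as the gaps between consecutive exon lines of that transcript, and a variant position can then be checked against each transcript's gaps. The transcript ids and coordinates below are hypothetical, and the sketch assumes exons within a transcript do not overlap.

```python
from collections import defaultdict

def introns_by_transcript(exons):
    """exons: iterable of (transcript_id, start, end), 1-based inclusive as in GFF.

    Returns {transcript_id: [(intron_start, intron_end), ...]}.
    """
    by_tx = defaultdict(list)
    for tx, start, end in exons:
        by_tx[tx].append((start, end))
    introns = {}
    for tx, ivals in by_tx.items():
        ivals.sort()
        # An intron is the gap between two consecutive exons of one transcript.
        introns[tx] = [(prev_end + 1, next_start - 1)
                       for (_, prev_end), (next_start, _) in zip(ivals, ivals[1:])
                       if next_start - prev_end > 1]
    return introns

def transcripts_with_intron_at(introns, pos):
    """Return the sorted ids of transcripts that have an intron covering pos."""
    return sorted(tx for tx, gaps in introns.items()
                  if any(start <= pos <= end for start, end in gaps))

# Hypothetical gene with two transcripts; only T001 has the first exon,
# so the first intron is unique to T001.
exons = [
    ("gene1_T001", 100, 200), ("gene1_T001", 400, 500), ("gene1_T001", 700, 800),
    ("gene1_T002", 400, 500), ("gene1_T002", 700, 800),
]
introns = introns_by_transcript(exons)
unique_hit = transcripts_with_intron_at(introns, 300)  # inside the T001-only intron
shared_hit = transcripts_with_intron_at(introns, 600)  # inside the shared intron
print(unique_hit)
print(shared_hit)
```

A position in the first intron maps to one transcript only, while a position in the second intron maps to both, which is the case where listing every transcript would lengthen the record.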

jel4h, Oct 28 '20 16:10

Most of the intronic sequence is shared between all or most transcripts, so I am reluctant to blow up the records like this unless the information is essential.

I'll mark this as a feature request (to add an option to annotate introns for all transcripts), but I'll need more persuasion to do it, as it's a fairly limited use case.

pd3, Oct 28 '20 17:10