pyGenomeViz icon indicating copy to clipboard operation
pyGenomeViz copied to clipboard

Cannot visualise rRNA units from gff file

Open Edouard94 opened this issue 9 months ago • 3 comments

Hello,

I am trying to visualise my rRNA units with the following script:

# Limited Range Features

from Bio.SeqFeature import SeqFeature
from pygenomeviz import Gff, GenomeViz
import matplotlib.pyplot as plt

# Load GFF file
gff_file = "Gb_microsporidia_assemblyGFF.gff"
gff = Gff(gff_file, min_range=1, max_range=170000)

# Plot min-max range GFF features with sublabel
gv = GenomeViz(fig_track_height=0.7, tick_style="bar")
track = gv.add_feature_track(name=gff.name, size=gff.range_size, start_pos=gff.min_range, labelsize=0)
track.add_gff_features(gff, feature_type="CDS", facecolor="skyblue", linewidth=0.5, label_type="gene", labelvpos="top")
track.add_gff_features(gff, feature_type="rRNA", facecolor="lime", linewidth=0.5, label_type="product", labelvpos="top", patch_kws=dict(hatch="/"))
track.set_sublabel(f"{gff.name}\n\n{gff.min_range} - {gff.max_range} bp", ymargin=1.0)

fig = gv.plotfig()

# Save the figure
plt.savefig("genome_range_plot.png", dpi=1000)

but it was not working as I think my gff format is no quite the one expected. Here are the location of the rRNA units in the gff file:

CONTIG_177_POLISHED	EMBL/GenBank/SwissProt	rRNA	162154	166014	.	-	.	note "rRNA unit" 
CONTIG_181_POLISHED	EMBL/GenBank/SwissProt	rRNA	4102	7964	.	-	.	note "rRNA unit" 
CONTIG_293_POLISHED	EMBL/GenBank/SwissProt	rRNA	1	734	.	-	.	note "rRNA unit"

Any idea on how I could plot them with the track.add_gff_features or feature.qualifiers.get functions.

Thank you for your help!

Edouard94 avatar Apr 29 '24 16:04 Edouard94

Hi @Edouard94,

I'm not sure what visualization results you expect to get. Were you expecting the three rRNAs listed in the example GFF file to be plotted on one track ranging from 0 - 170,000bp?

If the above assumption is correct, then the idea of ​​plotting rRNAs present in different contigs on one track is wrong, so it is natural that the plot will not be as you expected.

moshi4 avatar Apr 30 '24 02:04 moshi4

Hello @moshi4,

Thank you for your reply, yes I can understand that having multiple rRNAs units is wrong; I wanted to plot this surprising annotation and try to visualise the placement of these units along my genome.

But I also retry to plot only one rRNA unit (located between 1 and 1000), and again the expected 'lime' unit is not produced, is this because of my gff format file? (see my first comment)

I used the same code as before, and just changed the range from 1 to 1000 bp.

genome_range_plot

Edouard94 avatar May 04 '24 12:05 Edouard94

Sorry, I don't quite understand what you are trying to do, so it is difficult to answer. At least the GFF file is reading correctly, so I think the format is correct. It seems to me that you are not understanding the previous comment correctly, so please take a look at the following code example to try to understand what I mean.

Code Example

GCF_000959065.1.gff.gz

from pygenomeviz import Gff, GenomeViz

gff_file = "GCF_000959065.1.gff.gz"
gff = Gff(gff_file)

gv = GenomeViz()

for seqid, size in gff.get_seqid2size().items():
    track = gv.add_feature_track(seqid, size)
    track.set_sublabel()
    cds_features = gff.get_seqid2features("CDS")[seqid]
    track.add_features(cds_features, facecolor="skyblue", plotstyle="arrow")
    rrna_features = gff.get_seqid2features("rRNA")[seqid]
    track.add_features(rrna_features, facecolor="lime", plotstyle="arrow")

gv.savefig("example01.png")

example01.png

example01

from pygenomeviz import Gff, GenomeViz

gff_file = "GCF_000959065.1.gff.gz"
target_seqid = "NZ_LAEX01000007.1"
gff = Gff(gff_file, target_seqid=target_seqid, min_range=50000, max_range=100000)

gv = GenomeViz()

track = gv.add_feature_track(target_seqid, gff.range_size, start_pos=gff.min_range)
track.set_sublabel()
cds_features = gff.extract_features("CDS")
track.add_features(cds_features, facecolor="skyblue", plotstyle="arrow")
rrna_features = gff.extract_features("rRNA")
track.add_features(rrna_features, facecolor="lime", plotstyle="arrow")

gv.savefig("example02.png")

example02.png

example02

moshi4 avatar May 04 '24 14:05 moshi4