GFF3toolkit

inquiry about "gff3_to_fasta -st user_defined -u mRNA CDS"

Open yanzhongsino opened this issue 2 years ago • 5 comments

Thank you for your development of the great gff3 tools.

When I use "gff3_to_fasta -st user_defined -u mRNA CDS" to get a fasta file, it seems to output the child (here, CDS) fasta. I'm wondering how to get the parent (here, mRNA) fasta in this situation.

Any reply will be welcome.

yanzhongsino, Apr 22 '22 09:04

Hi @yanzhongsino. How is the mRNA modeled? If there are exon features in addition to CDS, I'd recommend using gff3_to_fasta -st user_defined -u mRNA exon. But that really depends on how the data is modeled in your gff3 file. You can send along a snippet if that helps.

mpoelchau, Apr 22 '22 19:04

Thanks! There are only mRNA and CDS features in the gff3 file, which was downloaded from a published paper. This is a snippet:

LG01 maker mRNA 883182 884411 . - . ID=DR000001;Source=MAKER1:DHR000398.1,MAKER2:DH000416.1;
LG01 maker CDS 884253 884411 . - 0 Parent=DR000001;
LG01 maker CDS 883311 883392 . - 0 Parent=DR000001;
LG01 maker CDS 883182 883219 . - 2 Parent=DR000001;
LG01 maker mRNA 884947 886421 . + . ID=DR000002;
LG01 maker CDS 884947 885114 . + 0 Parent=DR000002;
LG01 maker CDS 885378 885476 . + 0 Parent=DR000002;
LG01 maker CDS 885572 885723 . + 0 Parent=DR000002;
LG01 maker CDS 886034 886421 . + 1 Parent=DR000002;

I found a way to get the parent fasta. First, I changed all mRNA tags to gene in the gff3 file. Then, I used "gff3_to_fasta -st gene" to get the "gene" fasta (these are the mRNA features in the original gff3 file). If there is a smoother way, please tell me.
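For anyone who wants to reproduce the retagging step, here is a minimal Python sketch. It assumes a tab-delimited, 9-column GFF3 file; the function names and the file names in the comments are hypothetical and are not part of GFF3toolkit.

```python
def retag_feature_type(lines, old_type="mRNA", new_type="gene"):
    """Yield GFF3 lines with the type column (column 3) changed from old_type to new_type."""
    for line in lines:
        # Pass comments, directives, and blank lines through unchanged.
        if line.startswith("#") or not line.strip():
            yield line
            continue
        cols = line.rstrip("\n").split("\t")
        # Only touch well-formed 9-column records of the requested type.
        if len(cols) == 9 and cols[2] == old_type:
            cols[2] = new_type
            yield "\t".join(cols) + "\n"
        else:
            yield line


def convert_file(src_path, dst_path):
    """Write a retagged copy of src_path to dst_path (paths are hypothetical examples)."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        dst.writelines(retag_feature_type(src))


# Small in-memory example based on the first two records of the snippet above.
sample = (
    "LG01\tmaker\tmRNA\t883182\t884411\t.\t-\t.\tID=DR000001;\n"
    "LG01\tmaker\tCDS\t884253\t884411\t.\t-\t0\tParent=DR000001;\n"
)
converted = list(retag_feature_type(sample.splitlines(keepends=True)))
```

Calling convert_file("input.gff3", "retagged.gff3") would produce a copy that can then be passed to "gff3_to_fasta -st gene", as described above. Only the type column is rewritten; IDs, Parent attributes, and CDS lines are left as they are.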

yanzhongsino, Apr 25 '22 01:04

@yanzhongsino when you say that you only got the CDS fasta at first, do you mean that you only got the sequences for the individual CDS segments (e.g. in the above example for DR000001, you would get 3 separate fasta records, one for each CDS line)? Or did you get one fasta sequence for the entire CDS?

mpoelchau, Apr 25 '22 22:04

I got one fasta sequence for the entire CDS in the first case.

yanzhongsino, Apr 27 '22 03:04

Thanks for the response @yanzhongsino (and sorry about the delay). I'm curious: are the nucleotide sequences that you got from the two commands the same (mRNA vs. CDS)?

mpoelchau, May 10 '22 18:05