RNA-Bloom

questions about how to get genes from the output

Open alexyfyf opened this issue 11 months ago • 4 comments

Please report

  • [x] version of RNA-Bloom with java -jar RNA-Bloom.jar -version: RNA-Bloom v2.0.1
  • [x] version of java with java -version: openjdk version "18.0.1" 2022-04-19
  • [x] exact command used to run RNA-Bloom: rnabloom -long ${FILE} -t 48 -outdir ${NAME}

Hi Ka Ming,

I'm using RNA-Bloom2 to assemble long-read cDNA RNA-seq data, and I have a question about the output. I can see that the transcripts.fa files contain the sequence of each transcript, but how can I tell which transcripts come from the same gene? I don't see that information in the headers. Some example headers are shown here:

>rb_90719 l=1982 c=0.25546062 path=[94775+,95098+]
>rb_90720 l=407 c=0.21744472 s=103012

Also, I'm not sure why some headers show s while others show path. Is there any difference?

Thank you so much for any help explaining this.

Cheers, Alex

alexyfyf avatar Aug 02 '23 04:08 alexyfyf

RNA-Bloom does not infer genes.

path indicates that the transcript was assembled from the listed sequences from the previous step of the assembly. s indicates that it originates from a single sequence.
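To make the distinction concrete, here is a minimal Python sketch that splits the two example headers above into fields and records whether each transcript came from a path or from a single sequence. The function name parse_header and the "source" key are hypothetical, not part of RNA-Bloom. The sketch assumes that fields are separated by whitespace, that the path list contains no spaces, and that each path entry is an ID followed by a one-character sign; the meaning of that sign, and of l and c, is not stated in this thread, so they are kept as plain strings.

```python
import re

# Header layout taken from the two example headers in this thread:
#   >rb_90719 l=1982 c=0.25546062 path=[94775+,95098+]
#   >rb_90720 l=407 c=0.21744472 s=103012
HEADER_RE = re.compile(r"^>(\S+)\s*(.*)$")


def parse_header(line):
    """Split one header line into a dict (hypothetical helper, not an RNA-Bloom API)."""
    match = HEADER_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a FASTA header: {line!r}")

    record = {"id": match.group(1), "source": None}
    for field in match.group(2).split():
        key, _, value = field.partition("=")
        if key == "path":
            # Assembled from several sequences of the previous assembly step.
            # Assumption: each entry is an ID plus a trailing one-character sign.
            record["source"] = "path"
            entries = [e for e in value.strip("[]").split(",") if e]
            record["path"] = [(e[:-1], e[-1]) for e in entries]
        elif key == "s":
            # Originates from a single sequence.
            record["source"] = "single"
            record["s"] = value
        else:
            # Other fields (l, c) are kept verbatim as strings.
            record[key] = value
    return record


if __name__ == "__main__":
    for header in (
        ">rb_90719 l=1982 c=0.25546062 path=[94775+,95098+]",
        ">rb_90720 l=407 c=0.21744472 s=103012",
    ):
        print(parse_header(header))
```

Note that this only reads what the headers already contain; it does not group transcripts into genes.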

kmnip avatar Aug 03 '23 05:08 kmnip

Thank you so much for your reply. From your experience, do you have any suggestions on how to infer genes from RNA-Bloom2 output?

Cheers, Alex

alexyfyf avatar Aug 06 '23 11:08 alexyfyf

You could try this: http://arthropods.eugenes.org/EvidentialGene/other/sra2genes_testdrive/sra2genes4v_testdrive/

If you are interested in a crude gene grouping of assembled transcripts, I can make it a feature request (but very low priority).

kmnip avatar Aug 09 '23 02:08 kmnip

Thank you so much. I would definitely like to have this feature in the future.

alexyfyf avatar Aug 09 '23 05:08 alexyfyf