Post-TOGA queries related to UL (Uncertain loss) and L (loss) status

Open vinitamehlawat opened this issue 5 months ago • 2 comments

Hello Dr. @MichaelHiller

Greetings, I have some post-TOGA queries:

I am struggling with the "L" and "UL" datasets. Please consider the following tree, where my research interest is not a single species but a whole lineage (the branch marked with the arrow). I am interested in the common set of genes that are lost (L) on that branch.

TOGA_tree_image

  1. If I consider only genes that are "L" in every species on that branch, I skip a lot of data. For example, suppose a gene is L in 11 of the 14 in-group species but UL in the other 3. Is it okay to say that the gene is lost?
  2. If a gene is clearly "L" in the TOGA output, is it still necessary to check transcriptome data to see whether the gene is transcribed? Or is TOGA robust enough that we can call it a clear loss, meaning no functional protein is produced from that gene?
  3. For "UL", I went through some discussions in the TOGA GitHub issues, but their status is still not clear to me. Is it right to say that if a gene has 10 transcripts, I could get a true hit in the transcriptome for, say, 6 of them, if they carry inactivating mutations but do not fulfill the "loss" criteria?
  4. What would be a good way to analyze the "UL" data in more detail: a transcriptome dataset or a RELAX selection test?
  5. In the sample tree provided, if a gene is Intact up to Query3, and then on my focal branch the gene is either L or UL in all species but never Intact, what would be the best status to assign to that gene in your view?
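
To make questions 1 and 5 concrete, here is a minimal Python sketch of the tally I have in mind. It assumes the per-species gene statuses have already been parsed into dictionaries; the species names, gene ID, function names, and category labels are all hypothetical and are not part of TOGA.

```python
from collections import Counter


def tally_gene_status(status_by_species, gene):
    """Count how often each status occurs for `gene` across the in-group species."""
    return Counter(
        statuses.get(gene, "missing") for statuses in status_by_species.values()
    )


def classify_branch(counts, n_species):
    """Assign a hypothetical branch-level label from the per-species counts."""
    n_l = counts.get("L", 0)
    n_ul = counts.get("UL", 0)
    if n_l == n_species:
        return "L_in_all"        # strict criterion: lost in every species
    if n_l + n_ul == n_species:
        return "L_or_UL_in_all"  # relaxed criterion: never Intact on the branch
    return "other"


# Hypothetical example: 14 in-group species, gene is L in 11 and UL in 3.
species = [f"sp{i}" for i in range(1, 15)]
status_by_species = {sp: {"GENE_A": "L"} for sp in species[:11]}
status_by_species.update({sp: {"GENE_A": "UL"} for sp in species[11:]})

counts = tally_gene_status(status_by_species, "GENE_A")
label = classify_branch(counts, len(status_by_species))
print(dict(counts), label)
```

With the strict criterion this gene would be dropped; with the relaxed one it would be kept, which is exactly the case I am unsure about.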

I know this is a lot to ask, but I would really appreciate your thoughtful suggestions; they will help me sort my data in a more logical way.

Looking forward to hearing from you.

Best regards,
Vinita

vinitamehlawat, Sep 18 '24 01:09