copy number assessment subtree proposal

Open mbaudis opened this issue 3 years ago • 7 comments

What is this request referring to? Result of genomic copy number assessment of a genomic element or region

What is the name you would like SO to give the term? copy number assessment and child terms

id: SO:nnnn01
label: copy number assessment
  |
  |-id: SO:nnnn02
  | label: regional base ploidy
  |   |
  |   |-id: SO:nnnn04
  |     label: copy-neutral loss of heterozygosity
  |
  |-id: SO:nnnn03
    label: copy number variation
      |
      |-id: SO:nnnn05
      | label: copy number loss
      |   |
      |   |-id: SO:nnnn07
      |   | label: low-level copy number loss
      |   |
      |   |-id: SO:nnnn08
      |     label: complete genomic deletion
      |
      |-id: SO:nnnn06
        label: copy number gain
          |
          |-id: SO:nnnn09
          | label: low-level copy number gain
          |
          |-id: SO:nnnn10
            label: high-level copy number gain
            note: commonly but not consistently used for >=5 copies on a bi-allelic genome region
              |
              |-id: SO:nnnn11
                label: focal genome amplification
                note: >-
                  commonly used for localized multi-copy genome amplification events where the
                  region does not extend >3Mb (varying 1-5Mb) and may exist in a large number of
                  copies
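
For anyone who wants to work with the draft hierarchy programmatically, here is a minimal sketch of the tree above as a child-to-parent mapping. The `SO:nnnnNN` identifiers are the placeholders from this proposal, not assigned SO accessions, and the names `CNV_TERMS` and `ancestors` are made up for illustration.

```python
# Draft term tree from this proposal: id -> (label, parent id).
# The SO:nnnnNN ids are placeholders, not real SO accessions.
CNV_TERMS = {
    "SO:nnnn01": ("copy number assessment", None),
    "SO:nnnn02": ("regional base ploidy", "SO:nnnn01"),
    "SO:nnnn03": ("copy number variation", "SO:nnnn01"),
    "SO:nnnn04": ("copy-neutral loss of heterozygosity", "SO:nnnn02"),
    "SO:nnnn05": ("copy number loss", "SO:nnnn03"),
    "SO:nnnn06": ("copy number gain", "SO:nnnn03"),
    "SO:nnnn07": ("low-level copy number loss", "SO:nnnn05"),
    "SO:nnnn08": ("complete genomic deletion", "SO:nnnn05"),
    "SO:nnnn09": ("low-level copy number gain", "SO:nnnn06"),
    "SO:nnnn10": ("high-level copy number gain", "SO:nnnn06"),
    "SO:nnnn11": ("focal genome amplification", "SO:nnnn10"),
}


def ancestors(term_id):
    """Return the labels from term_id up to the root term, nearest first."""
    path = []
    while term_id is not None:
        label, parent = CNV_TERMS[term_id]
        path.append(label)
        term_id = parent
    return path
```

Calling `ancestors("SO:nnnn11")` walks from focal genome amplification up through high-level copy number gain, copy number gain and copy number variation to the root.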

What is the definition that you would like for this term? Assessment of the copy number of a genomic feature or region, referenced to the expected allele count in the given sample. Examples of an expected count would be:

  • autosomal chromosome in human germline: 2
  • X-chromosome in human male: 1
  • triploid cancer cell line: 3
    • i.e. a region with 2 alleles in a triploid cell line would correspond to a low-level copy number loss
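
To make the "referenced to the expected allele count" idea concrete, here is a rough sketch of how an observed count could be mapped onto the proposed labels. The function name and the `high_level_min` parameter are hypothetical; the proposal does not fix a cut-off for high-level gain, so without one the sketch falls back to the parent term "copy number gain".

```python
def classify_copy_number(observed, expected, high_level_min=None):
    """Map an observed copy number onto the draft labels, relative to the
    expected count for this region in this sample.

    high_level_min is an assumed, caller-supplied cut-off for high-level
    gain (e.g. 5 on a bi-allelic region); it is not part of the proposal.
    """
    if observed < 0 or expected <= 0:
        raise ValueError("observed must be >= 0 and expected must be > 0")
    if observed == expected:
        return None  # no copy number variation relative to the baseline
    if observed == 0:
        return "complete genomic deletion"
    if observed < expected:
        return "low-level copy number loss"
    if high_level_min is None:
        return "copy number gain"  # cannot split low- vs high-level gain
    if observed >= high_level_min:
        return "high-level copy number gain"
    return "low-level copy number gain"
```

With this, `classify_copy_number(2, 3)` gives "low-level copy number loss", matching the triploid cell line example above.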

Synonyms The root term would be equal to "CNV assessment" or "CNV evaluation"; details for the child terms will be added while developing this proposal.

Parent Term sequence_comparison (SO: 0002072)

This seems to be the most fitting term but suggestions welcome.

Relevant Publications During the development of GA4GH Beacon v2 structural query documentation we found a lack of a consistent representation of CNV events and incomplete overlap between the concepts used in the "CNV community" (rare diseases and cancer) and SO representation. Adding @dsalgado, @ahwagner and @babisingh to the conversation.


This proposal relates to the need for the GA4GH VRS standard - but also in general for clarity about reporting CNVs - to have a documented set of terms to refer to. Note here https://github.com/ga4gh/vrs/issues/277


Updated on 2022-01-14 w/ some re-wording and addition of focal genome amplification

mbaudis avatar Dec 12 '21 12:12 mbaudis

One thing I would add to this proposal is a clear definition of what constitutes low-level gain vs amplification. I have heard amplification loosely defined as >=8 allele copies in a diploid genome. I do not have any strong preference as to what this cutoff is, only that it is clearly specified in the definition. We should seek to align with definitions from a prominent authority.

For "homozygous deletion" entry perhaps we generalize this to "complete CN loss" or similar? Homozygous as a term is strongly tied to diploid genetics.

ahwagner avatar Dec 14 '21 22:12 ahwagner

@ahwagner Great comments; supporting the "high level" statement with some literature/references is an obvious need (as are some other definitions - I just wanted to provide a draft for discussions...); and I agree w/ the complete >> homozygous (had the same feeling but didn't follow up -> waiting for voices :-)

mbaudis avatar Dec 15 '21 13:12 mbaudis

Different cut-off values are in use for amplification (which I also find confusing):

amplification:

average genome ploidy <= 2.7 AND total copy number >= 5

OR average genome ploidy > 2.7 AND total copy number >= 9

amplification: >8 copies

amplification: >5 copies

amplification: >=5 copies
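
As an illustration only, the first, ploidy-dependent rule listed above could be sketched like this. The function and parameter names are made up; the default values are just the numbers quoted in that rule and are not meant as a recommended standard.

```python
def is_amplified(total_copy_number, average_ploidy,
                 ploidy_cutoff=2.7, low_ploidy_min_cn=5, high_ploidy_min_cn=9):
    """Ploidy-dependent amplification call, following the first rule above:
    ploidy <= 2.7 needs total CN >= 5, ploidy > 2.7 needs total CN >= 9.
    Defaults are the quoted values, not an agreed definition.
    """
    if average_ploidy <= ploidy_cutoff:
        return total_copy_number >= low_ploidy_min_cn
    return total_copy_number >= high_ploidy_min_cn
```

So a region with 5 copies counts as amplified on a near-diploid genome (`is_amplified(5, 2.0)`) but not on a near-triploid one (`is_amplified(5, 3.0)`), which is exactly why the fixed ">5" or ">8" cut-offs disagree with each other.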

hangjiaz avatar Dec 15 '21 14:12 hangjiaz

Pinging @egchr ...

mbaudis avatar Dec 15 '21 14:12 mbaudis

@hangjiaz @ahwagner So this is rather consistent for a CN >= 5 on ploidy of ~2, w/ sometimes higher values used w/o defined baseline. However, I would just provide this as a reference, not as a prescription.

mbaudis avatar Jan 05 '22 09:01 mbaudis

I have made some changes; pls. see the updated tree ...

mbaudis avatar Jan 14 '22 09:01 mbaudis

The new tree is now reflected in EFO, including the high-level copy number loss class added during GA4GH VRS 1.3 alignment.

https://www.ebi.ac.uk/ols4/ontologies/efo/classes/http%253A%252F%252Fwww.ebi.ac.uk%252Fefo%252FEFO_0030063?lang=en

mbaudis avatar May 19 '23 13:05 mbaudis