Replace string serialisation with enum for HostSex categories

Open jfy133 opened this issue 1 year ago • 1 comments

Additional questions I have:

  • Is there a CHANGELOG I should update?
  • Should I remove the annotation that the expected value is an enumeration?
  • Are there any other areas of the file I should update?
  • Are there any other files I need to update?
  • Do I need to run any other scripts or generate anything else?
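
For context, the change named in the title can be sketched in Python as below. This is only an illustration of the difference between a pipe-delimited string pattern and an explicit enumeration; `SERIALIZATION`, `matches_serialization`, `HostSexEnum`, and `parse_host_sex` are hypothetical names, and the three values are placeholders rather than the actual MIxS term list.

```python
from enum import Enum

# Hypothetical pipe-delimited pattern in the style of a string serialization.
# The values are placeholders for illustration, not the MIxS term list.
SERIALIZATION = "[female|male|undeclared]"


def matches_serialization(value: str, pattern: str = SERIALIZATION) -> bool:
    """Check a value against a '[a|b|c]' style pattern by string handling."""
    options = pattern.strip("[]").split("|")
    return value in options


class HostSexEnum(str, Enum):
    """Hypothetical enum: each permissible value is an explicit, named member."""

    FEMALE = "female"
    MALE = "male"
    UNDECLARED = "undeclared"


def parse_host_sex(value: str) -> HostSexEnum:
    """Return the enum member for a value; raises ValueError if it is unknown."""
    return HostSexEnum(value)
```

With the enum, the permissible values are declared once as structured data and an unknown value fails on lookup, whereas the string form has to be parsed before it can be checked.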

jfy133 avatar Aug 15 '24 08:08 jfy133

see also

  • https://github.com/GenomicsStandardsConsortium/mixs/pull/840

which includes more repairs along the same lines, but also introduces excessive formatting changes

I recommend merging this as soon as possible

turbomam avatar Aug 20 '24 18:08 turbomam

@turbomam what is the merging protocol here? May I merge once I have an approval, or do I leave it to a senior project member?

jfy133 avatar Dec 09 '24 12:12 jfy133

Thanks, removing strings_serializations is a high priority!

@jfy133 thanks for your contribution. I don't think you need any one senior person to approve this, but the decision to merge is usually made live, during a TWG or CIG meeting on Tuesday.

I will advocate for this, but posting in the GSC Slack and adding an agenda item to either or both of the meeting notes files will help prioritize your PR:

* [GSC Technical WG Meeting Notes 2024](https://docs.google.com/document/d/1MG9JBj9m8Lnev7UBnPGpbQO9ReovswASGNouidjmfx4/edit?tab=t.0#heading=h.2989lvv9mqv5)

* [CIG Running Notes (tomorrow)](https://docs.google.com/document/d/19CWWf1oqMlyH7prteVC5k4eYF_JzJbNqNcvGUyX_U50/edit?tab=t.0#heading=h.mget0ilzdhks)

turbomam avatar Dec 09 '24 18:12 turbomam

Great, thank you! I can't attend the CIG meeting, but I've left an agenda point anyway.

jfy133 avatar Dec 10 '24 10:12 jfy133

Thanks @jfy133. The LinkML implementation here is the ideal outcome.

@pbuttigieg points out that the composition of this enumeration is a case of semantic injection

@mslarae13 points out that the description includes the word "gender"

@turbomam will dig up the NCBI values for this and other sex and gender terms/slots

Do we need to split this out into biological sex and gender terms?

We should include our sources in the LinkML model. @pbuttigieg found this visual for exploring the societal and chromosomal bases of sex and gender. It might not place enough emphasis on the developmental outcome of producing gametes of a particular type.

https://docs.google.com/document/d/19CWWf1oqMlyH7prteVC5k4eYF_JzJbNqNcvGUyX_U50/edit?tab=t.0#heading=h.s0d46i1lne1i

https://static.scientificamerican.com/sciam/cache/file/164FE5CE-FBA6-493F-B9EA84B04830354E_source.jpg

turbomam avatar Dec 10 '24 16:12 turbomam

More references:

  • http://static.scientificamerican.com/sciam/cache/file/164FE5CE-FBA6-493F-B9EA84B04830354E_source.jpg?w=1200
  • https://pubmed.ncbi.nlm.nih.gov/14745830/
  • https://en.m.wikipedia.org/wiki/ZW_sex-determination_system

turbomam avatar Dec 10 '24 16:12 turbomam

Ideal outcome: two clearly distinguished terms

  • sex term
  • gender term

turbomam avatar Dec 10 '24 16:12 turbomam

https://en.m.wikipedia.org/wiki/XO_sex-determination_system

And, in general, for HostSex value spaces (which can be disaggregated): https://en.m.wikipedia.org/w/index.php?title=Sex-determination_system

pbuttigieg avatar Dec 10 '24 16:12 pbuttigieg

For HostGender, we should check out and evaluate https://www.ebi.ac.uk/ols4/ontologies/gsso

pbuttigieg avatar Dec 10 '24 16:12 pbuttigieg

Coming from an anthropological background, glad to see the proposal to split it up!

I was also somewhat uncomfortable with fixing the slot for this reason, but continued to do so for purely technical reasons.

Let me know if you need any further help. Feel free to close the PR if splitting the term is of high priority and the term will be replaced.

jfy133 avatar Dec 10 '24 18:12 jfy133

@turbomam I think we should close this and consolidate issues on this theme, including differentiating Host vs Organism Sequenced.

Ideal outcome: two clearly distinguished terms

  • sex term
  • gender term

I think we'll need more to capture the main axes here.

The likely fields will include fields that are

  • chromosome based (sex determination),
  • development based (accounting for DSDs, see Conditions for example enumeration),
  • developmental intervention based (including surgical and pharmacological, where relevant and permissible by data protection regulations, relevant for various microbiomes e.g. here, here, and here)
  • behaviour based (proxied here by gender presentation and/or self identification)
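
As a rough sketch only, the four axes above could be held as separate optional fields instead of one combined value. Everything here is hypothetical: the class and field names are invented for illustration, the three enum members simply echo the sex-determination systems linked earlier in this thread, and the other axes are left as free text because no value sets have been agreed.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class SexDeterminationSystem(Enum):
    """Hypothetical enum; members echo the systems linked in this thread."""

    XY = "XY"
    ZW = "ZW"
    XO = "XO"


@dataclass
class HostSexRecord:
    """Hypothetical record with one optional field per axis."""

    determination_system: Optional[SexDeterminationSystem] = None  # chromosome based
    developmental: Optional[str] = None  # development based
    intervention: Optional[str] = None  # developmental intervention based
    gender_presentation: Optional[str] = None  # behaviour based

    def recorded_axes(self) -> List[str]:
        """List the axes that carry a value, in field order."""
        return [name for name, value in vars(self).items() if value is not None]
```

Keeping the axes separate means a record can state one axis without implying anything about the others.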

Additional anatomical sites or variants of existing MIxS anatomical groupings will also be needed (e.g. here)

pbuttigieg avatar Dec 17 '24 15:12 pbuttigieg

xref #838 #517

pbuttigieg avatar Dec 17 '24 15:12 pbuttigieg

I've linked this PR in the issue above for documentation, so will close this as suggested by @pbuttigieg

jfy133 avatar Dec 18 '24 11:12 jfy133