update HACCP_term regex to require FOODON, add multivalue example

Open mslarae13 opened this issue 1 year ago • 6 comments

Address syntax match to examples Update regexs for MIxS

Based on the description for HACCP this requires the FOODON ontology. description: Hazard Analysis Critical Control Points (HACCP) food safety terms; This field accepts terms listed under HACCP guide food safety term (http://purl.obolibrary.org/obo/FOODON_03530221)

While this doesn't validate that the entered term is actually in FOODON, it does at least perform a basic string check.

mslarae13 avatar Jun 14 '24 18:06 mslarae13

I didn't include an example. I am not at all familiar with the FoodAnimalAndAnimalFeed extension. Before I committed time to getting familiar and making an example, I wanted to check that this was a good change.

mslarae13 avatar Jun 28 '24 17:06 mslarae13

Thanks @mslarae13. This is good progress. We can refine it a little:

First of all, how long are the numeric portions of FOODON URIs?

I used ChatGPT 4 to help me with that SPARQL query

7 or 8, after subtracting the 38 characters in the base portion of the URIs, "http://purl.obolibrary.org/obo/FOODON_"
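
The subtraction above can be reproduced with a small Python sketch. It assumes the numeric portion is simply whatever follows the base prefix; the helper name `numeric_length` is hypothetical, and the only URI checked here is the one quoted in the HACCP description, so this illustrates the arithmetic rather than surveying all of FOODON.

```python
# Base portion shared by FOODON URIs, as quoted above.
FOODON_BASE = "http://purl.obolibrary.org/obo/FOODON_"


def numeric_length(uri: str) -> int:
    """Return the number of characters that follow the FOODON base prefix.

    Hypothetical helper for illustration only.
    """
    if not uri.startswith(FOODON_BASE):
        raise ValueError(f"not a FOODON URI: {uri}")
    return len(uri) - len(FOODON_BASE)


# Length of the base portion.
print(len(FOODON_BASE))

# Length of the numeric portion of the URI from the HACCP description.
print(numeric_length("http://purl.obolibrary.org/obo/FOODON_03530221"))
```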

Next I asked ChatGPT 4

I want to write a regular expression for a FOODON label followed by one white-space and then a FOODON CURIE. The CURIEs should be enclosed in square brackets. They start with "FOODON:" and are followed by 7 or 8 digits. The label must start with a non-white-space character but can have any number of any characters after that, as long as they aren't carriage returns, line feeds, etc.

after a little testing with regexr, we came up with

^(\S[^\r\n]*) \[FOODON:\d{7,8}\]$

If we want to use pattern-only validation, I suggest we go with that.
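
A quick way to try the pattern outside regexr is a few lines of Python. This is only a sketch: the square brackets are escaped so they match literally instead of opening a character class, the function name `is_valid_haccp_string` is hypothetical, and Python's `re` flavor may differ in small ways from whatever engine a LinkML validator uses.

```python
import re

# Label, one space, then a bracketed FOODON CURIE with 7 or 8 digits.
# The brackets are escaped so that they are matched literally.
HACCP_PATTERN = re.compile(r"^(\S[^\r\n]*) \[FOODON:\d{7,8}\]$")


def is_valid_haccp_string(value: str) -> bool:
    """Return True if value looks like 'label [FOODON:NNNNNNN]'.

    Hypothetical helper; this is a string-shape check only.
    """
    return HACCP_PATTERN.fullmatch(value) is not None


candidates = [
    "sodium tripolyphosphate [FOODON:03530244]",  # well formed
    " sodium tripolyphosphate [FOODON:03530244]",  # label starts with a space
    "sodium tripolyphosphate [FOODON:123456]",  # only 6 digits
    "FOODON:03530244",  # no label, no brackets
    "hazard 3 [FOODON:03530244]",  # label and ID do not belong together
]

for candidate in candidates:
    print(repr(candidate), is_valid_haccp_string(candidate))
```

Note that the last candidate pairs a label with the wrong ID and would still be accepted, because the pattern only checks the shape of the string.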

turbomam avatar Jul 09 '24 17:07 turbomam

That pattern doesn't check that the label and ID portions match, and it doesn't limit the choices to subclasses of "haccp guide food safety term".

A better LinkML validation strategy for this might be a dynamic enumeration. Dynamic enumerations are expressed with logic, but can be expanded to an enumeration with explicit permissible values. A limitation right now is that the permissible values won't include the label, and the ID won't be enclosed in square brackets. But I would like to use this case to motivate improvements to LinkML dynamic enumerations in support of MIxS.

turbomam avatar Jul 09 '24 18:07 turbomam

The vskit command from the Ontology Access Kit can be used like this

vskit expand -s schema.yaml -o schema_expanded.yaml

to expand this

enums:
  HaccpTerm:
    reachable_from:
      source_ontology: bioregistry:foodon
      source_nodes:
      - FOODON:03530221   ## haccp guide food safety term
      is_direct: false
      relationship_types:
      - rdfs:subClassOf

into this

enums:
  HaccpTerm:
    reachable_from:
      source_ontology: bioregistry:foodon
      source_nodes:
      - FOODON:03530221   ## haccp guide food safety term
      is_direct: false
      relationship_types:
      - rdfs:subClassOf
    permissible_values:
      FOODON:03530231:
        text: FOODON:03530231
        meaning: FOODON:03530231
        title: hazard 3
      FOODON:03530244:
        text: FOODON:03530244
        meaning: FOODON:03530244
        title: sodium tripolyphosphate
      FOODON:03530237:
        text: FOODON:03530237
        meaning: FOODON:03530237
        title: hazard 9

turbomam avatar Jul 09 '24 18:07 turbomam

If using this mechanism sounds promising to you, and you want the OAK code to be modified to emit "sodium tripolyphosphate [FOODON:03530244]" instead of "FOODON:03530244", please up-vote this

  • https://github.com/INCATools/ontology-access-kit/issues/622
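
Until something like that exists, the "label [CURIE]" strings could be derived from the expanded permissible values after the fact. The sketch below assumes the three permissible values shown in the expanded enum have been copied by hand into a dict (a real workflow would read the YAML that vskit writes); the names `labelled_values` and `is_permitted` are hypothetical and are not part of OAK or LinkML.

```python
# Hand-copied subset of the expanded HaccpTerm enum shown earlier.
# Assumption: each permissible value carries its label in `title`.
PERMISSIBLE_VALUES = {
    "FOODON:03530231": {"title": "hazard 3"},
    "FOODON:03530244": {"title": "sodium tripolyphosphate"},
    "FOODON:03530237": {"title": "hazard 9"},
}


def labelled_values(permissible_values: dict) -> set:
    """Build the set of 'label [CURIE]' strings from permissible values."""
    return {
        f"{entry['title']} [{curie}]"
        for curie, entry in permissible_values.items()
    }


ALLOWED = labelled_values(PERMISSIBLE_VALUES)


def is_permitted(value: str) -> bool:
    """Return True only if the label and the CURIE belong together."""
    return value in ALLOWED


print(sorted(ALLOWED))
print(is_permitted("sodium tripolyphosphate [FOODON:03530244]"))
print(is_permitted("hazard 3 [FOODON:03530244]"))
```

Unlike the pattern-only check, a lookup like this rejects a string whose label and ID do not belong together, and it only accepts terms that appear in the expanded enumeration.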

turbomam avatar Jul 09 '24 18:07 turbomam

I agree that the change is suitable. As for the actual pattern being used, I bow to @turbomam's greater expertise on that! The additional idea of using some sort of automated expansion thingy sounds like a good idea to me, so I have thumbs-up'd that ticket in the ontology access toolkit repo.

only1chunts avatar Jul 10 '24 08:07 only1chunts