
Biolink v4.2.0 Incompatibilities

Open ecwood opened this issue 1 year ago • 5 comments

The failure of the CI run for 55e2c16 suggests that the predicate mappings are not up-to-date with Biolink v4.2.0:

+ /home/runner/kg2-venv/bin/python3 -u -u /home/runner/work/RTX-KG2/RTX-KG2/RTX-KG2/validate_predicate_remap_yaml.py /home/runner/work/RTX-KG2/RTX-KG2/RTX-KG2/curies-to-urls-map.yaml /home/runner/work/RTX-KG2/RTX-KG2/RTX-KG2/predicate-remap.yaml https://raw.githubusercontent.com/biolink/biolink-model/v4.2.0/src/biolink_model/schema/biolink_model.yaml /home/runner/kg2-build/biolink_model.yaml
/home/runner/kg2-venv/lib/python3.7/site-packages/rdflib_jsonld/__init__.py:12: DeprecationWarning: The rdflib-jsonld package has been integrated into rdflib as of rdflib==6.0.0.  Please remove rdflib-jsonld from your project's dependencies.
  DeprecationWarning,
Traceback (most recent call last):
  File "/home/runner/work/RTX-KG2/RTX-KG2/RTX-KG2/validate_predicate_remap_yaml.py", line 195, in <module>
    f"{relation} should map to {allowed_biolink_curies_set} ({mapping_term_used.split('_')[0]})"
AssertionError: SEMMEDDB:ADMINISTERED_TO should map to {'biolink:applied_to_treat'} (broad)

ecwood avatar Jun 26 '24 00:06 ecwood

I'm not sure what to do about SEMMEDDB:TREATS, since it is mapped in Biolink to a mixin.

+ /home/runner/kg2-venv/bin/python3 -u -u /home/runner/work/RTX-KG2/RTX-KG2/RTX-KG2/validate_predicate_remap_yaml.py /home/runner/work/RTX-KG2/RTX-KG2/RTX-KG2/curies-to-urls-map.yaml /home/runner/work/RTX-KG2/RTX-KG2/RTX-KG2/predicate-remap.yaml https://raw.githubusercontent.com/biolink/biolink-model/v4.2.0/src/biolink_model/schema/biolink_model.yaml /home/runner/kg2-build/biolink_model.yaml
/home/runner/kg2-venv/lib/python3.7/site-packages/rdflib_jsonld/__init__.py:12: DeprecationWarning: The rdflib-jsonld package has been integrated into rdflib as of rdflib==6.0.0.  Please remove rdflib-jsonld from your project's dependencies.
  DeprecationWarning,
Traceback (most recent call last):
  File "/home/runner/work/RTX-KG2/RTX-KG2/RTX-KG2/validate_predicate_remap_yaml.py", line 165, in <module>
    assert core_predicate not in biolink_mixins, (relation, core_predicate, {'Mixins': biolink_mixins})
AssertionError: ('SEMMEDDB:TREATS', 'biolink:treats_or_applied_or_studied_to_treat', {'Mixins': ['biolink:interacts_with', 'biolink:increases_amount_or_activity_of', 'biolink:decreases_amount_or_activity_of', 'biolink:chemical_role_mixin', 'biolink:biological_role_mixin', 'biolink:promotes_condition', 'biolink:treats', 'biolink:treated_by', 'biolink:treats_or_applied_or_studied_to_treat', 'biolink:subject_of_treatment_application_or_study_for_treatment_by', 'biolink:chemical_entity_or_drug_or_treatment']})

ecwood avatar Jun 26 '24 01:06 ecwood

Per clarification from Sierra Moxon, using Biolink predicates marked mixin: true directly as predicates in triples is now allowed, so I believe we can relax the Biolink mixin check. The separate TRAPI validator may still complain about it, but I have checked with Sierra more than once to confirm that use of mixins is (and will continue to be) allowed. It may just take some time for the TRAPI validator to be updated to reflect that. And of course, we'll want to update our validator in validate_predicate_remap_yaml.py. Thank you!!
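A minimal sketch of what relaxing the mixin check might look like, based only on the assertion shown in the traceback above. The function name `check_core_predicate` and the `allow_mixins` flag are hypothetical and are not taken from validate_predicate_remap_yaml.py; the real script runs this assertion inline at module level.

```python
def check_core_predicate(relation, core_predicate, biolink_mixins, allow_mixins=True):
    """Return core_predicate, rejecting mixin predicates only in strict mode.

    `allow_mixins` is a hypothetical switch: True reflects the relaxed policy
    discussed in this thread, False reproduces the original assertion.
    """
    if not allow_mixins:
        # Same assertion (and payload) as in the CI traceback.
        assert core_predicate not in biolink_mixins, \
            (relation, core_predicate, {'Mixins': biolink_mixins})
    return core_predicate


# Shortened stand-in for the mixin list printed in the CI log.
biolink_mixins = ['biolink:treats',
                  'biolink:treats_or_applied_or_studied_to_treat']

# With the relaxed policy, the SEMMEDDB:TREATS mapping passes.
result = check_core_predicate('SEMMEDDB:TREATS',
                              'biolink:treats_or_applied_or_studied_to_treat',
                              biolink_mixins)
```

Keeping the strict path behind a flag (rather than deleting the assertion) would make it easy to re-check against the TRAPI validator's current behavior, but that is a design choice, not something decided in this thread.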

saramsey avatar Jun 27 '24 15:06 saramsey

For SEMMEDDB:administered_to and SEMMEDDB:ADMINISTERED_TO, I favor the more generic biolink:treats_or_applied_or_studied_to_treat since I suspect a lot of edges picked up by SemMedDB will actually be investigational (e.g., "we tried administering silvadene creme to eczema lesions" or whatever) rather than clinical practice.

For SEMMEDDB:associated_with, yes, biolink:associated_with looks appropriate. Thank you!!

saramsey avatar Jun 27 '24 15:06 saramsey

> For SEMMEDDB:administered_to and SEMMEDDB:ADMINISTERED_TO, I favor the more generic biolink:treats_or_applied_or_studied_to_treat since I suspect a lot of edges picked up by SemMedDB will actually be investigational (e.g., "we tried administering silvadene creme to eczema lesions" or whatever) rather than clinical practice.

I agree completely. However, Biolink has ruled that it maps to biolink:applied_to_treat. Should we add an exception in our validator for it? Or reach out to Biolink?

ecwood avatar Jun 27 '24 18:06 ecwood

Per my meeting with Steve today, I should add an exception for SEMMEDDB:administered_to and SEMMEDDB:ADMINISTERED_TO. I did this in b1d7501.
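For readers following along, here is a rough sketch of how such an exception could be expressed. It is not the contents of b1d7501: the `MAPPING_EXCEPTIONS` table, the `validate_mapping` function, and the example value `'broad_mappings'` for `mapping_term_used` are all assumptions made for illustration. Only the assertion message format comes from the traceback earlier in this issue.

```python
# Hypothetical exception table: relation CURIE -> predicate we accept even
# though Biolink's own mapping points elsewhere.
MAPPING_EXCEPTIONS = {
    'SEMMEDDB:administered_to': 'biolink:treats_or_applied_or_studied_to_treat',
    'SEMMEDDB:ADMINISTERED_TO': 'biolink:treats_or_applied_or_studied_to_treat',
}


def validate_mapping(relation, mapped_predicate,
                     allowed_biolink_curies_set, mapping_term_used):
    """Raise AssertionError unless the mapping is allowed or explicitly excepted."""
    if MAPPING_EXCEPTIONS.get(relation) == mapped_predicate:
        return  # deliberate deviation from Biolink's mapping
    # Message format copied from the CI traceback.
    assert mapped_predicate in allowed_biolink_curies_set, \
        f"{relation} should map to {allowed_biolink_curies_set} ({mapping_term_used.split('_')[0]})"


# The excepted relation passes even though Biolink maps it to applied_to_treat.
validate_mapping('SEMMEDDB:ADMINISTERED_TO',
                 'biolink:treats_or_applied_or_studied_to_treat',
                 {'biolink:applied_to_treat'},
                 'broad_mappings')
```

Listing the exceptions in one table keeps each deviation from Biolink visible and easy to remove if the upstream mapping changes.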

ecwood avatar Jun 28 '24 03:06 ecwood