Update to Biolink v3.5.0

Open ecwood opened this issue 2 years ago • 12 comments

Based on https://github.com/RTXteam/RTX-KG2/issues/301#issuecomment-1610336171, we need to update to Biolink v3.5.0. Currently, we are on v3.1.2. Since we are so out of date, I am creating this issue to document some past changes that might be sticky for us.

For example, in Biolink v3.2.7, a lot changed with KEGG IRIs.

Unfortunately, Biolink v3.5.0 doesn't exist yet to start working with.

ecwood avatar Jun 27 '23 23:06 ecwood

From @saramsey:

Oh yes, so on a related note, but motivated by a different issue, we will need to start developing a file of Biolink predicates that have been deprecated since the Biolink version that xDTD is currently based on (will need to ask Chunyu), and to curate which Biolink predicates (and qualifiers, if necessary) those map to. That will require some creativity.
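As a rough illustration of what such a file could hold once loaded, here is a minimal Python sketch. The dictionary layout and the `remap_predicate` helper are hypothetical, and the single entry is only an example that would need to be checked against the target Biolink release before being relied on.

```python
# Hypothetical layout: deprecated predicate -> replacement predicate plus qualifiers.
# The entry below is illustrative only and must be curated against the Biolink
# version actually in use.
DEPRECATED_PREDICATE_MAP = {
    "biolink:increases_expression_of": {
        "predicate": "biolink:affects",
        "qualifiers": {
            "object_aspect_qualifier": "expression",
            "object_direction_qualifier": "increased",
        },
    },
}


def remap_predicate(predicate, mapping=DEPRECATED_PREDICATE_MAP):
    """Return (predicate, qualifiers), replacing the predicate if it is deprecated."""
    entry = mapping.get(predicate)
    if entry is None:
        # Not in the deprecated list: keep the predicate and add no qualifiers.
        return predicate, {}
    # Copy the qualifiers so callers cannot mutate the shared mapping.
    return entry["predicate"], dict(entry["qualifiers"])


new_predicate, qualifiers = remap_predicate("biolink:increases_expression_of")
print(new_predicate, qualifiers)
```

Keeping the qualifiers next to the replacement predicate means one lookup returns everything needed to rewrite an edge.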

ecwood avatar Jun 28 '23 21:06 ecwood

v3.2.0: I don't see anything we have to change.

v3.2.1: There's a change with KGX, but I don't think this impacts us.

v3.2.2:

  • UniProt PURLs

v3.2.3: This doesn't impact us.

v3.2.4: This shouldn't matter to us (slot_usage information about retrieval source). Should we be using retrieval source as the category for our upstream sources, rather than information content entity? What is the difference between this and information resource?

v3.2.5: This doesn't impact us.

v3.2.6: This doesn't impact us.

v3.2.7:

  • We might care about the fact that in_taxon_label is now a node property.
  • The KEGG prefixes and OBO prefixes changed

v3.2.8:

  • Change to RO ID of colocalizes with
  • Change to domain/range of actively involved in, actively involves, inputs, and outputs
  • Change to RO mapping of results in movement of

v3.3.0:

  • Changes to SemMedDB mappings
  • Change to location of genetic association

v3.3.1:

  • Changes to domain/range of catalyzes

v3.3.2: This does not impact us.

v3.3.3: This shouldn't impact us. It looks like internal checking of the biolink model, but if they found any bad URLs, we might have to change ours.

v3.3.4:

  • Change to infores storage format (TSV to YAML) - this might change validation?

v3.4.0:

  • Modified gene to disease associations (now there's causative and correlated)

v3.4.1: This release doesn't impact us. In fact, it looks like within the model, it went from 3.4.0 to 3.4.2 (https://github.com/biolink/biolink-model/pull/1327/files#diff-e84866df1772b9b92474ba97eeacbf344265d3285db33b3459f4e794d1de24c5L28-R28)

v3.4.2:

  • GeneToPhenotypeAssociation to GeneToPhenotypicFeatureAssociation

v3.4.3:

  • in taxon label is now a slot on thing with taxon - I'm not sure if this impacts us, but it reminded me of v3.2.7

ecwood avatar Jun 29 '23 01:06 ecwood

@ecwood Thank you! Any thoughts about the 3.3.X and 3.4.X releases?

saramsey avatar Jun 29 '23 16:06 saramsey

> @ecwood Thank you! Any thoughts about the 3.3.X and 3.4.X releases?

@saramsey Yes! I updated the comment with information about the other releases.

ecwood avatar Jun 29 '23 20:06 ecwood

Thanks so much. I used the master branch of biolink-model.yaml to guide the changes that I made to predicate-remap.yaml for #305. Hopefully that helps move us toward 3.5.0 compliance.

saramsey avatar Jun 30 '23 20:06 saramsey

I expected that we'd need to change the UniProt URL. However, run-validation-tests.sh on the master branch is not failing.

@saramsey Do you know why it (specifically validate_curies_to_urls_map_yaml.py) wouldn't fail despite these lines being different?

https://github.com/RTXteam/RTX-KG2/blob/9aa895416f9ff7abbae5fb699869244ea908a2c0/curies-to-urls-map.yaml#L716-L717

https://github.com/biolink/biolink-model/blob/5b9c2834e6ae548f65f1819fb09e390e7aa3f307/biolink-model.yaml#L143:

  UniProtKB: 'http://purl.uniprot.org/uniprot/'

There's a similar situation with KEGG:

https://github.com/RTXteam/RTX-KG2/blob/9aa895416f9ff7abbae5fb699869244ea908a2c0/curies-to-urls-map.yaml#L246-L257

https://github.com/biolink/biolink-model/blob/5b9c2834e6ae548f65f1819fb09e390e7aa3f307/biolink-model.yaml#L87-L91:

  KEGG.BRITE: 'https://bioregistry.io/kegg.brite:'
  KEGG: 'http://www.kegg.jp/entry/'
  KEGG.GENES: 'https://bioregistry.io/kegg.genes:bsu:'
  KEGG.PATHWAY: 'https://bioregistry.io/kegg.pathway:'
  KEGG.RCLASS: 'https://www.genome.jp/dbget-bin/www_bget?rc:'

ecwood avatar Jun 30 '23 20:06 ecwood

Strange that the validation tests don't mind the differences in the URLs. I don't know why that would be. Just to remind myself because I've been out so much, in addition to the changes above to catch up to the current biolink model, we are also addressing issue #281, adding a domain_range_exclusion boolean type property to edges.

@ecwood , do you have a preference on where I get started on the changes?

acevedol avatar Jul 05 '23 16:07 acevedol

@acevedol Some of the changes have already been made (#306), and all of that has been done in kg2.8.4-prep, so let's stick with that.

> we are also addressing issue https://github.com/RTXteam/RTX-KG2/issues/281, adding a domain_range_exclusion boolean type property to edges

Yes, but unfortunately, we don't have the schema for how we will get this data yet, so it's not much use to start the code for it yet.

ecwood avatar Jul 05 '23 16:07 ecwood

While testing out bc079cf (to ensure the validation scripts are using the correct version of biolink), we got this error:

Reading ontology JSON file: /home/ubuntu/kg2-build/biolink-model.owl.json; size: 2213.82 KiB
Traceback (most recent call last):
  File "/home/ubuntu/kg2-code/validate_kg2_util_curies_urls_categories.py", line 62, in <module>
    assert category_curie in biolink_categories_ontology_depths, category_curie
AssertionError: biolink:InformationResource

It looks like information resource was removed in Biolink v3.3.4. See https://github.com/biolink/biolink-model/commit/c24d4433b82ab83372824f31e5fd543d670fa237 and https://github.com/biolink/biolink-model/commit/e2207e18ee8e5a0d7fa096977858f2d22cdb3577.
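The assertion above stops at the first missing category. A minimal sketch of a check that reports every missing category in one pass might look like the following. The `find_unknown_categories` helper is hypothetical, and the depth values and category list are made-up placeholders rather than data from the real build.

```python
def find_unknown_categories(category_curies, known_categories):
    """Return the sorted category CURIEs that are absent from known_categories."""
    return sorted(set(category_curies) - set(known_categories))


# Placeholder stand-in for the dictionary built from biolink-model.owl.json;
# the depth values here are invented for the example.
biolink_categories_ontology_depths = {
    "biolink:NamedThing": 0,
    "biolink:InformationContentEntity": 1,
}

# Placeholder list of categories referenced by our own code.
used_categories = [
    "biolink:InformationContentEntity",
    "biolink:InformationResource",
]

unknown = find_unknown_categories(used_categories, biolink_categories_ontology_depths)
if unknown:
    print("Categories missing from the Biolink model:", ", ".join(unknown))
```

Listing every missing category at once would avoid a fix, rerun, fail cycle if more than one category was removed between Biolink versions.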

ecwood avatar Jul 05 '23 19:07 ecwood

With 8a63bbc, the validator is not failing anymore. That being said, there are still hundreds (possibly thousands) of errors like this:

2023-07-05 20:34:47,917 ERROR (OWLAnnotationPropertyTransformer:98) Attempt to transform an axiom to correct misuse of properties failed. Property replacement: {rdfs:label=rdfs:label, <https://w3id.org/biolink/vocab/opposite_of>=<https://w3id.org/biolink/vocab/opposite_of>}, axiom: AnnotationAssertion(<https://w3id.org/biolink/vocab/opposite_of> <https://w3id.org/biolink/vocab/enabled_by> "prevented by"), error: class uk.ac.manchester.cs.owl.owlapi.OWLLiteralImplNoCompression cannot be cast to class org.semanticweb.owlapi.model.IRI (uk.ac.manchester.cs.owl.owlapi.OWLLiteralImplNoCompression and org.semanticweb.owlapi.model.IRI are in unnamed module of loader 'app')
2023-07-05 20:34:47,920 ERROR (OWLAnnotationPropertyTransformer:98) Attempt to transform an axiom to correct misuse of properties failed. Property replacement: {rdfs:label=rdfs:label, <https://w3id.org/biolink/vocab/opposite_of>=<https://w3id.org/biolink/vocab/opposite_of>}, axiom: AnnotationAssertion(<https://w3id.org/biolink/vocab/opposite_of> <https://w3id.org/biolink/vocab/prevents> "predisposes"), error: class uk.ac.manchester.cs.owl.owlapi.OWLLiteralImplNoCompression cannot be cast to class org.semanticweb.owlapi.model.IRI (uk.ac.manchester.cs.owl.owlapi.OWLLiteralImplNoCompression and org.semanticweb.owlapi.model.IRI are in unnamed module of loader 'app')
2023-07-05 20:34:47,922 ERROR (OWLAnnotationPropertyTransformer:98) Attempt to transform an axiom to correct misuse of properties failed. Property replacement: {rdfs:label=rdfs:label, <https://w3id.org/biolink/vocab/opposite_of>=<https://w3id.org/biolink/vocab/opposite_of>}, axiom: AnnotationAssertion(<https://w3id.org/biolink/vocab/opposite_of> <https://w3id.org/biolink/vocab/decreases_response_to> "increases response to"), error: class uk.ac.manchester.cs.owl.owlapi.OWLLiteralImplNoCompression cannot be cast to class org.semanticweb.owlapi.model.IRI (uk.ac.manchester.cs.owl.owlapi.OWLLiteralImplNoCompression and org.semanticweb.owlapi.model.IRI are in unnamed module of loader 'app')
2023-07-05 20:34:47,923 ERROR (OWLAnnotationPropertyTransformer:98) Attempt to transform an axiom to correct misuse of properties failed. Property replacement: {rdfs:label=rdfs:label, <https://w3id.org/biolink/vocab/opposite_of>=<https://w3id.org/biolink/vocab/opposite_of>}, axiom: AnnotationAssertion(<https://w3id.org/biolink/vocab/opposite_of> <https://w3id.org/biolink/vocab/broad_match> "narrow match"), error: class uk.ac.manchester.cs.owl.owlapi.OWLLiteralImplNoCompression cannot be cast to class org.semanticweb.owlapi.model.IRI (uk.ac.manchester.cs.owl.owlapi.OWLLiteralImplNoCompression and org.semanticweb.owlapi.model.IRI are in unnamed module of loader 'app')
2023-07-05 20:34:47,923 ERROR (OWLAnnotationPropertyTransformer:98) Attempt to transform an axiom to correct misuse of properties failed. Property replacement: {rdfs:label=rdfs:label, <https://w3id.org/biolink/vocab/opposite_of>=<https://w3id.org/biolink/vocab/opposite_of>}, axiom: AnnotationAssertion(<https://w3id.org/biolink/vocab/opposite_of> <https://w3id.org/biolink/vocab/enables> "prevents"), error: class uk.ac.manchester.cs.owl.owlapi.OWLLiteralImplNoCompression cannot be cast to class org.semanticweb.owlapi.model.IRI (uk.ac.manchester.cs.owl.owlapi.OWLLiteralImplNoCompression and org.semanticweb.owlapi.model.IRI are in unnamed module of loader 'app')
2023-07-05 20:34:47,924 ERROR (OWLAnnotationPropertyTransformer:98) Attempt to transform an axiom to correct misuse of properties failed. Property replacement: {rdfs:label=rdfs:label, <https://w3id.org/biolink/vocab/opposite_of>=<https://w3id.org/biolink/vocab/opposite_of>}, axiom: AnnotationAssertion(<https://w3id.org/biolink/vocab/opposite_of> <https://w3id.org/biolink/vocab/has_increased_amount> "has decreased amount"), error: class uk.ac.manchester.cs.owl.owlapi.OWLLiteralImplNoCompression cannot be cast to class org.semanticweb.owlapi.model.IRI (uk.ac.manchester.cs.owl.owlapi.OWLLiteralImplNoCompression and org.semanticweb.owlapi.model.IRI are in unnamed module of loader 'app')
2023-07-05 20:34:47,924 ERROR (OWLAnnotationPropertyTransformer:98) Attempt to transform an axiom to correct misuse of properties failed. Property replacement: {rdfs:label=rdfs:label, <https://w3id.org/biolink/vocab/opposite_of>=<https://w3id.org/biolink/vocab/opposite_of>}, axiom: AnnotationAssertion(<https://w3id.org/biolink/vocab/opposite_of> <https://w3id.org/biolink/vocab/has_output> "has input"), error: class uk.ac.manchester.cs.owl.owlapi.OWLLiteralImplNoCompression cannot be cast to class org.semanticweb.owlapi.model.IRI (uk.ac.manchester.cs.owl.owlapi.OWLLiteralImplNoCompression and org.semanticweb.owlapi.model.IRI are in unnamed module of loader 'app')
2023-07-05 20:34:47,924 ERROR (OWLAnnotationPropertyTransformer:98) Attempt to transform an axiom to correct misuse of properties failed. Property replacement: {rdfs:label=rdfs:label, <https://w3id.org/biolink/vocab/opposite_of>=<https://w3id.org/biolink/vocab/opposite_of>}, axiom: AnnotationAssertion(<https://w3id.org/biolink/vocab/opposite_of> <https://w3id.org/biolink/vocab/predisposes> "prevents"), error: class uk.ac.manchester.cs.owl.owlapi.OWLLiteralImplNoCompression cannot be cast to class org.semanticweb.owlapi.model.IRI (uk.ac.manchester.cs.owl.owlapi.OWLLiteralImplNoCompression and org.semanticweb.owlapi.model.IRI are in unnamed module of loader 'app')
2023-07-05 20:34:47,925 ERROR (OWLAnnotationPropertyTransformer:98) Attempt to transform an axiom to correct misuse of properties failed. Property replacement: {rdfs:label=rdfs:label, <https://w3id.org/biolink/vocab/opposite_of>=<https://w3id.org/biolink/vocab/opposite_of>}, axiom: AnnotationAssertion(<https://w3id.org/biolink/vocab/opposite_of> <https://w3id.org/biolink/vocab/increases_response_to> "decreases response to"), error: class uk.ac.manchester.cs.owl.owlapi.OWLLiteralImplNoCompression cannot be cast to class org.semanticweb.owlapi.model.IRI (uk.ac.manchester.cs.owl.owlapi.OWLLiteralImplNoCompression and org.semanticweb.owlapi.model.IRI are in unnamed module of loader 'app')
2023-07-05 20:34:47,925 ERROR (OWLAnnotationPropertyTransformer:98) Attempt to transform an axiom to correct misuse of properties failed. Property replacement: {rdfs:label=rdfs:label, <https://w3id.org/biolink/vocab/opposite_of>=<https://w3id.org/biolink/vocab/opposite_of>}, axiom: AnnotationAssertion(<https://w3id.org/biolink/vocab/opposite_of> <https://w3id.org/biolink/vocab/contraindicated_for> "treats"), error: class uk.ac.manchester.cs.owl.owlapi.OWLLiteralImplNoCompression cannot be cast to class org.semanticweb.owlapi.model.IRI (uk.ac.manchester.cs.owl.owlapi.OWLLiteralImplNoCompression and org.semanticweb.owlapi.model.IRI are in unnamed module of loader 'app')
2023-07-05 20:34:47,926 ERROR (OWLAnnotationPropertyTransformer:98) Attempt to transform an axiom to correct misuse of properties failed. Property replacement: {rdfs:label=rdfs:label, <https://w3id.org/biolink/vocab/opposite_of>=<https://w3id.org/biolink/vocab/opposite_of>}, axiom: AnnotationAssertion(<https://w3id.org/biolink/vocab/opposite_of> <https://w3id.org/biolink/vocab/has_input> "has output"), error: class uk.ac.manchester.cs.owl.owlapi.OWLLiteralImplNoCompression cannot be cast to class org.semanticweb.owlapi.model.IRI (uk.ac.manchester.cs.owl.owlapi.OWLLiteralImplNoCompression and org.semanticweb.owlapi.model.IRI are in unnamed module of loader 'app')
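Since there are so many of these, it may help to summarize them rather than read them one by one. Below is a minimal Python sketch that counts the errors per annotation property and lists the affected terms. The regular expression is an assumption based only on the lines quoted above, and the two sample lines are abbreviated copies of them (the `...` marks text cut for brevity).

```python
import re
from collections import Counter

# Matches: AnnotationAssertion(<property IRI> <subject IRI> "literal")
AXIOM_RE = re.compile(
    r'AnnotationAssertion\(<(?P<prop>[^>]+)> <(?P<subject>[^>]+)> "(?P<literal>[^"]*)"\)'
)


def summarize_axiom_errors(log_lines):
    """Count failed axioms per annotation property and collect (subject, literal) pairs."""
    counts = Counter()
    pairs = []
    for line in log_lines:
        if "OWLAnnotationPropertyTransformer" not in line:
            continue
        match = AXIOM_RE.search(line)
        if match is None:
            continue
        # Keep only the last path segment of each IRI for readability.
        prop = match.group("prop").rsplit("/", 1)[-1]
        subject = match.group("subject").rsplit("/", 1)[-1]
        counts[prop] += 1
        pairs.append((subject, match.group("literal")))
    return counts, pairs


# Abbreviated copies of two of the log lines above.
sample_lines = [
    '2023-07-05 20:34:47,917 ERROR (OWLAnnotationPropertyTransformer:98) ... axiom: '
    'AnnotationAssertion(<https://w3id.org/biolink/vocab/opposite_of> '
    '<https://w3id.org/biolink/vocab/enabled_by> "prevented by"), error: ...',
    '2023-07-05 20:34:47,920 ERROR (OWLAnnotationPropertyTransformer:98) ... axiom: '
    'AnnotationAssertion(<https://w3id.org/biolink/vocab/opposite_of> '
    '<https://w3id.org/biolink/vocab/prevents> "predisposes"), error: ...',
]

counts, pairs = summarize_axiom_errors(sample_lines)
for prop, count in counts.items():
    print(f"{prop}: {count} failed axioms")
```

In every line quoted above, the annotation property is opposite_of and its value is a string literal, which is what the cast-to-IRI message complains about. A summary over the full log would show whether any other property is affected.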

ecwood avatar Jul 05 '23 20:07 ecwood

Per @saramsey, validate_curies_to_urls_map_yaml.py is validating CURIEs only, not the URLs. In an enhancement, we should have the script compare the URLs to what is in Biolink.
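A minimal sketch of what that comparison could look like, assuming both files have already been loaded and flattened into plain prefix-to-URL dictionaries (the loading step and the `compare_prefix_urls` name are hypothetical, not part of the existing script):

```python
def compare_prefix_urls(local_map, biolink_map):
    """Return {prefix: (local_url, biolink_url)} for shared prefixes whose URLs differ."""
    mismatches = {}
    for prefix, local_url in local_map.items():
        biolink_url = biolink_map.get(prefix)
        # Prefixes that Biolink does not define are skipped rather than flagged.
        if biolink_url is not None and biolink_url != local_url:
            mismatches[prefix] = (local_url, biolink_url)
    return mismatches


# Biolink values are copied from the snippets quoted earlier in this thread.
biolink_prefixes = {
    "UniProtKB": "http://purl.uniprot.org/uniprot/",
    "KEGG": "http://www.kegg.jp/entry/",
}

# Made-up placeholder values, NOT the real contents of curies-to-urls-map.yaml.
local_prefixes = {
    "UniProtKB": "https://example.org/uniprot/",
    "KEGG": "http://www.kegg.jp/entry/",
    "EXAMPLE": "https://example.org/only-local/",
}

mismatches = compare_prefix_urls(local_prefixes, biolink_prefixes)
for prefix, (local_url, biolink_url) in sorted(mismatches.items()):
    print(f"{prefix}: local={local_url} biolink={biolink_url}")
```

Whether a mismatch should fail validation or only produce a warning is a separate decision, since some local URLs may differ from Biolink on purpose.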

ecwood avatar Jul 13 '23 16:07 ecwood

Now that #319 and #320 have been finished, tested, and are passing, do we need to do anything else to verify that we are Biolink 3.5.0 compliant? Do we need to do #306 to verify compliance?

ecwood avatar Jul 13 '23 22:07 ecwood