neat-ml
In link prediction, filter nodes by prefix or other slots
Some graphs contain nodes we would like to filter on, but their Biolink categories don't distinguish them clearly:
PR:000002977 biolink:NamedThing Graph owl:Class
So we would like to specify a filter for prefix rather than category.
This can be based on a flag used in the link_node_types: block in the config.
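To make the prefix idea concrete, here is a minimal sketch in plain Python. It assumes node IDs are CURIEs of the form PREFIX:LOCAL_ID and that nodes are held as dicts with an "id" key. The helper names (curie_prefix, filter_nodes_by_prefix) are hypothetical and not part of neat-ml.

```python
def curie_prefix(node_id: str) -> str:
    """Return the text before the first colon, e.g. 'PR' for 'PR:000002977'.

    An ID without a colon is returned unchanged.
    """
    return node_id.split(":", 1)[0]


def filter_nodes_by_prefix(nodes, prefixes):
    """Keep only nodes whose ID prefix is in `prefixes` (hypothetical helper)."""
    wanted = set(prefixes)
    return [node for node in nodes if curie_prefix(node["id"]) in wanted]


# Example rows taken from this issue; both share the same Biolink category,
# so only the prefix tells them apart.
nodes = [
    {"id": "PR:000002977", "category": "biolink:NamedThing"},
    {"id": "XPO:0134172", "category": "biolink:NamedThing"},
]

pr_nodes = filter_nodes_by_prefix(nodes, ["PR"])
```

A config flag would then only need to carry the list of prefixes to keep.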
Similarly, it would be nice to be able to filter by other node slots/properties:
XPO:0134172 biolink:NamedThing increased apoptosis in simple columnar epithelium An increased occurrence of apoptotic process in simple columnar epithelium. Graph
This could be as simple as a regex for a string value in a named column, e.g., match everything with the string "apoptosis"
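A rough sketch of the regex idea, using only the standard library. It assumes a tab-separated nodelist with a header row; the column names used here (id, category, name) are an assumption about the file layout, and filter_rows_by_regex is a hypothetical helper, not an existing neat-ml function.

```python
import csv
import io
import re


def filter_rows_by_regex(tsv_text, column, pattern):
    """Return rows whose value in `column` matches `pattern` anywhere in the string."""
    regex = re.compile(pattern)
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    # Missing or empty values are treated as empty strings.
    return [row for row in reader if regex.search(row.get(column) or "")]


# Assumed layout: header row plus the two example nodes from this issue.
NODES_TSV = (
    "id\tcategory\tname\n"
    "PR:000002977\tbiolink:NamedThing\t\n"
    "XPO:0134172\tbiolink:NamedThing\t"
    "increased apoptosis in simple columnar epithelium\n"
)

matches = filter_rows_by_regex(NODES_TSV, "name", r"apoptosis")
```

The same function could cover the prefix case by matching a pattern such as ^PR: against the id column, so one config option might serve both filters.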
@LucaCappelletti94 you may have already solved this problem in terms of filtering graph nodelists by CURIE prefix and mapping it to a namespace
In ensmallen it is possible to filter by the prefix, but I do not know what you mean by mapping it to a namespace.
Same thing as far as we're concerned - namespace == prefix, at least as far as node IDs go.
Ok, then graph.filter_from_names(...) has all of the kwargs you may desire for this sort of goal. It should be available in the latest nightly build if I am not mistaken (0.7.0.dev20).