OPTIMADE
Extended filtering on relationships
Currently, we expect relationship filtering to be a two-step process:
Note: formulating queries on relationships with entries that have specific property values is a multi-step process. For example, to find all structures with bibliographic references where one of the authors has the last name "Schmit" is performed by the following two steps:
- Query the references endpoint with a filter authors.lastname HAS "Schmit" and store the id values of the returned entries.
- Query the structures endpoint with a filter references.id HAS ANY <list-of-IDs>, where <list-of-IDs> are the IDs retrieved from the first query separated by commas.
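For reference, the two steps quoted above could be sketched client-side roughly as follows. This is only an illustration: the base URL and the helper names (`references_query`, `extract_ids`, `structures_query`) are hypothetical, and the sketch only builds the request URLs and reads IDs from an already-decoded JSON response rather than performing any network calls.

```python
from urllib.parse import urlencode

# Hypothetical base URL, used only for illustration.
BASE_URL = "https://example.org/optimade/v1"


def references_query(last_name: str) -> str:
    """Step 1: URL that asks the references endpoint for matching authors."""
    filter_ = f'authors.lastname HAS "{last_name}"'
    return f"{BASE_URL}/references?" + urlencode({"filter": filter_})


def extract_ids(response: dict) -> list[str]:
    """Collect the id values from the "data" list of a decoded JSON response."""
    return [entry["id"] for entry in response.get("data", [])]


def structures_query(reference_ids: list[str]) -> str:
    """Step 2: URL that asks for structures related to any of the given IDs."""
    id_list = ", ".join(f'"{ref_id}"' for ref_id in reference_ids)
    filter_ = f"references.id HAS ANY {id_list}"
    return f"{BASE_URL}/structures?" + urlencode({"filter": filter_})
```

A client would fetch the first URL, pass the decoded response to `extract_ids`, and then fetch the URL returned by `structures_query`.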
In my implementation, I would like to support doing this in one step, e.g., via /structures?filter=references.doi = "10.1234/12345".
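To make the idea concrete, here is a minimal sketch of how a server might evaluate such a one-step filter internally, by doing the two steps itself. It assumes in-memory lists of entries with an invented layout (`REFERENCES`, `STRUCTURES`, and the helper `filter_by_related_property` are all hypothetical); a real implementation would translate the filter into a database join or subquery instead.

```python
# Toy in-memory data; the layout is an assumption made for this sketch.
REFERENCES = [
    {"id": "ref1", "doi": "10.1234/12345"},
    {"id": "ref2", "doi": "10.5678/67890"},
]
STRUCTURES = [
    {"id": "s1", "references": ["ref1"]},
    {"id": "s2", "references": ["ref2"]},
    {"id": "s3", "references": []},
]


def filter_by_related_property(entries, related_entries, relationship, prop, value):
    """Return entries related to at least one entry whose `prop` equals `value`."""
    # Internal step 1: resolve the property filter to IDs of related entries.
    matching_ids = {r["id"] for r in related_entries if r.get(prop) == value}
    # Internal step 2: keep entries whose relationship list hits one of those IDs.
    return [e for e in entries if matching_ids.intersection(e.get(relationship, []))]


# Equivalent of /structures?filter=references.doi = "10.1234/12345"
matches = filter_by_related_property(
    STRUCTURES, REFERENCES, "references", "doi", "10.1234/12345"
)
```

Only equality is handled here; other comparison operators would need the same treatment.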
This seems like a trivial extension to the specification (implementations MAY support relationship filtering via...). The only conflict would be the special description field we added for relationship filtering. I have not seen anyone using this, but we can maintain compatibility by reserving it as a keyword and never using description as an attribute name for any entry type (so that references.description always refers unambiguously to the relationship description).
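As a rough sketch of what reserving the keyword would mean for a filter parser: a dotted property is split into the relationship name and the field, and a small reserved set decides whether the field refers to the relationship itself or to an attribute of the related entry. The names `RESERVED_RELATIONSHIP_FIELDS` and `classify_filter_property` are hypothetical and not taken from any existing implementation.

```python
# Fields that would always refer to the relationship itself under this proposal.
RESERVED_RELATIONSHIP_FIELDS = {"id", "description"}


def classify_filter_property(path: str) -> tuple[str, str, str]:
    """Split "relationship.field" and say which kind of lookup it implies."""
    relationship, sep, field = path.partition(".")
    if not sep or not relationship or not field:
        raise ValueError(f"expected '<relationship>.<field>', got {path!r}")
    if field in RESERVED_RELATIONSHIP_FIELDS:
        # e.g. references.id or references.description
        return ("relationship", relationship, field)
    # Any other field is read as an attribute of the related entry.
    return ("related_entry", relationship, field)
```

With this rule, `references.description` is classified as a relationship lookup and `references.doi` as a related-entry lookup, which is exactly why an entry type could then never expose its own description attribute to such filters.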
Am I missing some technical reason why we can't allow this syntax as an optional feature, or was the concern that only the two-step process can be handled robustly across different implementations?
I think the idea was to limit querying to the information that is normally returned within the response. But I do not see why optional support beyond that could not be allowed. The exceptional handling of description might be problematic: it is too good a property name to forbid :smile:
This is made more relevant by the potential future use cases for /calculations. With our current approach, it would be impossible to find relationships with calculations that specifically calculate some property, whereas with this we could do e.g. /structures?filter=calculations._my_scan_band_gap<0.5
This issue has resurfaced today in the workshop in the same context of filtering structures on the results of their calculations. Thus I think it deserves to have its severity bumped.
With the advent of the /files endpoint, description is now a property of the files entry type. Thus we cannot forbid description as a property name anymore. I guess for OPTIMADE v1.x we can retain this special handling of meta.description of relationships, with minimal loss. In any case, files.description is supposed to be a human-readable string, most likely not that useful in queries.
... or we can attempt dropping the special provision for meta.description as something that was rarely used. I can draft a PR for the extended filtering. Let us collect the opinions about the meta.description issue there.