OPTIMADE
No way to use IS KNOWN/IS UNKNOWN in list comparisons
While thinking about boolean representation in queries in #345, I noticed that the filter language has no provision for using IS KNOWN/IS UNKNOWN in list comparisons. I think the filter grammar could be extended to support HAS IS KNOWN-type queries.
The immediate thought is to introduce a NULL constant token alongside the TRUE and FALSE proposed in #345.
However, since we dropped the use of null in coordinates to represent unknown coordinates, as far as I know, the only standardized use of null in lists is in lattice_vectors. It may be a bit late to do this now, but we could consider dropping null inside lists as a concept in OPTIMADE, and only allow fields in their entirety to be either KNOWN or UNKNOWN. That would further simplify our data model, and explain why we wouldn't need a NULL constant. But I say that as someone who generally dislikes the idea of embedding nulls in data of other types.
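To make the distinction concrete, here is a minimal Python sketch of the two levels at which a value can be missing: the whole field (what IS KNOWN/IS UNKNOWN tests today) versus individual elements inside a list (what the filter language cannot currently express). The entry contents and the helper names `is_known` and `contains_null` are made up for illustration; they are not part of OPTIMADE or of any implementation.

```python
from typing import Any

# Hypothetical entry: the third lattice vector is given as nulls,
# while the "species" field is missing as a whole.
entry = {
    "lattice_vectors": [[4.0, 0.0, 0.0], [0.0, 4.0, 0.0], [None, None, None]],
    "species": None,
}


def is_known(entry: dict, field: str) -> bool:
    """Whole-field check, analogous to `field IS KNOWN`."""
    return entry.get(field) is not None


def contains_null(value: Any) -> bool:
    """Element-level check: True if a (possibly nested) list holds a None."""
    if value is None:
        return True
    if isinstance(value, list):
        return any(contains_null(item) for item in value)
    return False
```

With this data, `is_known(entry, "lattice_vectors")` is true even though `contains_null(entry["lattice_vectors"])` is also true, which is exactly the case a HAS IS KNOWN-type query or a NULL constant would need to address.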
Introducing a NULL constant seems quite natural. Many programming languages and SQL have one, so maybe we could complement IS KNOWN/IS UNKNOWN with a syntactically simpler and more powerful NULL constant? Of course, we would have to define what the < and > operators mean for it, just like with the boolean constants in #345.
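As one possible answer to what < should mean, the sketch below follows the SQL convention, where any comparison involving NULL yields "unknown" and a filter only keeps rows for which the result is definitely true. This is only an illustration of that one option, not a proposal agreed on in this thread; the names `less_than` and `matches` are hypothetical.

```python
from typing import Optional


def less_than(a: Optional[float], b: Optional[float]) -> Optional[bool]:
    """SQL-style `<`: a comparison involving None yields None (unknown)."""
    if a is None or b is None:
        return None
    return a < b


def matches(result: Optional[bool]) -> bool:
    """A filter keeps an entry only when the comparison is definitely true."""
    return result is True
```

Under this convention neither `x < NULL` nor `x > NULL` ever matches, so a NULL constant would mostly be useful with equality-like or HAS-style operators.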
However, since we dropped the use of null in coordinates to represent unknown coordinates, as far as I know, the only standardized use of null in lists is in lattice_vectors. It may be a bit late to do this now, but we could consider dropping null inside lists as a concept in OPTIMADE, and only allow fields in their entirety to be either KNOWN or UNKNOWN. That would further simplify our data model, and explain why we wouldn't need a NULL constant. But I say that as someone who generally dislikes the idea of embedding nulls in data of other types.
I do not like the idea of restricting the data model in this way, as I do not think all uses of NULL values in lists can be avoided. For instance, we are discussing including CIF data in OPTIMADE, and the CIF standard has two NULL-like concepts: an unknown value and an inapplicable value. I do not think we can avoid either of these concepts, and certainly not both.
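For reference, CIF writes an unknown value as `?` and an inapplicable value as `.`. A minimal Python sketch of how the two could be kept apart when reading a list of numeric values is shown below. It is deliberately simplified (it ignores quoting and standard uncertainties such as `1.234(5)`), and the names `Missing` and `parse_cif_number` are hypothetical, not taken from any CIF library.

```python
from enum import Enum
from typing import Union


class Missing(Enum):
    UNKNOWN = "?"       # CIF: the value is not known
    INAPPLICABLE = "."  # CIF: the value does not apply


def parse_cif_number(token: str) -> Union[float, Missing]:
    """Map a bare CIF token to a float or to one of the two missing markers."""
    if token == "?":
        return Missing.UNKNOWN
    if token == ".":
        return Missing.INAPPLICABLE
    return float(token)


# A list in which a regular value and both NULL-like values appear together.
values = [parse_cif_number(t) for t in ["0.25", "?", "."]]
```

A single null inside an OPTIMADE list could represent at most one of these two markers, which is why dropping nulls from lists altogether would lose information for CIF-derived data.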