is-04
Query language enhancements targeting minimal clients
I'm noting a few items here which I've previously discussed with @garethsb-sony just so they're noted down somewhere more public. These are just ideas at present and have not yet been tested out.
Paging limit for WebSockets
At present, pagination is not allowed for Query API WebSocket subscriptions. By permitting the paging.limit query parameter, minimal clients could restrict the maximum message size a Query API could send. Pairing this with the existing max_update_rate_ms parameter would provide greater guarantees over message rates.
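As a rough illustration of the idea, the sketch below builds a subscription request body that carries paging.limit alongside max_update_rate_ms. The helper name build_subscription_request is hypothetical, and placing paging.limit inside params (and its value type) is an assumption about how such an extension might look, not something the current specification permits.

```python
import json


def build_subscription_request(resource_path, max_update_rate_ms, paging_limit=None):
    """Build a Query API subscription request body (hypothetical helper)."""
    params = {}
    if paging_limit is not None:
        # Assumption: the proposed extension would accept paging.limit as a
        # subscription query parameter; the value type used here is a guess.
        params["paging.limit"] = paging_limit
    return {
        "max_update_rate_ms": max_update_rate_ms,
        "resource_path": resource_path,
        "params": params,
        "persist": False,
        "secure": False,
    }


# A minimal client asking for at most 10 resources per message,
# no more often than every 100 ms.
body = build_subscription_request("/senders", 100, paging_limit=10)
print(json.dumps(body, indent=2))
```

With both values set, a client could estimate an upper bound on its receive buffer from the limit and the largest resource it expects to see.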
RQL queries for related records
When writing a client such as a connection manager you can easily filter a single resource type such as Senders. Finding the Flows which relate to these filtered Senders is much harder and typically requires the Flows to be addressed individually, or the entire Flow collection to be consumed. By adding a rel (related) RQL query parameter this could be assisted as follows:
/flows?query.rql=rel(senders,matches(transport,urn%3Ax-nmos%3Atransport%3Artp))
The query string above would return only the Flows where the related Senders match a particular query. This could be used via the REST API and via WebSocket subscriptions.
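For clients that assemble such queries programmatically, a small sketch of building the query string is shown here. The helpers rql_call and rql_value are hypothetical names introduced for illustration, and rel is the proposed operator rather than an existing one.

```python
from urllib.parse import quote


def rql_call(name, *args):
    """Format an RQL call-operator such as matches(a,b) (hypothetical helper)."""
    return f"{name}({','.join(args)})"


def rql_value(text):
    """Percent-encode a value so that characters like ':' become %3A."""
    return quote(text, safe="")


inner = rql_call("matches", "transport", rql_value("urn:x-nmos:transport:rtp"))
query = rql_call("rel", "senders", inner)  # 'rel' is the proposed operator
url = f"/flows?query.rql={query}"
print(url)
```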
Restarting WebSocket subscriptions after a disconnection
If a WebSocket connection is interrupted, the client must create a new WebSocket subscription and consume the initial 'sync' message containing all data matching its query. Whilst this works fine, an optimisation would be to make use of the Query API's paging cursors at subscription creation time. By doing this the Query API could be informed of the most recent change the client is aware of and only pass on changes from that point, avoiding the larger initial sync message.
Paging limit for WebSocket using paging.limit is implemented as an extension in the OSS nmos-cpp-registry and has been used successfully to enable Query WebSocket clients whose WebSocket engines don't support configuration of the maximum received message size.
RQL queries for related records using rel (which is minimally described in some of the RQL specs) seems like a good idea. Using the related resource type name as the relation reads well in most cases such as the example given, and could be nested (e.g. to get the Sources associated with Senders and Flows matching some criteria), but I wonder if we need to be explicit about the resource type and property that encode the relation?
E.g. in the example given it is the Senders' flow_id that encodes the relation. Can we make rel queries using the same relation from the other direction? How should Flow parents be requested? Can we request Flow children? How about general Flow ancestor/descendant queries?
Restarting WebSocket subscriptions after a disconnection using paging.since has been demonstrated to be highly beneficial, especially when restarting connections on huge Registries (10,000+ resources). After a brief disconnection, zero or one message may logically be enough to rejoin, whereas the current spec requires ~10 MB of data to be transmitted.
However, implementing this relies on the Client knowing the appropriate value, i.e. that each Query WebSocket message should have a value equivalent to an X-Paging-Until response header. This has proven a little difficult to specify, especially in interaction with the paging.limit extension also described above.
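To make the client-side bookkeeping concrete, here is a minimal sketch of a client remembering the latest cursor it has seen and offering it back when it resubscribes. The class name and the paging_until message field are hypothetical; how the cursor would actually be conveyed in each message is exactly the part that has been hard to specify.

```python
class ResumableSubscription:
    """Tracks the most recent paging cursor seen on a Query WebSocket (sketch)."""

    def __init__(self, resource_path):
        self.resource_path = resource_path
        self.since = None  # no cursor known until a message has been received

    def on_message(self, message):
        # Assumption: each message carries a value equivalent to the
        # X-Paging-Until response header in a field named 'paging_until'.
        cursor = message.get("paging_until")
        if cursor is not None:
            self.since = cursor

    def subscription_params(self):
        # On first connection there is no cursor, so a full sync is requested.
        if self.since is None:
            return {}
        return {"paging.since": self.since}


sub = ResumableSubscription("/flows")
first_params = sub.subscription_params()
sub.on_message({"paging_until": "1700000000:0", "grain": {}})
resume_params = sub.subscription_params()
print(first_params, resume_params)
```

After a disconnection the client would create its new subscription with resume_params, so the Query API only needs to send changes made after that cursor.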
RQL queries for related records
Copying in the thoughts we have had on rel syntax and semantics...
My goal is to not define the supported relations for each resource type out-of-band, but to define relations using the existing JSON property definitions.
Syntax
Basic syntax: rel(<relation>, <call-operator>)
Result: the same result type as the <call-operator>, i.e. usually bool.
- Forward references
  i.e. where <relation> is a <property> of the queried resource type that is equal to an id of a related resource type
  Example:
  senders?query.rql=and(or(eq(transport,urn%3Ax-nmos%3Atransport%3Artp),eq(transport,urn%3Ax-nmos%3Atransport%3Artp.mcast)),rel(flow_id,eq(format,urn%3Ax-nmos%3Aformat%3Avideo)))
- Backward references
  i.e. where <relation> is from a property of another resource type that is equal to an id of the resource type being queried
  Example:
  flows?query.rql=rel(senders%3Fflow_id,eq(transport,urn%3Ax-nmos%3Atransport%3Artp))
  - initially I proposed to represent the <relation> by the string <resource>?<property>, but ? must unfortunately be percent-encoded because it isn't directly allowed by the nchar production used in RQL
  - . would be confused with nested property syntax; : isn't allowed because it's used to distinguish the typed-value production; all of /, $ and @ also require percent-encoding; in fact the only punctuation chars besides . that are allowed unencoded are *+-_~ which all seem awkward
  - another alternative would be to use the RQL array production here, i.e. (<resource>, <property>), or simply a three-argument call-operator 'overload', i.e. rel(<relation-resource>, <relation-property>, <call-operator>)?
This so far only accounts for references via id properties. Maybe we want to support references via other identifier properties such as between Source clock_name and Node clocks.name. Another example where this could be very useful is in the relation between Sender or Receiver interface_bindings and Node interfaces.name.
However, this is not as simple as it seems since those identifiers are only unique within the same Node, and select a sub-object of the Node resource. An expression using clock_name deep within rel(device_id,rel(node_id,<call-operator>)) might be possible, but would require the <call-operator> to be able to compare the clock_name from the outer 'scope' with the clocks.name found in the inner 'scope'.
Therefore this is currently not supported in this proposal, and could be considered as a reason for defining the supported relations independently of the existing JSON property definitions instead.
Semantics
One way of describing how relations behave is by transforming them to sub-queries.
In general, the rel call-operator in a query like {resourceType}?query.rql=rel(<relation>,<call-operator>) may be transformed into a new query {relatedType}?query.rql=and(<related-property-call-operator>, <call-operator>), where the {relatedType} and the <related-property-call-operator> are determined from the <relation>.
Examples:
- Backward references
  In the query:
  flows?query.rql=rel(senders%3Fflow_id,eq(transport,urn%3Ax-nmos%3Atransport%3Artp))
  for each flows/{flowId}, the rel call-operator is effectively equivalent to evaluating a sub-query:
  senders?query.rql=and(eq(flow_id,{flowId}),eq(transport,urn%3Ax-nmos%3Atransport%3Artp))
  The result of that query is naturally an array which may contain zero or more senders. The result of the rel call-operator is false if that array is empty, true otherwise.
- Forward references
  Similarly in the query:
  senders?query.rql=rel(flow_id,eq(format,urn%3Ax-nmos%3Aformat%3Avideo))
  for each senders/{senderId}, with "flow_id": "{flowId}", the rel call-operator is effectively equivalent to evaluating a sub-query:
  flows?query.rql=and(eq(id,{flowId}),eq(format,urn%3Ax-nmos%3Aformat%3Avideo))
  The result of that query is naturally an array which may contain either exactly one flow or no flows. The result of the rel call-operator is false if the array is empty, true otherwise.
  For array-valued forward references, like Device senders or receivers (both deprecated), and Source or Flow parents, a sub-query would effectively be equivalent to the in operator, so queries like this:
  - devices?query.rql=rel(senders,<call-operator>)
  - sources?query.rql=rel(parents,<call-operator>)
  would involve constructing sub-queries like so:
  - senders?query.rql=and(in(id,({...senders})),<call-operator>)
  - sources?query.rql=and(in(id,({...parents})),<call-operator>)
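The sub-query semantics above can be sketched over in-memory collections, which may help when checking edge cases such as null or array-valued references. This is only an illustration under stated assumptions: the helper names eq, rel_forward and rel_backward are hypothetical, the related collection is passed in explicitly rather than derived from the JSON property definitions, and the sample resources are made up.

```python
def eq(prop, value):
    """Return a predicate equivalent to the RQL call-operator eq(prop,value)."""
    return lambda resource: resource.get(prop) == value


def rel_forward(resource, prop, related, predicate):
    """True if any related resource referenced by resource[prop] matches.

    A scalar property behaves like eq(id,...); an array-valued property
    behaves like in(id,(...)).
    """
    value = resource.get(prop)
    ids = value if isinstance(value, list) else [value]
    return any(r["id"] in ids and predicate(r) for r in related)


def rel_backward(resource, related, prop, predicate):
    """True if any related resource whose prop equals resource['id'] matches."""
    return any(r.get(prop) == resource["id"] and predicate(r) for r in related)


# Made-up sample data for illustration only.
flows = [
    {"id": "f1", "format": "urn:x-nmos:format:video"},
    {"id": "f2", "format": "urn:x-nmos:format:audio"},
]
senders = [
    {"id": "s1", "flow_id": "f1", "transport": "urn:x-nmos:transport:rtp"},
    {"id": "s2", "flow_id": "f2", "transport": "urn:x-nmos:transport:rtp.mcast"},
    {"id": "s3", "flow_id": None, "transport": "urn:x-nmos:transport:rtp"},
]
devices = [{"id": "d1", "senders": ["s1", "s2"]}]

# senders?query.rql=rel(flow_id,eq(format,video))
video_senders = [
    s["id"] for s in senders
    if rel_forward(s, "flow_id", flows, eq("format", "urn:x-nmos:format:video"))
]

# flows?query.rql=rel(senders%3Fflow_id,eq(transport,rtp))
rtp_flows = [
    f["id"] for f in flows
    if rel_backward(f, senders, "flow_id", eq("transport", "urn:x-nmos:transport:rtp"))
]

# devices?query.rql=rel(senders,eq(transport,rtp.mcast))
mcast_devices = [
    d["id"] for d in devices
    if rel_forward(d, "senders", senders, eq("transport", "urn:x-nmos:transport:rtp.mcast"))
]

print(video_senders, rtp_flows, mcast_devices)
```

A Sender with a null flow_id never matches a forward reference here, since no Flow has a null id; whether that is the desired behaviour is a design choice this sketch does not settle.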