typedb
Add a TypeQL output format for queries
Problem to Solve
Currently, there is no way to output the results of a query as TypeQL, for instance:
match
$p isa person, has $a;
$p_1 isa person;
$a_1 isa name;
$a_1 = "Kevin";
$p_1 has $a_1;
$a_2 isa phone;
$a_2 = "+44 (0)7123 456789";
$p_1 has $a_2;
This is extremely useful for a number of applications and is very difficult to work around, requiring quite a bit of code to de-duplicate the results and reconstruct the TypeQL. Particularly, it could save a lot of time during debugging and development. It can be used for:
- Building graph visualisers and similar.
- Easily recreating minimal reproducible cases for bug reports.
- Easily building small datasets for demos and examples.
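To give a sense of the de-duplication and reconstruction work mentioned above, here is a minimal client-side sketch in Python. It is only an illustration: the row shape `(owner_iid, owner_type, attr_type, attr_value)` and the function `rows_to_typeql` are hypothetical and not part of any TypeDB driver, only ownerships are handled, and attribute values are assumed to be strings without embedded quotes.

```python
def rows_to_typeql(rows):
    """Turn (owner_iid, owner_type, attr_type, attr_value) rows into TypeQL statements.

    Hypothetical row shape: real answers would first have to be flattened into it.
    """
    owner_vars = {}     # owner IID -> variable, so each entity is declared once
    attr_vars = {}      # (attribute type, value) -> variable, declared once
    seen_edges = set()  # (owner variable, attribute variable) pairs already emitted
    lines = []
    for owner_iid, owner_type, attr_type, attr_value in rows:
        if owner_iid not in owner_vars:
            owner_vars[owner_iid] = f"$p_{len(owner_vars) + 1}"
            lines.append(f"{owner_vars[owner_iid]} isa {owner_type};")
        key = (attr_type, attr_value)
        if key not in attr_vars:
            attr_vars[key] = f"$a_{len(attr_vars) + 1}"
            lines.append(f"{attr_vars[key]} isa {attr_type};")
            # Assumption: string-valued attributes with no quotes to escape.
            lines.append(f'{attr_vars[key]} = "{attr_value}";')
        edge = (owner_vars[owner_iid], attr_vars[key])
        if edge not in seen_edges:
            seen_edges.add(edge)
            lines.append(f"{edge[0]} has {edge[1]};")
    return lines


if __name__ == "__main__":
    example_rows = [
        ("0x1", "person", "name", "Kevin"),
        ("0x1", "person", "phone", "+44 (0)7123 456789"),
        ("0x1", "person", "name", "Kevin"),  # duplicate answer, emitted once
    ]
    print("\n".join(rows_to_typeql(example_rows)))
```

Even this toy version needs three lookup structures just for ownerships, and it says nothing about relations and roles.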
Current Workaround
None.
Proposed Solution
Implement this new output format alongside JSON, etc.
Fortunately, I think the upcoming fetch {} will address all of this. But I leave that all up to @flyingsilverfin.
This is very much not the case. The two have completely different applications and I wouldn't have opened this issue if I thought this would be addressed by fetch. The fetch output will still lack a lot of the context required to rebuild data from the result that can currently only be obtained from the query, requiring the query and result to be parsed together to rebuild the full structure of the data.
Having looked into this further, it is actually impossible to generate TypeQL responses on the client side in the general case, even with the context of both the query and the response. This is because type inference means that important information required to recreate the pattern never leaves the server.
Is what you're proposing similar to the idea of tying bindings into a TypeQL query: TypeQLQuery.withBounds(conceptMap)?
This would return the original query with IIDs and attribute values and types bound in.
What's the data you'd need from the server that makes it impossible to generate TypeQL on the client side?
The missing info is because of inference of roles. The concept map does not contain edge information. While owns edges can be reconstructed from the query pattern, because the type of a role can be inferred, neither the query pattern nor the concept map contains the information required to correctly rebuild roles without concept API calls.
I recently implemented this output format in my fork of TypeDB Jupyter and have been using it to extract data from TypeDB Bio, reconstructing the minimum data needed to run demos. @krishnangovindraj and I also ran into an optimisation issue in the process, and I was easily able to extract the minimum data for reproduction, see (2) under reproducible steps.
Are you trying to convert the matched answers into insert or specific match queries?
I kind of see what you're getting at. I think a graph return format could basically look like a TypeQL query, now that you mention it.
Exactly what I'm trying to do, and yes this contains all the info necessary to build a graph with no further calls. They look identical because a graph is just a visual representation of a pattern, i.e. parity between conceptual and physical models.
The biggest issue with this at the moment is that if I perform a query involving inferred results, the generated TypeQL output will not reflect this, so if I reinsert it then I keep the end result but lose the ability to explain it. The proof model I wrote a few months ago also functions with TypeQL output, so I anticipate I should be able to solve this issue by integrating it in but don't have the time at the moment.
To add a different perspective here: The 'complete' answer to a TypeQL query is the mapping of every variable to the concept that it is bound to. This plays well with the logical semantics, since the answer is a 'satisfying substitution' to the set of (existentially quantified) variables.
We do have a hint of the solution in the way TypeDB internally treats labels in a query - as variables. Returning the concrete labels (that were unified with the labels in the query) as part of the answers seems to be the 'correct' solution as far as the logical view of the semantics is concerned.
@james-whiteside: This suggests the easy workaround of using a variable in place of every label, and adding a $label_var sub query_label; as part of the query. For example, the statement:
$r (my-role: $t) isa my-relation;
becomes:
$r ($my-role: $t) isa $my-relation; $my-role sub my-role; $my-relation sub my-relation;
(If doing this programmatically, care should be taken so that different occurrences of the same label are replaced by different variables.)
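A rough Python sketch of doing that rewrite programmatically, for illustration only. The function name `labels_to_variables` and the `_2`, `_3` suffix scheme are my own assumptions, not an existing API. It only recognises role labels (a label followed by `:`) and type labels directly after `isa`, and it assumes the statement contains no string literals that happen to look like either.

```python
import re
from collections import Counter

# Matches either a role label followed by ':' or a type label after 'isa'.
# Assumption: labels start with a letter and contain word characters or '-'.
LABEL = re.compile(
    r"(?P<role>(?<![$\w-])[A-Za-z][\w-]*)(?=\s*:)"
    r"|(?P<isa>\bisa\s+)(?P<type>[A-Za-z][\w-]*)"
)


def labels_to_variables(statement):
    """Replace each label occurrence with a fresh variable plus a 'sub' constraint."""
    counts = Counter()
    constraints = []

    def fresh(label):
        # Each occurrence of a label gets its own variable:
        # $label for the first, then $label_2, $label_3, ...
        counts[label] += 1
        suffix = "" if counts[label] == 1 else f"_{counts[label]}"
        var = f"${label}{suffix}"
        constraints.append(f"{var} sub {label};")
        return var

    def replace(match):
        if match.group("role"):
            return fresh(match.group("role"))
        return match.group("isa") + fresh(match.group("type"))

    rewritten = LABEL.sub(replace, statement)
    return " ".join([rewritten] + constraints)


if __name__ == "__main__":
    print(labels_to_variables("$r (my-role: $t) isa my-relation;"))
```

On the statement above this gives back the rewritten form shown earlier. When the same role label is used for two role players, the second occurrence gets its own variable (`$my-role_2` here), so the two can bind to different concrete role types in the answer.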