typedb icon indicating copy to clipboard operation
typedb copied to clipboard

Allow retrieving attributes optionally, without filtering the query

Open maydanw opened this issue 3 years ago • 17 comments

Problem to Solve

In my code objects/entities have many optional attributes and when I try to fetch them I need to fetch all of the attributes from the server (adding them directly to the query will filter the entity if they are missing) and then use only a few which is very inefficient (e.g., A user have many attributes which are optional, but in my code, I have MinimizedUser object that has only a few details that can be shared. Fetching the whole object seems like a waste but it the only way when some of the attributes may not exist).

Current Workaround

@flyingsilverfin suggested

match $x iid 0x123; $x has $a; { $a isa [attr-1]; } or { $a isa [attr-2]; } or ...;

It is a valid solution but it creates huge queries that can only be produced by a machine.

Proposed Solution

I think the ideal solution is to separate by somewhat the MATCH and the GET parts.
e.g.,

match $x isa person, has email "[email protected]";
get $x.username, $x.email, $x.fullname;

where a dot signifies an attribute (this can be extended further but I don't think it is required). This allows me to write short and concise queries which filter only what I want and send back only what I need.

A less optimal solution is to add something like optionallyHas/mayHave keyword, which is none-filterable therefore a query will look something like:

match $x isa person, has email "[email protected]", has username $username, mayHave fullname $fullname;
get $username, $fullname;

maydanw avatar May 11 '21 19:05 maydanw

We need this as well. At the moment we're matching on the entity e.g., get $x (the person) and then using get_has in Python, which seems unnecessary. My preferred solution is the first one proposed, being able to get optional properties without constraining the query.

thomaschristopherking avatar May 12 '21 07:05 thomaschristopherking

Yes, I see this is really valuable. I'll triage this to add the roadmap.

haikalpribadi avatar May 25 '21 18:05 haikalpribadi

I've also updated the issue title to be a bit clearer

haikalpribadi avatar May 25 '21 18:05 haikalpribadi

Can this be extended to optional roles?

Consider I may have a relationship where two roles are required, and two are optional. Some data records have four roles, some three, and some two. How then to retrieve all records? e.g. match $a (required1: $req1, required2: $req2, optional1: $opt1, optional2: $opt2) isa sparse_relation

In the scenario listed above, this match will only retrieve a single record. How can I retrieve all relations, with their data, regardless of how many optional roles are filled?

dynamic-modeller avatar Oct 19 '21 23:10 dynamic-modeller

Can this be extended to optional roles?

Consider I may have a relationship where two roles are required, and two are optional. Some data records have four roles, some three, and some two. How then to retrieve all records? e.g. match $a (required1: $req1, required2: $req2, optional1: $opt1, optional2: $opt2) isa sparse_relation

In the scenario listed above, this match will only retrieve a single record. How can I retrieve all relations, with their data, regardless of how many optional roles are filled?

I believe by the same principle: uncouple the MATCH and the GET.
It may look something like:

...
match $a (required1: $req1, required2: $req2) isa sparse_relation;
get $a.required1, $a.required2, $a.optional1,  $a.optional2;

Where the result will always return required1 and required2 and may return optional1, optional2

It is always possible to take it much further and have something like the following where I used the . (dot) for attribute and -> (arrow) for roles to "step to" and also chained it a bit

match $a (required1: $req1, required2: $req2) isa sparse_relation;
$req1 has username;
$req2 has username;
get $a->required1.username, $a->required2.username, $a->optional1.username,  $a->optional2.location_name;

In this conceptual example, you will always get together two usernames and sometimes another username and a location name.

At this stage, this is only a feature request to it can contain various other ideas of other ways to reach the same goal.

maydanw avatar Oct 20 '21 15:10 maydanw

High level design plan (implementation date: not decided) - We introduce a new optional { } clause (similar to not or or) that will possibly populate the answer field ConceptMap.optional, which is a map in itself.

Example:

match $p isa person, has email "[email protected]"; optional { $p has attr-1 $a1; }; optional { $p has attr-2 $a2; }; 
=>
{ $p : 0x123, optional { $a1: 0x234, $a2: 0x345}}
{ $p : 0x222, optional {}}
...

You can have multiple optional { } blocks, each of which can internally contain an arbitrary query again.

Open questions:

  • what are the semantics of nested optionals?
  • What about optionals in negations or disjunctions?
  • To be consistent, variables written in disjunctions that are not currently returned should also return in an optional block? Eg.
match $p isa person, has email "@[email protected]"; { $p has name $n; } or { $p has age $a; };

currently returns neither $n nor $a because these are not scoped in the top level query. However, in optional blocks we actually do return the variables not scoped in the top level!

=> this may precipitate a redesign of a ConceptMap into different components. We may try to end up being able to have explainable answers at all times -- currently when explain() is enabled, we ignore the get filter and return all variables, including anonymous ones. This could for example be part of ConceptMap.Explainable.completeAnswer, and we can continue respecting the get filter for the map in the main part of the answer.

Other effects: Defaults can be implemented in the application layer by checking whether a field is present in the ConceptMap.optional.

flyingsilverfin avatar Nov 02 '21 14:11 flyingsilverfin

Will this work for a situation like this one?

  • An entity with 1000 attributes
  • I want to query all the instances of the entity and 100 of those attributes, where some of them have it some don't.

The has? could achieve that. I also like the get $a->required1.username (@maydanw not sure why you add the required part 🤔) but I must admit i don't understand the optional {}

With @maydanw's proposal has would be ALWAYS a constraint, and the get notation would just grab extra attributes (required or not). They change absolutely nothing in the constraints, so i don't see the point of adding them in the constraints...

I guess adding them in the constraints open the door to use them in other lines, but that would be an extra step... And probably way more complicated to code than just "ok, now that you got the result, just grab the values of these X attributes"

lveillard avatar Sep 08 '22 08:09 lveillard

So just another extra idea of a different way by adding an extra type of query 'fetch'

Examples:

example match with optional attributes:

match $e isa employee; #constraints
fetch $e has name $name, has email $email;

example deletion where we want to delete attributes if they are there:

match $e isa employee;  #constraints
fetch $e has name $name, has email $email;
delete $e has $name, has $email

This would be another way to treat constraints and fetches as different things

lveillard avatar Nov 14 '22 13:11 lveillard

We could also extend optionals to include them in insert or delete blocks:

match 
  $p ... <conditions...: optional { $p has name $a1; }; optional { $r1 ($p, $q) isa rel-type;}; optional { $new-attr "value" isa attribute; }
delete 
  optional { $p has $a1; }; optional { $r1 isa relation; }
insert
  optional { $p has $new-attr; }

In inserts, the optionals should execute if all required input variables are defined. Match optionals should return a value for each variable if they find matching data.

flyingsilverfin avatar Jan 23 '23 10:01 flyingsilverfin

We could also extend optionals to include them in insert or delete blocks:

match 
  $p ... <conditions>...: optional { $p has name $a1; }; optional { $r1 ($p, $q) isa rel-type;}; optional { $new-attr "value" isa attribute; }
delete 
  optional { $p has $a1; }; optional { $r1 isa relation; }
insert
  optional { $p has $new-attr; }

In inserts, the optionals should execute if all required input variables are defined. Match optionals should return a value for each variable if they find matching data.

If I understood correctly, this will allow sending updates without the match query beforehand which can be beneficial yet somewhat dangerous. In such a case the returned value should be very clear what indeed updated.

maydanw avatar Jan 23 '23 13:01 maydanw

We could also extend optionals to include them in insert or delete blocks:

An alternative to insert the optionals inside match / delete / insert would be to do it the other way around, which opens a lot of possibilities:

Example 1:

Unlink all players from two optional roles, and replace one of those if it has anything linked to it. As an extra, delete an attribute that is also optional and will not break the stream if it is not there:

match 
  $r ($roleType:$player) isa process, has id in [1,2];
 
#unlink in one single query, all players of two optional roles in a relation
optional {
  match
    $roleType type process:interviews;
  delete
    $r (interview: $player)
}

optional {
  match 
     $roleType type process:candidate;
  delete
    $r ( candidate: $player);
}
# Second role is actually replaced (or adding a first one if there is none linked)
insert 
  $newCandidate isa Person, has id 'Ann';
  $r ( candidate: $newCandidate);

# also $r has an optional attribute that will be deleted if it is there
optional {
  match
    $r has optionalAttribute $att;
  delete
    $r has $att
}

Example 2:

A second example where optional is used from the root and where variables are shared even between siblings:

#parelel optionals in the root, with shared variables
optional {
   match 
      $p1 isa process has id 1;
   insert
      $j1 isa job, has id 'j1';
      $c isa candidate, has name 'Bob';
 $p1 (job: $j1, candidate:$c);
}
optional {
   match 
      $p2 isa process has id 2;
   insert
      $j2 isa job, has id 'j2';
 # As $c is defined in a sibling, it might be empty. No problem, if it is the case, only $j2 is linked.
     $p2 (job: $j2, candidate:$c);
}

This would help to create independent queries (so the match clauses of one thread does not break all the inserts/deletions) while keeping a way to use the variables in multiple places

lveillard avatar Apr 20 '23 09:04 lveillard

Note: with TypeQL Fetch queries implemented in 2.25.x (https://github.com/vaticle/typedb/pull/6888), many optionality issues will be resolved, although not necessarily all.

flyingsilverfin avatar Nov 02 '23 10:11 flyingsilverfin

The issue with the Fetch model is it destroys the value of the classing system, since every object and variable must be named explicitly so it can then be mapped to json in the appropriate clause.

So if I have 10 variations of a single object, each with approximately 100 lines of typeql import, I require 500-1,000 lines of typeql to explicitly name every single variable, every single sub-object , for every single variation and then provide its json mapping.

Alternatively I can use 10- lines of typeql to match/get the concept using the power of the TypeQL class system. The problem at the moment is that I must know the structure of the object in question up-front in order to understand exactly which optional clauses to include.

Enabling optionals would empower interactive TypeQL which essentially becomes useless with the Fetch->JSON idea, as the clause lengths are far too long to input manually, and are insanely fragile (i.e. one wrong line, or one change, and the whole thing will collapse)

Fetch is only useful with simple objects where the typeql name is the same as the json export name, thus reducing the amount of boilerplate.

Optionals are essential

brettforbes avatar Nov 21 '23 01:11 brettforbes

We decided to test out typedb as a possible replacement of our existing DB and almost immediately ran into this issue within the first hour of playing around.

We first hit this issue when trying a basic query to pull all nodes of a certain type and their attributes (some of which were optional), which we easily worked around using what I assume is the equivalent to SQL's * operator, $node has attribute $attr; (if this is not the typical pattern please correct me!)

However, we ran into this issue again while testing out schema changes, and this one did confuse us a bit as the website and docs advertise how seamless they are. We extended a schema with a new field that we need in certain cases, for certain objects of the type. When we try to include this attribute in query results, we only receive objects that have this attribute set. Unlike a typical SQL database which essentially implicitly assigns the attribute with a null or default value to each item, it seems like we would have to do this ourselves in typedb and manually update each object in the database with an empty value for the attribute - or, of course, write multiple queries and join the results, or possibly some more complicated query that can do it all in one shot. The website/docs advertise that schema changes don't require rewriting of queries, which is true - but having to either write multiple queries, one unnecessarily-complex query, or mass manual database writes when writing your new queries that make use of your schema changes seems like the hidden cost of the advertised benefit, which may or may not outweigh it.

Again, we are very new here so please let me know if we're doing anything silly or seem to be misunderstanding something. Either way, I do hope some form of implementation of ideas here is implemented - and even more so if the issue I described above is indeed accurate/sane.

pb376 avatar Jan 06 '24 07:01 pb376

hey @pb376 its hard to make a comment without seeing your schema, but @flyingsilverfin has shown that a fetch statement can be constructed to return optional parts of objects

So the function of returning all of the required data, even if one is not sure what to put in the query., is solved mostly by the Fetch query

brettforbes avatar Jan 06 '24 08:01 brettforbes

@pb376 your expectations of what TypeDB can do based on what's advertised on the website is accurate. However I think you are using the wrong TypeQL query. match is a "pattern matching" clause and get retrieves everything you "match". If you want to optionally fetch things related to the pattern you matched, such as select * from SQL, you should the match-fetch query.

We will review our docs to make sure this will be clear going forward.

Can you test your typedb application with the fetch query and see if it's what you expected?

haikalpribadi avatar Jan 07 '24 13:01 haikalpribadi

Also @pb376 can you tell us which guides you followed in the docs initially?

haikalpribadi avatar Jan 07 '24 13:01 haikalpribadi