dgraph
dgraph copied to clipboard
Improve Facets mutation handling
Currently, if I write a facet like so <n1> <pred> <n2> (fromMerge:true) .
and then afterwards, another process "upserts" the same two nodes and edge like so <n1> <pred> <n2> .
, the fromMerge boolean facet is removed.
If I don't alter the facets in any way, would it make sense to not delete them? It would make managing edge facets easier on us, or at least allow an option to NOT overwrite the current facets.
This is currently an issue for us because when the fromMerge is written, there are specific things that only ever happen in that context that no other process that upserts items knows about. Thus I continue to lose that facet, which is used for specific and critical actions down the road.
Using 1.0.1 and the Golang Client.
I believe you should instantiate the data from Facet. For the idea behind the Facet is to have a "status" of that information. As an extra, but not as a immutable. So if you change it, the status should be changed too. Or in that case, deleted. if you want to keep or refresh the Facet you should control this in the application as if it were a pipe.
That's my two cents.
This is more complicated than I expected. My advice would be to fetch the facets first before upserting for now. Implementing this properly will take some time.
So, if we have round brackets, then merge. If no round brackets, delete facets.
Is this likely to be implemented base don @manishrjain 's suggestion above?
We'll definitely work on this, no exact timeline yet.
I think an "append" operation should solve the case. Perhaps in cases of KV overwriting we should have an alert preventing the mutation. Or maybe just overwrite it.
Who self-assign? @srfrog , @codexnull , @gitlw , @mangalaman93, @martinmr ?
Has there been any decision on this?
Some updates?????
Hi all,
Sorry for the delay! I'm trying to better understand this feature request. If I understood correctly when you insert the same edge multiple times you'd like to sometimes specify that the previous facets should be merged with the new ones.
What we have now
Currently, if we have a predicate met
between harry
and voldemort
which would have been created with the following mutation:
{
set {
_:harry <name> "Harry" .
_:voldemort <name> "Voldemort" .
_:harry <met> _:voldemort (inYear=2001) .
}
}
Currently, in order to add a new facet, e.g. likes=false
, we need to first fetch the previous facets attached to the predicate so we can then merge them on the client side and send them all back in a new mutation that would look something like the following:
{
set {
<0xb> <met> <0xc> (inYear=2001, likes=false) .
}
}
So if we now run the following query, we'll get both the inYear
and likes
facets:
{
q(func:eq(name, "Harry")) {
uid
name
met @facets {
uid
name
}
}
}
Which returns:
{
"extensions": {...},
"data": {
"q": [
{
"uid": "0xb",
"name": "Harry",
"met": [
{
"uid": "0xc",
"name": "Voldemort",
"met|inYear": 2001,
"met|likes": false
}
]
}
]
}
}
Proposal
Fetching all facets before adding a new one can be sometimes complicated, requires transactions, and seems to not be the easiest way to add one more facet to an existing predicate.
We would like to be able to provide syntax for the following cases:
- we want to remove all previous facets
- we want to replace all previous facets with a new set
- we want to keep the existing facets and add a new set
Currently, (1) is simply done by not passing any facets. Similarly, (2) is done by passing a list of facets in the mutation.
So we need to provide a new syntax for (3). Since the N-Triples specification does not provide any support for facets, I would recommend simply adding a +
right before the parentheses grouping the facets.
So given the previous dataset, where only the inYear
facet exists between Harry (0xb)
and Voldemort (0xc)
, we have four different cases:
-
<0xb> <met> <0xc> .
: the resulting predicate has no facets, as before. -
<0xb> <met> <0xc> (likes=false) .
: the resulting predicate has only thelikes
facet, as before. -
<0xb> <met> <0xc> +(likes=false) .
: the resulting predicate has bothinYear
andlikes
facets. -
<0xb> <met> <0xc> +() .
: the resulting predicate keeps the same facets as before, justinYear
.
And now, in JSON
For completion, the corresponding JSON mutation would look like the following:
{
"set": [
{
"name": "Harry",
"met": {
"name": "Voldemort",
"met|inYear": 2001
}
}
]
}
In order to add a new likes
facet, the mutation would now look like the following:
{
"set": [
{
"uid": "0xb",
"met": {
"uid": "0xc",
"met|likes": false
}
}
]
}
This mutation would act as <0xb> <met> <0xc> (likes=false) .
, removing the previous facets.
So how do we indicate that the previous facets should be kept in JSON format?
We have two options, the first one is simply not to support this use case in JSON. This might be a decent temporary solution but eventually, we'll have to fix it.
The first solution that comes to mind is to use the same syntax we have now for facets pred|facet
but combine it in a way that makes it so the change would be backwards compatible. Facet names can't contain the character @
, so let's use it to add a pseudo facet @keep
with a boolean value (set to false by default).
When this is set to true, the previous facets will be conserved similarly to the +()
syntax for RDF.
So in order to add a new likes
facet, we could write:
{
"set": [
{
"uid": "0xf",
"met": {
"uid": "0x10",
"met|likes": false,
"met|@keep": true
}
}
]
}
This would act exactly as <0xb> <met> <0xc> +(likes=false) .
, and the resulting predicate would have both likes
and inYear
facets.
What do you all think?
A few questions -
- Why not use
+
in json format as well? (That also leads to the side note, below) - What about Upsert on facets?
- What about technical challenges in doing this? I'm wondering this makes sense, why didn't we do this in the first place? Were there technical challenges that we faced?
Side Note - As far as I know, we do not have well defined rules for variables names, predicate names, facet names etc. We should materialize a definition and update the implementation to follow those rules.
RFC: https://discuss.dgraph.io/t/improve-facets-with-upsert-block/5640
I think that the upsert would be very useful.
e.g - Here we copy rating, something, and somethingElse and add 2 new values to the facet.
upsert {
query {
me(func: eq(email, "[email protected]")) {
v as uid
edge as someEdge @facets(r as rating, t as something, e as somethingElse)
}
}
mutation {
set {
uid(v) <someEdge> uid(edge) (weight=0.1, updated_at=2019-06-02T13:01:09, val(r), val(t), val(e)) .
}
}
}
We could have a "Spread operator" (Recurse) to save time. In some cases that there are many values in facets. e.g
upsert {
query {
me(func: eq(email, "[email protected]")) {
v as uid
edge as someEdge @facets(r as ...)
}
}
mutation {
set {
uid(v) <someEdge> uid(edge) (weight=0.1, updated_at=2019-06-02T13:01:09, val(r)) .
}
}
}
The three dots would be the operator (r as ...)
. It will copy and paste the previous values together with the new values.
Github issues have been deprecated. This issue has been moved to discuss. You can follow the conversation there and also subscribe to updates by changing your notification preferences.
![drawing](https://github.com/dgraph-io/benchmarks/blob/master/images/SetNotifications.png?raw=true)
It's been 2 years now, this feature still has no progress?
It has been 3.5 years, still no update? Without this feature, facets, which are key to graph database, is pretty much useless.
I think this should be reopened. This functionality is critical and expected, it should be given greater prominence as a caveat/missing feature in the documentation.
It basically means Dgraph cannot do update-in-place, I'm now doing an expensive query for every update which has a massive impact on the cost of using Dgraph cloud.
Without it, dgraph will never be production available
To my surprise, this question had been asked four years before. I also encountered this tricky problem when using Dgraph: in many scenarios, the edge properties are not fixed and the information needs to be updated. When edge properties are updated, the original information is lost or takes an unexpected action. So I think about storing the Edge properties in the Node, which also makes the node bloated (storing information that should belong to the edge) . This also makes Edge/Facet lose its greatest value if it can only record fixed data once.