dgraph Improve Facets mutation handling

Improve Facets mutation handling

Open willcj33 opened this issue 6 years ago • 17 comments

Currently, if I write a facet like so <n1> <pred> <n2> (fromMerge:true) . and then afterwards, another process "upserts" the same two nodes and edge like so <n1> <pred> <n2> ., the fromMerge boolean facet is removed.

If I don't alter the facets in any way, would it make sense to not delete them? It would make managing edge facets easier on us, or at least allow an option to NOT overwrite the current facets.

This is currently an issue for us because when the fromMerge is written, there are specific things that only ever happen in that context that no other process that upserts items knows about. Thus I continue to lose that facet, which is used for specific and critical actions down the road.

Using 1.0.1 and the Golang Client.

Jan 10 '18 03:01 willcj33

I believe you should instantiate the data from Facet. For the idea behind the Facet is to have a "status" of that information. As an extra, but not as a immutable. So if you change it, the status should be changed too. Or in that case, deleted. if you want to keep or refresh the Facet you should control this in the application as if it were a pipe.

That's my two cents.

Jan 11 '18 14:01 MichelDiz

This is more complicated than I expected. My advice would be to fetch the facets first before upserting for now. Implementing this properly will take some time.

Jan 18 '18 00:01 pawanrawal

So, if we have round brackets, then merge. If no round brackets, delete facets.

Mar 16 '18 04:03 manishrjain

Is this likely to be implemented base don @manishrjain 's suggestion above?

Mar 23 '18 13:03 jimanvlad

We'll definitely work on this, no exact timeline yet.

Mar 25 '18 23:03 manishrjain

I think an "append" operation should solve the case. Perhaps in cases of KV overwriting we should have an alert preventing the mutation. Or maybe just overwrite it.

Who self-assign? @srfrog , @codexnull , @gitlw , @mangalaman93, @martinmr ?

Feb 25 '19 19:02 MichelDiz

Has there been any decision on this?

Mar 11 '19 14:03 AlexanderCHarrington

Some updates?????

Apr 25 '19 22:04 lelvisl

Hi all,

Sorry for the delay! I'm trying to better understand this feature request. If I understood correctly when you insert the same edge multiple times you'd like to sometimes specify that the previous facets should be merged with the new ones.

What we have now

Currently, if we have a predicate met between harry and voldemort which would have been created with the following mutation:

{
    set {
        _:harry <name> "Harry" .
        _:voldemort <name> "Voldemort" .
        _:harry <met> _:voldemort (inYear=2001) .
    }
}

Currently, in order to add a new facet, e.g. likes=false, we need to first fetch the previous facets attached to the predicate so we can then merge them on the client side and send them all back in a new mutation that would look something like the following:

{
    set {
        <0xb> <met> <0xc> (inYear=2001, likes=false) .
    }
}

So if we now run the following query, we'll get both the inYear and likes facets:

{
  q(func:eq(name, "Harry")) {
    uid
    name
    met @facets {
      uid
      name
    }
  }
}

Which returns:

{
  "extensions": {...},
  "data": {
    "q": [
      {
        "uid": "0xb",
        "name": "Harry",
        "met": [
          {
            "uid": "0xc",
            "name": "Voldemort",
            "met|inYear": 2001,
            "met|likes": false
          }
        ]
      }
    ]
  }
}

Proposal

Fetching all facets before adding a new one can be sometimes complicated, requires transactions, and seems to not be the easiest way to add one more facet to an existing predicate.

We would like to be able to provide syntax for the following cases:

we want to remove all previous facets
we want to replace all previous facets with a new set
we want to keep the existing facets and add a new set

Currently, (1) is simply done by not passing any facets. Similarly, (2) is done by passing a list of facets in the mutation.

So we need to provide a new syntax for (3). Since the N-Triples specification does not provide any support for facets, I would recommend simply adding a + right before the parentheses grouping the facets.

So given the previous dataset, where only the inYear facet exists between Harry (0xb) and Voldemort (0xc), we have four different cases:

<0xb> <met> <0xc> .: the resulting predicate has no facets, as before.
<0xb> <met> <0xc> (likes=false) .: the resulting predicate has only the likes facet, as before.
<0xb> <met> <0xc> +(likes=false) .: the resulting predicate has both inYear and likes facets.
<0xb> <met> <0xc> +() .: the resulting predicate keeps the same facets as before, just inYear.

And now, in JSON

For completion, the corresponding JSON mutation would look like the following:

{
    "set": [
      {
        "name": "Harry",
        "met": {
          "name": "Voldemort",
          "met|inYear": 2001
        }
      }
    ]
}

In order to add a new likes facet, the mutation would now look like the following:

{
    "set": [
      {
        "uid": "0xb",
        "met": {
          "uid": "0xc",
          "met|likes": false
        }
      }
    ]
}

This mutation would act as <0xb> <met> <0xc> (likes=false) ., removing the previous facets. So how do we indicate that the previous facets should be kept in JSON format?

We have two options, the first one is simply not to support this use case in JSON. This might be a decent temporary solution but eventually, we'll have to fix it.

The first solution that comes to mind is to use the same syntax we have now for facets pred|facet but combine it in a way that makes it so the change would be backwards compatible. Facet names can't contain the character @, so let's use it to add a pseudo facet @keep with a boolean value (set to false by default).

When this is set to true, the previous facets will be conserved similarly to the +() syntax for RDF.

So in order to add a new likes facet, we could write:

{
    "set": [
      {
        "uid": "0xf",
        "met": {
          "uid": "0x10",
          "met|likes": false,
          "met|@keep": true
        }
      }
    ]
}

This would act exactly as <0xb> <met> <0xc> +(likes=false) ., and the resulting predicate would have both likes and inYear facets.

What do you all think?

Jul 10 '19 01:07 campoy

A few questions -

Why not use + in json format as well? (That also leads to the side note, below)
What about Upsert on facets?
What about technical challenges in doing this? I'm wondering this makes sense, why didn't we do this in the first place? Were there technical challenges that we faced?

Side Note - As far as I know, we do not have well defined rules for variables names, predicate names, facet names etc. We should materialize a definition and update the implementation to follow those rules.

Jul 12 '19 09:07 mangalaman93

RFC: https://discuss.dgraph.io/t/improve-facets-with-upsert-block/5640

I think that the upsert would be very useful.

e.g - Here we copy rating, something, and somethingElse and add 2 new values to the facet.

upsert {
      query {
        me(func: eq(email, "[email protected]")) {
          v as uid
          edge as someEdge @facets(r as rating, t as something, e as somethingElse)
        }
      }
      mutation {
          set {
            uid(v) <someEdge> uid(edge) (weight=0.1, updated_at=2019-06-02T13:01:09, val(r), val(t), val(e)) .
          }
      }
    }

We could have a "Spread operator" (Recurse) to save time. In some cases that there are many values in facets. e.g

upsert {
      query {
        me(func: eq(email, "[email protected]")) {
          v as uid
          edge as someEdge @facets(r as ...)
        }
      }
      mutation {
          set {
            uid(v) <someEdge> uid(edge) (weight=0.1, updated_at=2019-06-02T13:01:09, val(r)) .
          }
      }
    }

The three dots would be the operator (r as ...). It will copy and paste the previous values together with the new values.

Jul 12 '19 13:07 MichelDiz

Github issues have been deprecated. This issue has been moved to discuss. You can follow the conversation there and also subscribe to updates by changing your notification preferences.

Jul 20 '20 18:07 minhaj-shakeel

It's been 2 years now, this feature still has no progress?

Aug 07 '20 10:08 BigMurry

It has been 3.5 years, still no update? Without this feature, facets, which are key to graph database, is pretty much useless.

Sep 27 '21 12:09 arpanbag001

I think this should be reopened. This functionality is critical and expected, it should be given greater prominence as a caveat/missing feature in the documentation.

It basically means Dgraph cannot do update-in-place, I'm now doing an expensive query for every update which has a massive impact on the cost of using Dgraph cloud.

Feb 24 '22 15:02 homerjam

Without it, dgraph will never be production available

Sep 08 '22 17:09 newcworld

To my surprise, this question had been asked four years before. I also encountered this tricky problem when using Dgraph: in many scenarios, the edge properties are not fixed and the information needs to be updated. When edge properties are updated, the original information is lost or takes an unexpected action. So I think about storing the Edge properties in the Node, which also makes the node bloated (storing information that should belong to the edge) . This also makes Edge/Facet lose its greatest value if it can only record fixed data once.

Sep 15 '22 12:09 lizonglingo

dgraph dgraph copied to clipboard

Improve Facets mutation handling

What we have now

Proposal

And now, in JSON

dgraph
dgraph copied to clipboard