
Feature Request: Ability to query with distinct values

Open morningcloud opened this issue 3 years ago • 4 comments

I have a use case where we need to return only the unique values of the fields selected in the GraphQL query. In Cypher this would typically be written as `MATCH (n:NodeType) RETURN DISTINCT n.field1, n.field2, ...`. At the moment, to return distinct values I have to either write custom Cypher, which is not flexible, or write a custom resolver that dynamically builds the return fields from the query selection available under `context.resolveTree.fieldsByTypeName.NodeType`. With either approach I lose the flexible filtering provided by the augmented schema that the Neo4j GraphQL Library generates.

I propose adding a boolean attribute named `distinct` to the augmented query options, alongside the existing `sort`, `limit`, and `offset` attributes. When this attribute is set to true, the keyword DISTINCT would be added after the RETURN clause of the translated Cypher query.

The main use case is a query that fetches a lookup list of values from node properties that are not unique across nodes. For example, we have an Entity node with two properties (entity name and entity type), generated by running named entity recognition on text data. The same entity name can have different types. For some queries we need the list of unique entity names regardless of their types, and in other cases we need both properties.
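
As a rough illustration of the request, the sketch below shows how a `distinct` flag could switch the RETURN clause for a flat selection of scalar fields. The helper `buildDistinctQuery`, the `DistinctOptions` type, and the property name `name` are hypothetical and are not part of the Neo4j GraphQL Library; the real translation layer is considerably more involved.

```typescript
// Hypothetical options shape: only the proposed `distinct` flag is modelled here.
interface DistinctOptions {
  distinct?: boolean;
}

// Hypothetical helper, not library code. It interpolates names directly into
// the query string, so it must only be given trusted label and field names.
function buildDistinctQuery(
  label: string,
  fields: string[],
  options: DistinctOptions = {}
): string {
  // One "n.<field>" entry per field selected in the GraphQL query.
  const projection = fields.map((field) => `n.${field}`).join(", ");
  // The proposal: add DISTINCT after RETURN when the flag is true.
  const returnKeyword = options.distinct ? "RETURN DISTINCT" : "RETURN";
  return `MATCH (n:${label})\n${returnKeyword} ${projection}`;
}

// Unique entity names regardless of type (property name `name` is assumed).
const entityNamesQuery = buildDistinctQuery("Entity", ["name"], { distinct: true });
console.log(entityNamesQuery);
// MATCH (n:Entity)
// RETURN DISTINCT n.name
```

Selecting both properties would produce `RETURN DISTINCT n.name, n.type`, which deduplicates on the pair rather than on the name alone.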

morningcloud (Jan 12 '22 06:01)

Hey @morningcloud, sorry for the slow reply on this one, we were super bogged down with 3.0.0.

From #800 you gave the example:

```graphql
query {
   movies(options: { distinct: true }) {
      title
   }
}
```

This would return movies with a unique title. However, we'd like to propose a slight extension of the above, where the choice of which properties are included in the DISTINCT statement is moved into an argument:

```graphql
query {
   movies(options: { distinctOn: ["title"] }) {
      title
      runtime
   }
}
```

This would allow users to separate which fields they would like to be unique from which fields are returned from the query, and would roughly translate into the following Cypher:

```cypher
MATCH (this:Movie)
WITH DISTINCT this { .title } AS distinctMovies
RETURN distinctMovies { .title, .runtime } AS this
```
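
To make the intended `distinctOn` behaviour concrete, here is a small in-memory sketch in TypeScript. The function `distinctOnKeys` and the sample rows are made up for illustration. The proposal does not say which `runtime` is returned when two movies share a title, so keeping the first row encountered is purely an assumption of this sketch.

```typescript
// Hypothetical illustration of distinctOn semantics, not library code.
// Rows are deduplicated on the listed keys; all other fields are kept from
// the first row seen for each key combination (an assumption).
function distinctOnKeys<T>(rows: T[], keys: (keyof T)[]): T[] {
  const seen = new Set<string>();
  const result: T[] = [];
  for (const row of rows) {
    // Serialise only the distinctOn fields to form the uniqueness signature.
    const signature = JSON.stringify(keys.map((key) => row[key]));
    if (!seen.has(signature)) {
      seen.add(signature);
      result.push(row);
    }
  }
  return result;
}

// Made-up sample data: two movies share a title but differ in runtime.
const sampleMovies = [
  { title: "Movie A", runtime: 100 },
  { title: "Movie A", runtime: 120 },
  { title: "Movie B", runtime: 90 },
];

const uniqueByTitle = distinctOnKeys(sampleMovies, ["title"]);
console.log(uniqueByTitle.length); // 2
```

With `distinctOn: ["title", "runtime"]` all three sample rows would be kept, since no two rows agree on both fields.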

What do you think about this proposal?

darrellwarde (Feb 22 '22 16:02)

Hi, is there any update on whether this will make it into an upcoming release? We are very interested in this feature and would also be able to help if anything is missing.

kthr (Sep 13 '23 12:09)

The ability to retrieve unique values seems to me to be pretty basic functionality, and I'm amazed that such a relatively mature project still doesn't offer it, while it does offer retrieval of the longest and shortest string.

As far as I can see, the original pull request is already closed, so it is unlikely to be completed, and the feature with 'distinctValues' is on the wishlist. Can anyone who runs this project say whether this feature will ever be available?

lucassith (Oct 23 '23 09:10)

Random duplicate results are quite annoying in our case. Adding this feature would really make things simpler.

oberoivarun (Feb 17 '25 19:02)