
Provide means for accessing reference count without actually fetching the entities

Open novoj opened this issue 1 year ago • 1 comments

After a recent discussion with the FE team using evitaDB, there may be a use case where we want to fetch an entity that has existing referenced entities of some type. On top of that, however, we want to fetch the count of these referenced entities without actually fetching the entities. Something like this:

{
  listGroup(
    filterBy: {
      attributeCodeEquals: "news-group"
      referenceProductsHaving: {}
    }
  ) {
    productsCount # generated from reference `products`
  }
}

I think it could also be solved with the facet summary, probably like this:

{
  queryGroup(
    filterBy: {
      attributeCodeEquals: "news-group"
      referenceProductsHaving: {}
    }
  ) {
    extraResults {
      facetSummary {
        products {
          count
        }
      }
    }
  }
}

But that's quite cumbersome to use on the FE. @novoj do you think it would be valid to support the first approach at the GraphQL API level? We could also reuse the filterBy clause from the reference fields. On the backend it could be translated to a basic referenceContent.


I think we could automatically provide the count attribute at the EntityDecorator level. evitaDB internally always needs to compute reference primary keys to calculate the count, so we could always provide the array of these primary keys (since they have to be fetched anyway). But at the API level the primary keys can easily be thrown away and only the count provided to external clients, thus saving some transport/networking costs.

novoj avatar Aug 20 '24 06:08 novoj

This feature request partially opens up a path to a larger extension of reference fetching. It would also be beneficial to be able to fetch only chunks of references. After a discussion with @lukashornych we'd like to:

  1. be able to avoid sending all the referenced PKs over the wire in all of the protocols - this could be easily achieved by requesting zero records in the output using page(1, 0) or strip(0, 0)
  2. be able to paginate the fetched references
  3. combine both of the above requirements with ordering and filtering constraints

In order to do so, this language extension makes sense to us:

// returns only counts for product reference (no PKs or bodies are sent)
referenceCount('products')
// returns only counts for all entity references (no PKs or bodies are sent)
referenceCount()
// returns only counts for product reference that match filtering criteria in a custom order (no PKs or bodies are sent)
referenceCount('products', filterBy(...), orderBy(...))
// returns first page of 20 primary keys that match filtering criteria in a custom order
referenceContent('products', filterBy(...), orderBy(...), page(1, 20))
// returns first page of 20 referenced entities that match filtering criteria in a custom order
referenceContent('products', filterBy(...), orderBy(...), entityFetchAll(), page(1, 20))
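
To make the intended page/strip semantics concrete, here is a minimal Java sketch of how a list of reference primary keys could be sliced while still reporting the total count. The names ReferenceChunkSketch, Chunk, page and strip are hypothetical and exist only for illustration; they are not evitaDB's actual classes, and the (offset, limit) reading of strip is an assumption based on the strip(0, 0) example above.

```java
import java.util.List;

/** Illustration only: hypothetical names, not evitaDB's real implementation. */
final class ReferenceChunkSketch {

    /** A slice of reference primary keys plus the total number of references. */
    static final class Chunk {
        final int totalRecordCount;
        final List<Integer> primaryKeys;

        Chunk(int totalRecordCount, List<Integer> primaryKeys) {
            this.totalRecordCount = totalRecordCount;
            this.primaryKeys = primaryKeys;
        }
    }

    /** Mimics page(pageNumber, pageSize) with a 1-based page number. */
    static Chunk page(List<Integer> allKeys, int pageNumber, int pageSize) {
        if (pageNumber < 1 || pageSize < 0) {
            throw new IllegalArgumentException("pageNumber must be >= 1 and pageSize >= 0");
        }
        long offset = (long) (pageNumber - 1) * pageSize;
        return slice(allKeys, offset, pageSize);
    }

    /** Mimics strip(offset, limit) with a 0-based offset (assumed semantics). */
    static Chunk strip(List<Integer> allKeys, int offset, int limit) {
        if (offset < 0 || limit < 0) {
            throw new IllegalArgumentException("offset and limit must be >= 0");
        }
        return slice(allKeys, offset, limit);
    }

    /** Clamps the requested window to the list bounds; the total count is always kept. */
    private static Chunk slice(List<Integer> allKeys, long offset, int limit) {
        int from = (int) Math.min(offset, allKeys.size());
        int to = (int) Math.min((long) from + limit, allKeys.size());
        return new Chunk(allKeys.size(), List.copyOf(allKeys.subList(from, to)));
    }
}
```

With this shape, page(keys, 1, 0) and strip(keys, 0, 0) both yield an empty key list together with the full count, which is the "count only" case from point 1.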

We still need to analyze how this proposal affects the internal structure of EntityDecorator and the connected APIs. This change would also affect all the APIs at once and would cover all our requirements.

novoj avatar Aug 21 '24 07:08 novoj

This issue will focus solely on referenced content. Pagination support in the facet summary was moved to a separate issue: https://github.com/FgForrest/evitaDB/issues/812

novoj avatar Feb 17 '25 07:02 novoj

The server-side implementation is done, except for API integration. @lukashornych please see the usage patterns in these tests:

  • io.evitadb.api.EntityFetchingFunctionalTest#shouldPaginateReferences
  • io.evitadb.api.EntityFetchingFunctionalTest#shouldStrippedReferences
  • io.evitadb.api.EntityFetchingFunctionalTest#shouldCombinePaginatedAndStrippedReferences

Client (gRPC/Java) will follow next.

novoj avatar Feb 18 '25 16:02 novoj

  • [x] add support for spacing inside pagination in reference content

novoj avatar Feb 27 '25 12:02 novoj

@novoj When using referenceContent(xxx, page(1, 0)) to fetch only the number of references, lastPageNumber returns what seems to be Integer.MAX_VALUE, which I think is not correct. I found it by calling the documentation/user/en/query/requirements/examples/fetching/referenceContentPageEmpty.rest example.

lukashornych avatar Feb 27 '25 14:02 lukashornych

@novoj Shouldn't the lastPageNumber be 1 in this case?


lukashornych avatar Feb 28 '25 12:02 lukashornych

Why do you think so? When pageSize is zero, how many pages would you have for 17 records? Infinitely many. But we cannot pass an infinite value, so zero or -1 (as an impossible value) is a better choice. I chose zero. But let's discuss.

novoj avatar Feb 28 '25 12:02 novoj

Yeah, you are right, that does make sense. I was coming from the pageNumber: 1, but that's from a request, so it makes sense it is 1 and not 0.

lukashornych avatar Feb 28 '25 12:02 lukashornych
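
The reasoning in the exchange above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: LastPageSketch and its lastPageNumber method are hypothetical names, zero is used as the sentinel for a zero page size as proposed above, and returning 0 for an empty result is an additional assumption that the thread does not discuss.

```java
/** Illustration only: hypothetical helper, not evitaDB's actual code. */
final class LastPageSketch {

    /**
     * Computes the number of the last page for the given total record count.
     * A page size of zero would mean infinitely many pages, so the sentinel 0
     * is returned instead of dividing by zero.
     */
    static int lastPageNumber(int totalRecordCount, int pageSize) {
        if (totalRecordCount < 0 || pageSize < 0) {
            throw new IllegalArgumentException("arguments must not be negative");
        }
        if (pageSize == 0) {
            return 0; // sentinel: no meaningful last page exists
        }
        // Ceiling division in long arithmetic to avoid int overflow.
        return (int) (((long) totalRecordCount + pageSize - 1) / pageSize);
    }
}
```

For 17 records, a page size of 5 gives a last page of 4, while a page size of 0 gives the sentinel 0 rather than an arithmetic error or Integer.MAX_VALUE.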

@novoj GQL and REST API are completely done (API logic, int. tests, user docs).

But I think you will need to implement some printing logic for referenceContentPageEmpty.evitaql.json.md. Right now it doesn't print references at all.

lukashornych avatar Feb 28 '25 13:02 lukashornych

Spacing support is finalized; however, some tests are still failing.

novoj avatar Feb 28 '25 17:02 novoj

@novoj All tests seem to be working now.

lukashornych avatar Mar 06 '25 13:03 lukashornych