alfresco-community-repo icon indicating copy to clipboard operation
alfresco-community-repo copied to clipboard

MNT-20791 : Smart Folder incorrect page number and item number

Open Epurashu opened this issue 3 years ago • 3 comments

PagingRequestConstraint now is aware of pagingRequest.requestTotalCountMax. Rename start to skipCount in VirtualQueryImpl#asPagingResults Added skipCount maxItems and requestTotalCountMax tests in VirtualQueryImplTest For consistency in the share pagination (normal folder vs smart folder) the following changes were done: When result.hasMore() is true inVirtualQueryImpl will now return a pair of identical values for totalItems instead of (value,value+1) When skipCount > 0 and requestTotalCountMax > 0 , searchParameter.maxItems will be requestTotalCountMax - skipCount because the skipped elements will not be part of the query ResultSet

Epurashu avatar Apr 16 '21 15:04 Epurashu

More details about the code changes changes:

For consistency in the share pagination (normal folder vs smart folder) the following changes were done:

  • When result.hasMore() is true inVirtualQueryImpl will now return a pair of identical values for totalItems instead of (value,value+1)

e.g. we have more than 1000 elements and we are on page 1. expected: at the bottom of the Share page we can see elements 0-50 and total elements will be 1000++ actual: In case the totalResultCountLower is different than totalResultCountUpper, at the bottom of the page we can see elements 0-50 and total elements will be 1001 (or the value of totalResultCountUpper)

  • When skipCount > 0 and requestTotalCountMax > 0 , searchParameter.maxItems will be requestTotalCountMax - skipCount because the skipped elements will not be part of the query ResultSet

The pagination request received from Share is usually of form: skipCount: 50 * currentPageNumber maxItems: 50 requestTotalCountMax: 1000 + skipCount

When we list a standard folder, we create a CannedQuery that will bring requestTotalCountMax number of items and then into the java code we figure out the pagination (return a single page of 50 elements starting with elements from skipCount +1)

Because Smart Folders uses SearchService for the query, the service will actually return a number of elements defined by maxItems starting with the element at position skipCount +1, because of this, if we want the query to always return 1000 elements, in case requestTotalCountMax is present, we need to substract the skipCount value from it. This also will improve performance as for e.g. if skipCount value is high, the query will retrieve skipCount + 1000 values.

Epurashu avatar Apr 16 '21 15:04 Epurashu

...because of this, if we want the query to always return 1000 elements...

Why do we want the query to return 1000 elements ?

skopf avatar Apr 16 '21 16:04 skopf

@skopf thank you for your review. MNT-20791 presents a problem in our current implementation of smart folders:

The current behavior when we list a folder in Share (same problem reported in ADW), at the bottom of the page the element displaying the page numbering and items is:

For this example let's say the folder has 2000 children: Normal Folder:

  • Page 1 : 1 -50 of 1000++ << 1 2 3 ... 20 >>
  • Page 5 : 250-300 of 1250++ << 5 6 7 ... 24 >>

Smart Folder:

  • Page 1 : 1-50 of 51 << 1 2 >>
  • Page 5 : 250-300 of 551 << 5 6 7 ... 11 >>

For this example let's say the folder has 501 children: Normal Folder:

  • Page 1 : 1 -50 of 501 << 1 2 3 ... 9 >>
  • Page 5 : 250-300 of 501 << 5 6 7 ... 9 >>

Smart Folder:

  • Page 1 : 1-51 of 51 << 1 2 >>
  • Page 5 : 250-300 of 551 << 5 6 7 ... 11 >>

The main problem is that in case of Smart Folders the number of "available pages" and "total elements" is displayed wrong if there are more pages (hasMore()). The wrong formula for maximum elements is the following : pageSize + 2 * skipCount + 1 half of the problem is in the following code: https://github.com/Alfresco/alfresco-community-repo/blob/ef441fc2c80fea43e70bbfa7171448729c8cb6cb/repository/src/main/java/org/alfresco/repo/virtual/template/VirtualQueryImpl.java#L186-L207

         try
        {
            start = result.getStart();   --- Javadoc: Get the start point for this results set in the overall set of rows that match the query - this will be equal to the skip count set when executing the query, and zero if this is not set.
        }
        ...
        final int totlaSecond = !hasMore ? (int) result.getNumberFound() : (int) (start + result.getNumberFound() + 1);

Why does this happen ?

The first half of the problem:

In the debug stack trace, we can observe that ScriptNode#childFileFolders(boolean files, boolean folders, Object ignoreTypes, int skipOffset, int maxItems, int requestTotalCountMax, String sortProp, Boolean sortAsc, String queryExecutionId) method is called. As part of this method we create a PagingRequest that we will pass it to fileFolderService to do it's job.

https://github.com/Alfresco/alfresco-community-repo/blob/ef441fc2c80fea43e70bbfa7171448729c8cb6cb/repository/src/main/java/org/alfresco/repo/jscript/ScriptNode.java#L718-L719

The PagingRequest object is defined by:

    private int skipCount = CannedQueryPageDetails.DEFAULT_SKIP_RESULTS;
    private int maxItems;
    
    private int requestTotalCountMax = 0; // request total count up to a given max (0 => do not request total count)
    private String queryExecutionId;

As mentioned above the request received from Share is of form: (This is the right formula, the above comment is wrong) skipOffset: 50 * (currentPageNumber -1) maxItems: 50 * currentPageNumber requestTotalCountMax: 1000 + skipCount

Let's take the example of Smart Folders on page 5: skipOffset:250 maxItems:300 requestTotalCountMax: 1250

As mentioned above SearchService query will actually return a number of elements defined by maxItems starting with the element at position skipCount +1.

This means that the query will return 300 results Now if we return to the second half described in this comment above:

        hasMore = true
        final int totalFirst = 300
        start = 250;
        totlaSecond = start + result.getNumberFound() + 1

After adding more info about the current behavior I will try to answer your questions.

I don't fully understand this change. The whole point of paging is to query only fractions of the entire set for each page. If I have 200 items in the result set, then I am only retrieving items 1..50 for the first page. Then on the second page items 51..100, and so on. With this change, I am loading items 1..200 for the first page, items 51..200 for the second page, etc. Then why having paging at all? With this change in place, it would be more efficient to not use paging at all and just load all items once.

As described above, the PagingRequest object contains both maxItems and requestTotalCountMax, based on the class javadoc and comments -> https://github.com/Alfresco/alfresco-community-repo/blob/ef441fc2c80fea43e70bbfa7171448729c8cb6cb/core/src/main/java/org/alfresco/query/PagingRequest.java

maxItems = the maximum size of the page / requestTotalCountMax = the request total count up to a given max (0 => do not request total count)

Until now, for Smart Folders we just ignored the value for PagingRequest.requestTotalCountMax, in my change I`m proposing to use the value for PagingRequest.requestTotalCountMax if is different from 0 instead of PagingRequest.maxItems.

Why do we want the query to return 1000 elements ?

That is the default Share behavior, even if it requires 50 elements for a page, it requests 1000 because of the element displaying the page numbering and items at the bottom of the page.

Epurashu avatar Apr 19 '21 11:04 Epurashu