RepoDB Feature: Introduce the 'SkipQuery' operation.

Description

This operation will work like 'BatchQuery' however, instead of paging, it will use the skipping.

For BatchQuery, the codes below will return the 20th to 30th rows.

using (var connection = new SqlConnection(ConnectionString))
{
	var page = 2;
	var rowsPerBatch = 10;
	var orderBy = OrderField.Parse(new { Id = Order.Ascending });
	var customers = connection.BatchQuery<Customer>(page, rowsPerBatch, orderBy);
}

For SkipQuery, the codes below will return the 3rd to 12th rows.

using (var connection = new SqlConnection(ConnectionString))
{
	var skip = 2;
	var take = 10;
	var orderBy = OrderField.Parse(new { Id = Order.Ascending });
	var customers = connection.SkipQuery<Customer>(skip, take, orderBy);
}

Acceptance Criteria

Unit Tests must be written (SqlServer, MySql, SqLite, PostgreSql).
Integration Tests must be written (SqlServer, MySql, SqLite, PostgreSql).

Impact

Less impact. Should not affect existing functionality.

Jan 25 '20 12:01 mikependon

This could be potentially used to support Cursor based paging as laid out by Relay spec (extension of GraphQL) for cursor based movemeent forward and backward through a result set. This would be more flexible than the current page BatchQuery (though even that could work with the right math).

Relay Cursor Paging Spec - Arguments

As an API this the Relay spec refers to this as Slicing, and cursor paging spec calls for forward, backward, and combined cursor movement using: first, after, last, before

But, as the RepoDb BatchQuery exists now, it won't work because the key element for this to work in a general way is that the Cursor needs to returned in the results! Without the Cursor value as derived from the index of each record in a sorted resultset; meaning the actual RowNumber needs to be part of the results.

And ideal solution would be a QuerySlice() method that:

Does not require altering the Model (as the cursor is really not Model data, but an element of the sorted results).
Returns a decorated result set of the type of the Model (or dynamic) along with the Cursor of each result

Note: as a generic solution, the cursor is just a derivation of the index of each record in the sorted order (it's the RowNumber() encoded as Base64).

Allows consumers to use the docorated results for any implementation they want -- for example to drive the Cursor pagination of GraphQL.

I have already been working on a POC for this as an Extension method and my initial version is currently it working great for Sql Server using a customize version of the exsting Batch Query sql that adds [RowNumber] as CursorIndex to the Select fields with a model that has this field myModel.CursorIndex on the Model -- it's working well, but that isn't ideal.

Anything more ideal requires access to some internal scoped elements of RepoDb -- unless I resort to flat out pure copy/paste of large parts (which it not the solution).

Since @mikependon just pointed me to this Feature, I just wanted to put this idea forth as an expansion of this Feature to see what thoughts there may be?

Oct 19 '20 03:10 cajuncoding

Linking this User Story (#634)

Nov 01 '20 13:11 mikependon

Also as a note on this particular feature request for "SkipQuery"... it's really just a variation of BatchQuery without the page calculation -- both can be mapped to starting row index, and ending row index in a sorted set of results.. So, if re-factored correctly the very same underlying queries can work for both..

Nov 01 '20 21:11 cajuncoding

@cajuncoding - I remembered talking to a colleague about considering the SkipQuery implementation into the BatchQuery operation, I guess it was last year or early this year. But I personally ended not to proceed with the change due to its internal usage in the organization (that time RepoDB is not yet fully publicly exposed).

Maybe this is a good question to ask to the community, should we still implement the SkipQuery separately? Or, should we just simply move the SkipQuery behavior into the existing BatchQuery?

Nov 01 '20 21:11 mikependon

@mikependon Yeah, it's an interesting question, and for example impacts the work I'm doing on the HotChocolate.RepoDb.SqlServer package.... The HotChocolate team supports both cursor & skip style of paging they call "Offset Paging". For the "Offset Paging" they chose to stick with the Skip/Take pattern also. So I"m having to map that mathmatically to a Page/Page-size (e.g. Batch) to use the existing API's -- which I def. want to do :-)

While I personally have always been fan of the Page/Page-size parameter set (eg. batches), it does seem that the industry prefers the Skip/Take... probably because it's a slicing approach whereby there really isn't a specific Page. Though Page/Page-size (or rows-per-batch as in RepoDb) can be mapped to a slice, mapping the other way from Skip/Take back to a Page/PageSize only works if we assume that the Take parameter is constant as the client pages through results. But I guess this is not a requirement for Skip/Take.

So as I"m working on adding OffsetPaging support in HotChocolate.RepoDb.SqlServer, having a directry Skip/Take api in RepoDb would make it alot easier, as of now I'm convering the Skip to a Page using math (as stated above, assuming the Take is likely to remain constant).

I can see that RepoDb is getting harder and harder to maintain as the code base grows due to support for a variety of DB implementations, both Synchronous & Asynchronous operations, and a great API.

My 2-cents would be to add it, but do so in a way where it's just a layer on top of the same underlying queries (keep these queries DRY) for both Batch and Skip/Take. For example, with Sql Servder my research shows there to be no notable performance benefit to the Offset query syntax vs RowOver()... and RowOver() might even be faster. But, having a Skip/Take api exposed will make code of users of RepoDb cleaner and more descriptive -- and likely help alot of novice developers.

Nov 01 '20 22:11 cajuncoding

Cool, thanks for sharing your opinion. Any breaking change (even it is minor), like this would be a part of the next version >= 1.13.x.

Nov 01 '20 22:11 mikependon

I was thinking it wouldn't be a breaking change at all, just a couple of net new apis taking in the skip/take parameters and leveraging the same internal plumbing (slight refactoring), but def. keep the current batch apis fully compatible with current behavior:

Net new api (and overloads):

BatchSkipQuery(skip, take,...)

Same for cursor slice paging with a net new api and overloads:

BatchSliceQuery(after, before, first, last,...)

Nov 01 '20 23:11 cajuncoding

Hello,

I am interested in grabing this feature as it looks pretty simple. Disclaimer : this is my first ever contribution to open source so please excuse me if I'm not following a rule. I did read the md's as for what to do and not do (hence this message ^^)

While waiting for a response, I will dig in the project and tests.

Dec 24 '22 14:12 ngardon

Cool! Looking forward for your contribution to RepoDB. It is highly appreciated! We suggest to follow the batch query implementation and structures. :)

Dec 25 '22 06:12 mikependon

I am running into an issue in the integration tests.

In Helper.AssertPropertiesEquality, ColumnTimestampWithTimeZone doesn't have the same value for expected and actual. In debug I see the same value for ColumnTimestampWithTimeZone and ColumnTimestampWithoutTimeZone in the case of actual.

Thing is I am in UTC+1, so the values from the database are 1 hour apart. Am I missing something ? Postgres v. 15.1 in a Debian container, MacOs as host system

Dec 25 '22 12:12 ngardon

I could not check for now due to being off from my laptop, but timezone should not be a problem here, it should be handled accordingly. You can just proceed with the PR and we will debug this Integration Tests issue in the coming days.

Dec 25 '22 13:12 mikependon

Done in #1123

Dec 25 '22 20:12 ngardon

Done in #1123

Great and thanks for the PR. We will review ASAP

Dec 26 '22 08:12 mikependon

The fixes to this will be available on the release version >= v1.13.1.

Mar 16 '23 21:03 mikependon

RepoDB RepoDB copied to clipboard

Feature: Introduce the 'SkipQuery' operation.

Description

Acceptance Criteria

Impact

RepoDB
RepoDB copied to clipboard