spring-batch
spring-batch copied to clipboard
PagingQueryProvider improvements to query specific page
In commit 220de61, the method generateJumpToItemQuery() was removed from PagingQueryProvider class, with the following message: "Remove un-used code for jump-to-item queries". Well, currently, Spring Cloud Data Flow invokes that method to generate the query for a specific page in the database, so they will be impacted when they upgrade to Spring Batch 5.
Anyway, what is now the expected usage to generate the query for remaining pages? I see no easy way to access to a random page (e.g. page 7). Should I call generateRemainingPagesQuery() repeteadly in streaming fashion? Note also that the Javadoc does not make that clear. In fact, the documentation for generateFirstPageQuery and generateRemainingPagesQuery is currently the same...
In Oracle, for example, the current implementation (OraclePagingQueryProvider) is based on legacy ROWNUM syntax, which is deprecated since Oracle 12c. More easily, OFFSET and FETCH clauses could be used (ANSI SQL 2016 compliant), and have a single method in PagingQueryProvider() to generate the query for the specific page. A similar solution could be applied in other databases too.
public class Oracle19PagingQueryProvider extends PagingQueryProvider {
public String generatePageQuery(int itemIndex, int pageSize) {
StringBuilder sql = new StringBuilder();
sql.append("SELECT ").append(this.getSelectClause());
sql.append(" FROM ").append(this.getFromClause());
sql.append(this.getWhereClause() == null ? "" : " WHERE " + this.getWhereClause());
buildGroupByClause(sql);
sql.append(" ORDER BY ").append(SqlPagingQueryUtils.buildSortClause(this));
sql.append(" OFFSET ").append(itemIndex);
sql.append(" ROWS FETCH NEXT ").append(pageSize).append(" ROWS ONLY");
return sql.toString();
}
private void buildGroupByClause(StringBuilder sql) {
if(StringUtils.hasText(getGroupClause())) {
sql.append(" GROUP BY ");
sql.append(getGroupClause());
}
}
}
Thank you for opening this issue.
In commit https://github.com/spring-projects/spring-batch/commit/220de61f752908957286f017007123a92e0ce4b6, the method generateJumpToItemQuery() was removed from PagingQueryProvider class, with the following message: "Remove un-used code for jump-to-item queries". Well, currently, Spring Cloud Data Flow invokes that method to generate the query for a specific page in the database, so they will be impacted when they upgrade to Spring Batch 5.
PagingQueryProvider#generateJumpToItemQuery is not called by production code in Spring Batch, so even if a client implements or overrides that method, it won't be used anyway. If the PagingQueryProvider is used as a regular Java class outside of the framework, then we can do nothing for that. I will check that point with the SCDF team.
Anyway, what is now the expected usage to generate the query for remaining pages?
The expected way to generate the query for remaining pages is to implement generateRemainingPagesQuery
I see no easy way to access to a random page (e.g. page 7). Should I call generateRemainingPagesQuery() repeteadly in streaming fashion?
Are you using that class outside of Spring Batch? PagingQueryProvider is an interface you implement to override the default behaviour of Spring Batch, but it is not intended to be used by clients directly. It seems like you are using Spring Batch as a library, not a framework. Is that the case? You are not expected to use the PagingQueryProvider to go to a specific page, it is rather Spring Batch that will use generateFirstPageQuery and generateRemainingPagesQuery to drive the reader as needed when processing the datasource.
the documentation for generateFirstPageQuery and generateRemainingPagesQuery is currently the same...
That should be fixed, thank you for reporting it. Please open a separate issue for that, I would like to keep this issue focused on the query provider enhancement.
In Oracle, for example, the current implementation (OraclePagingQueryProvider) is based on legacy ROWNUM syntax, which is deprecated since Oracle 12c. More easily, OFFSET and FETCH clauses could be used (ANSI SQL 2016 compliant), and have a single method in PagingQueryProvider() to generate the query for the specific page.
Sure! If there is a better way of to implement that method, then we welcome contributions for that.
we have used generateJumpToItemQuery() in our project and trying to find alternative solution for the removal. Could anyone suggest other options please?