
JdbcPagingItemReader strange behaviour for record containing square symbol (²) [BATCH-2356]

Open spring-projects-issues opened this issue 10 years ago • 5 comments

Driss Amri opened BATCH-2356 and commented

Our application suddenly started throwing unique key constraint violations, even though we had not changed the code for months. What we saw was that all items in our first chunk of records (our first page) were being read more than once by the JdbcPagingItemReader.

The only thing that changed was the data we were reading, which now included an unusual record containing the character ² (superscript two, the "squared" symbol). When we excluded this record, the problem disappeared completely.

We suddenly started seeing our Spring Batch application read the same records/pages twice with the JdbcPagingItemReader.

  public PagingQueryProvider queryProvider(DataSource dataSource) {
    SqlPagingQueryProviderFactoryBean factory = new SqlPagingQueryProviderFactoryBean();
    factory.setDataSource(dataSource);
    factory.setDatabaseType("SQLSERVER");
    factory.setSelectClause("SELECT projectnr");
    factory.setFromClause("FROM (SELECT DISTINCT projectnr FROM AQF_OZP) AQF_OZP");
    factory.setWhereClause("WHERE projectnr IS NOT NULL");
    factory.setSortKey("projectnr");

    try {
      return factory.getObject();
    } catch (Exception e) {
      throw new RuntimeException("Application initialization failed", e);
    }
  }

After we changed the WHERE clause so that it excluded the record with the square symbol, it worked again like it always has:

factory.setWhereClause("WHERE projectnr IS NOT NULL AND projectnr != '²21369'");

There was no exception in the reader phase. We only noticed the problem after the processor/writer ran, because our constraints were triggered by processing the same input twice.


No further details from BATCH-2356

spring-projects-issues avatar Feb 25 '15 08:02 spring-projects-issues

Michael Minella commented

What database? What encoding? The table definition would also be useful to help debug. However, my gut feeling here is that this isn't an issue with the batch code (since we really don't do anything beyond blindly reading the data) but a db/sql issue...

spring-projects-issues avatar Feb 25 '15 08:02 spring-projects-issues

Driss Amri commented

We are using Microsoft SQL Server 2008 R2. The encoding (collation) seems to be: SQL_Latin1_General_CP850_CI_AS

I can see the record and the correct name when I'm debugging (breakpoint in the doRead method), so it is being correctly read, but for some strange reason when this String is included in the result set, the reader goes crazy and reads all records more than once.

We process our records in parallel, but we have disabled that for now to troubleshoot this. The behaviour is the same for synchronous and asynchronous processing.
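
One possible explanation, which nobody in this thread has confirmed, is sketched below. As far as I understand it, JdbcPagingItemReader pages by sort key: the first query is ordered by the sort key, and each later query adds a "sortKey > last value read" condition. If the database orders the ² character one way in ORDER BY but compares it another way in that condition (for example because of collation or parameter-type differences), a page that ends on the ² record could be followed by a page that starts from the beginning again. The simulation uses two made-up comparators (ORDER_BY and PREDICATE are assumptions for illustration, not SQL Server's actual collation rules) and made-up sample values other than '²21369'. It does not touch a database or Spring Batch.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

public class KeysetPagingDemo {

    // The superscript-two character from the problematic record.
    static final String SQUARE = "\u00B2";

    // ASSUMPTION: the ordering used by ORDER BY treats the superscript like the digit 2.
    static final Comparator<String> ORDER_BY =
            Comparator.comparing(s -> s.replace('\u00B2', '2'));

    // ASSUMPTION: the ordering used by "sortKey > ?" sorts the superscript before all digits.
    static final Comparator<String> PREDICATE =
            Comparator.comparing(s -> s.replace('\u00B2', '!'));

    // Simulates sort-key paging: filter on "row > lastKey", order, take one page, repeat.
    static List<String> readAll(List<String> table, int pageSize,
                                Comparator<String> orderBy, Comparator<String> predicate) {
        List<String> read = new ArrayList<>();
        String lastKey = null;
        // Cap the page count so that inconsistent orderings cannot loop forever.
        for (int page = 0; page < 100; page++) {
            final String key = lastKey;
            List<String> rows = table.stream()
                    .filter(r -> key == null || predicate.compare(r, key) > 0)
                    .sorted(orderBy)
                    .limit(pageSize)
                    .collect(Collectors.toList());
            if (rows.isEmpty()) {
                break;
            }
            read.addAll(rows);
            lastKey = rows.get(rows.size() - 1);
        }
        return read;
    }

    public static void main(String[] args) {
        List<String> table = List.of("100", "150", SQUARE + "21369", "300", "400");
        // Same ordering for ORDER BY and the predicate: every row is read once.
        System.out.println(readAll(table, 3, ORDER_BY, ORDER_BY));
        // Different orderings: the rows of the first page are read a second time.
        System.out.println(readAll(table, 3, ORDER_BY, PREDICATE));
    }
}
```

With consistent orderings the five rows come back once each. With the mismatched orderings, "100" and "150" come back twice, which resembles the symptom described above. Whether SQL Server actually behaves this way for a varchar column with this collation would still need to be checked against the real database.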

spring-projects-issues avatar Feb 25 '15 08:02 spring-projects-issues

Michael Minella commented

Can you provide the table definition?

spring-projects-issues avatar Feb 25 '15 09:02 spring-projects-issues

Driss Amri commented

ORDINAL_POSITION	COLUMN_NAME	DATA_TYPE	CHARACTER_MAXIMUM_LENGTH
1	Projectnr	varchar	12
2	Status	smallint	<null>
3	Type	varchar	4
4	Eigenaar	varchar	3
5	Toestand	varchar	4
6	GemNR	varchar	25
7	DossierNR	varchar	25
8	SoortWater	varchar	3
9	LabelNaam	varchar	50
10	LabelX	float	<null>
11	LabelY	float	<null>
12	LabelSize	int	<null>
13	BovGem	varchar	50
14	MI_STYLE	varchar	254
15	MI_PRINX	int	<null>

spring-projects-issues avatar Mar 03 '15 03:03 spring-projects-issues

Thank you for opening the issue. Can you retry with the latest release of Spring Batch (5.0.2) and report back the results?
If the issue is reproducible, can you provide a sample project that uses the latest release of Spring Batch and that exhibits the behavior? To help you in reporting your issue, we have prepared a project template that you can use as a starting point. Please check the Issue Reporting Guidelines for more details about this.

cppwfs avatar Aug 08 '23 12:08 cppwfs