hibernate-test-case-templates
HSEARCH-4487 - JSR-352 mass indexing job fails with `@IndexedEmbedded` entity and MySQL
Not sure what's going on on CI; I suppose Ryuk requires additional configuration. Let's ignore that. I ran the reproducer locally.
First, this warning is suspicious; I'd suggest checking how you create your schema:
08:57:44,345 (main) WARN ExceptionHandlerLoggedImpl:27 - GenerationTarget encountered exception accepting command : Error executing DDL "
alter table YourAnnotatedEntity
drop
foreign key FK1rvs28x1ncle85qk03q3gyy41" via JDBC Statement
org.hibernate.tool.schema.spi.CommandAcceptanceException: Error executing DDL "
alter table YourAnnotatedEntity
drop
foreign key FK1rvs28x1ncle85qk03q3gyy41" via JDBC Statement
at org.hibernate.tool.schema.internal.exec.GenerationTargetToDatabase.accept(GenerationTargetToDatabase.java:67)
at org.hibernate.tool.schema.internal.SchemaDropperImpl.applySqlString(SchemaDropperImpl.java:387)
at org.hibernate.tool.schema.internal.SchemaDropperImpl.applySqlStrings(SchemaDropperImpl.java:371)
at org.hibernate.tool.schema.internal.SchemaDropperImpl.applyConstraintDropping(SchemaDropperImpl.java:341)
at org.hibernate.tool.schema.internal.SchemaDropperImpl.dropFromMetadata(SchemaDropperImpl.java:235)
at org.hibernate.tool.schema.internal.SchemaDropperImpl.performDrop(SchemaDropperImpl.java:156)
at org.hibernate.tool.schema.internal.SchemaDropperImpl.doDrop(SchemaDropperImpl.java:128)
at org.hibernate.tool.schema.internal.SchemaDropperImpl.doDrop(SchemaDropperImpl.java:114)
at org.hibernate.tool.schema.spi.SchemaManagementToolCoordinator.performDatabaseAction(SchemaManagementToolCoordinator.java:153)
at org.hibernate.tool.schema.spi.SchemaManagementToolCoordinator.process(SchemaManagementToolCoordinator.java:81)
at org.hibernate.internal.SessionFactoryImpl.<init>(SessionFactoryImpl.java:335)
at org.hibernate.boot.internal.SessionFactoryBuilderImpl.build(SessionFactoryBuilderImpl.java:471)
at org.hibernate.search.bugs.SearchTestBase.setUp(SearchTestBase.java:31)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
at org.junit.internal.runners.statements.RunBefores.invokeMethod(RunBefores.java:33)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
Caused by: java.sql.SQLSyntaxErrorException: Table 'test.yourannotatedentity' doesn't exist
at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:120)
at com.mysql.cj.jdbc.exceptions.SQLExceptionsMapping.translateException(SQLExceptionsMapping.java:122)
at com.mysql.cj.jdbc.StatementImpl.executeInternal(StatementImpl.java:763)
at com.mysql.cj.jdbc.StatementImpl.execute(StatementImpl.java:648)
at org.hibernate.tool.schema.internal.exec.GenerationTargetToDatabase.accept(GenerationTargetToDatabase.java:54)
... 42 more
That being said, the error doesn't seem related:
08:57:46,863 (pool-2-thread-1) ERROR SqlExceptionHelper:142 - Streaming result set com.mysql.cj.protocol.a.result.ResultsetRowsStreaming@153fac61 is still active. No statements may be issued when any streaming result sets are open and in use on a given connection. Ensure that you have called .close() on any active streaming result sets before attempting more queries.
08:57:46,864 (pool-2-thread-1) INFO DefaultLoadEventListener:116 - HHH000327: Error performing load command
org.hibernate.exception.GenericJDBCException: could not extract ResultSet
at org.hibernate.exception.internal.StandardSQLExceptionConverter.convert(StandardSQLExceptionConverter.java:42)
at org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:113)
at org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:99)
at org.hibernate.engine.jdbc.internal.ResultSetReturnImpl.extract(ResultSetReturnImpl.java:67)
at org.hibernate.loader.plan.exec.internal.AbstractLoadPlanBasedLoader.getResultSet(AbstractLoadPlanBasedLoader.java:390)
at org.hibernate.loader.plan.exec.internal.AbstractLoadPlanBasedLoader.executeQueryStatement(AbstractLoadPlanBasedLoader.java:163)
at org.hibernate.loader.plan.exec.internal.AbstractLoadPlanBasedLoader.executeLoad(AbstractLoadPlanBasedLoader.java:104)
at org.hibernate.loader.entity.plan.AbstractLoadPlanBasedEntityLoader.load(AbstractLoadPlanBasedEntityLoader.java:285)
at org.hibernate.persister.entity.AbstractEntityPersister.doLoad(AbstractEntityPersister.java:4521)
at org.hibernate.persister.entity.AbstractEntityPersister.load(AbstractEntityPersister.java:4511)
at org.hibernate.event.internal.DefaultLoadEventListener.loadFromDatasource(DefaultLoadEventListener.java:571)
at org.hibernate.event.internal.DefaultLoadEventListener.doLoad(DefaultLoadEventListener.java:539)
at org.hibernate.event.internal.DefaultLoadEventListener.load(DefaultLoadEventListener.java:208)
at org.hibernate.event.internal.DefaultLoadEventListener.proxyOrLoad(DefaultLoadEventListener.java:327)
at org.hibernate.event.internal.DefaultLoadEventListener.doOnLoad(DefaultLoadEventListener.java:108)
at org.hibernate.event.internal.DefaultLoadEventListener.onLoad(DefaultLoadEventListener.java:74)
at org.hibernate.event.service.internal.EventListenerGroupImpl.fireEventOnEachListener(EventListenerGroupImpl.java:118)
at org.hibernate.internal.SessionImpl.fireLoadNoChecks(SessionImpl.java:1226)
at org.hibernate.internal.SessionImpl.internalLoad(SessionImpl.java:1091)
at org.hibernate.type.EntityType.resolveIdentifier(EntityType.java:699)
at org.hibernate.type.EntityType.resolve(EntityType.java:464)
at org.hibernate.type.ManyToOneType.resolve(ManyToOneType.java:240)
at org.hibernate.engine.internal.TwoPhaseLoad$EntityResolver.lambda$static$0(TwoPhaseLoad.java:576)
at org.hibernate.engine.internal.TwoPhaseLoad.initializeEntityEntryLoadedState(TwoPhaseLoad.java:221)
at org.hibernate.engine.internal.TwoPhaseLoad.initializeEntity(TwoPhaseLoad.java:155)
at org.hibernate.engine.internal.TwoPhaseLoad.initializeEntity(TwoPhaseLoad.java:126)
at org.hibernate.loader.Loader.initializeEntitiesAndCollections(Loader.java:1201)
at org.hibernate.loader.Loader.initializeEntitiesAndCollections(Loader.java:1153)
at org.hibernate.loader.Loader.loadSingleRow(Loader.java:415)
at org.hibernate.internal.ScrollableResultsImpl.prepareCurrentRow(ScrollableResultsImpl.java:197)
at org.hibernate.internal.ScrollableResultsImpl.next(ScrollableResultsImpl.java:102)
at org.hibernate.search.batch.jsr352.core.massindexing.step.spi.EntityReader$ChunkState.next(EntityReader.java:446)
at org.hibernate.search.batch.jsr352.core.massindexing.step.spi.EntityReader.readItem(EntityReader.java:263)
at com.ibm.jbatch.container.artifact.proxy.ItemReaderProxy.readItem(ItemReaderProxy.java:68)
at com.ibm.jbatch.container.impl.ChunkStepControllerImpl.readItem(ChunkStepControllerImpl.java:321)
at com.ibm.jbatch.container.impl.ChunkStepControllerImpl.readAndProcess(ChunkStepControllerImpl.java:255)
at com.ibm.jbatch.container.impl.ChunkStepControllerImpl.invokeChunk(ChunkStepControllerImpl.java:622)
at com.ibm.jbatch.container.impl.ChunkStepControllerImpl.invokeCoreStep(ChunkStepControllerImpl.java:763)
at com.ibm.jbatch.container.impl.BaseStepControllerImpl.execute(BaseStepControllerImpl.java:149)
at com.ibm.jbatch.container.impl.ExecutionTransitioner.doExecutionLoop(ExecutionTransitioner.java:112)
at com.ibm.jbatch.container.impl.JobThreadRootControllerImpl.originateExecutionOnThread(JobThreadRootControllerImpl.java:110)
at com.ibm.jbatch.container.util.BatchWorkUnit.run(BatchWorkUnit.java:80)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.sql.SQLException: Streaming result set com.mysql.cj.protocol.a.result.ResultsetRowsStreaming@153fac61 is still active. No statements may be issued when any streaming result sets are open and in use on a given connection. Ensure that you have called .close() on any active streaming result sets before attempting more queries.
at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:129)
at com.mysql.cj.jdbc.exceptions.SQLExceptionsMapping.translateException(SQLExceptionsMapping.java:122)
at com.mysql.cj.jdbc.ClientPreparedStatement.executeInternal(ClientPreparedStatement.java:953)
at com.mysql.cj.jdbc.ClientPreparedStatement.executeQuery(ClientPreparedStatement.java:1009)
at org.hibernate.engine.jdbc.internal.ResultSetReturnImpl.extract(ResultSetReturnImpl.java:57)
... 43 more
It would seem a "streaming result set" was not closed properly, resulting in errors at the JDBC level.
The thing is, we're not supposed to use a "streaming result set" when loading entities. And in fact... this was caused by your configuration:
MassIndexingJob.ParametersBuilder parametersBuilder = MassIndexingJob.parameters()
.forEntities(YourAnnotatedEntity.class)
.idFetchSize(Integer.MIN_VALUE)
.entityFetchSize(Integer.MIN_VALUE);
I don't know why you're calling .entityFetchSize(Integer.MIN_VALUE) here. It has a specific meaning in MySQL, but that's definitely not what you want. You want to provide an actual, positive value here: the number of entities to fetch in each batch.
If you use an entity fetch size of 50 instead of Integer.MIN_VALUE, the problem will go away in your reproducer:
MassIndexingJob.ParametersBuilder parametersBuilder = MassIndexingJob.parameters()
.forEntities(YourAnnotatedEntity.class)
.idFetchSize(Integer.MIN_VALUE)
- .entityFetchSize(Integer.MIN_VALUE);
+ .entityFetchSize(50);
So... I'd be tempted to classify this as a db-specific usage problem, rather than a bug in Hibernate Search?
Thanks a lot for looking into this, Yoann. I was just about to properly file a JIRA issue but let me know if we can discuss it here.
org.hibernate.tool.schema.spi.CommandAcceptanceException: Error executing DDL "
alter table YourAnnotatedEntity
drop
foreign key FK1rvs28x1ncle85qk03q3gyy41" via JDBC Statement
I believe this is caused by the hibernate.hbm2ddl.auto=create-drop property, which at the beginning of the test first attempts to drop the foreign key constraint (before the tables exist). I think we can safely ignore this for now.
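For reference, a sketch of the setting in question, assuming it is set in a typical hibernate.properties file:

```properties
# create-drop: drop the schema (constraints first), recreate it on startup,
# and drop it again when the SessionFactory closes. The initial constraint
# drop fails harmlessly with the warning above when the tables don't exist yet.
hibernate.hbm2ddl.auto=create-drop
```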
I don't know why you're calling .entityFetchSize(Integer.MIN_VALUE) here. It has a specific meaning in MySQL, but that's definitely not what you want. You want to provide an actual, positive value here: the number of entities to fetch in each batch.
This is just for demonstration purposes. The idea is that one may work with some large partitions, and since entityFetchSize(Integer.MIN_VALUE) has a special meaning that tells the driver to stream the results back one row at a time, such configuration may be useful.
From what I understand of the MySQL Connector/J documentation, there's no way to fetch N entities at a time when using scrolls: you either get them all at once or stream them one row at a time.
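To make that concrete, here's a small sketch of how I understand MySQL Connector/J's fetch-size semantics. The FetchMode enum and effectiveFetchMode helper are hypothetical (they don't exist in the driver); they just encode the documented behavior: by default the driver buffers the entire result set client-side, Integer.MIN_VALUE switches to row-by-row streaming, and positive fetch sizes only take effect as server-side cursor batches when useCursorFetch=true is set on the connection URL.

```java
// Hypothetical helper summarizing MySQL Connector/J fetch-size behavior.
public class MySqlFetchSemantics {

    enum FetchMode { FETCH_ALL, STREAM_ROW_BY_ROW, CURSOR_BATCHES }

    static FetchMode effectiveFetchMode(int fetchSize, boolean useCursorFetch) {
        if (fetchSize == Integer.MIN_VALUE) {
            // Statement.setFetchSize(Integer.MIN_VALUE): stream one row at a time.
            // No other statement may run on the connection until the result set is closed,
            // which is exactly the failure seen in the mass indexing job above.
            return FetchMode.STREAM_ROW_BY_ROW;
        }
        if (useCursorFetch && fetchSize > 0) {
            // With useCursorFetch=true on the JDBC URL, positive fetch sizes use
            // server-side cursors and retrieve rows in batches of 'fetchSize'.
            return FetchMode.CURSOR_BATCHES;
        }
        // Default: the driver retrieves and buffers the whole result set client-side.
        return FetchMode.FETCH_ALL;
    }
}
```

(Whether useCursorFetch is usable in this context is a separate question; the point is that without it, positive fetch sizes are effectively ignored by the driver.)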
By the way, the issue is not observed if there's no @IndexedEmbedded used. I guess that's because the association fetch is attempted on the same connection.
Let me know if that makes sense. Thanks!
Ok, I just checked the code, and we're indeed using scrolls to retrieve entities from the database. Given the limitations of the MySQL JDBC driver (no way to retrieve rows in batches of a given size N), you indeed have no option other than a fetch size of Integer.MIN_VALUE. Which won't work, because the MySQL JDBC driver doesn't support streaming a result set and performing other queries (e.g. retrieving associations) on the same connection.
So, the current design won't work with MySQL. Or rather, it cannot scale very well, since the only way to make it work is to load all entities of a given partition at once.
That being said, one solution would be for you to set a rather small partition size, e.g. .rowsPerPartition(50). But I suspect that won't perform well either, due to the overhead involved in having many (thousands of) partitions.
Anyway, back to your reproducer... yes, we have a problem. Can you please open a JIRA?
I suspect we'll have to move to a solution more similar to the MassIndexer, where we load ids in a scroll, but entities in normal queries. That'll mean opening twice as many database connections for the same job, though, which is not great :/ I'll have to think about it.
I suspect we'll have to move to a solution more similar to the MassIndexer, where we load ids in a scroll, but entities in normal queries. That'll mean opening twice as many database connections for the same job, though, which is not great :/ I'll have to think about it.
An alternative would be to use queries that retrieve the N next entities. Something like this:
select e from MyEntity e where e.id > :idOfLastEntityOfPreviousChunk and e.id < :partitionUpperBoundId order by e.id limit 50
(with 50 being the entity fetch size)
This might perform adequately with the appropriate database indexes on the ID column(s).
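To make the idea concrete, here's a minimal in-memory simulation of that keyset-pagination query. The KeysetChunker class is hypothetical (the real implementation would run the query above against the database); it just shows how each chunk resumes strictly after the last id of the previous chunk, bounded by the partition's upper bound.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical in-memory model of the keyset query sketched above:
// ids strictly greater than the last id of the previous chunk and strictly
// less than the partition upper bound, ascending, capped at the fetch size.
public class KeysetChunker {

    // Assumes sortedIds is in ascending order, like an index scan on the id column.
    static List<Long> nextChunk(List<Long> sortedIds, long afterId, long upperBound, int fetchSize) {
        return sortedIds.stream()
                .filter(id -> id > afterId && id < upperBound) // keyset predicate
                .limit(fetchSize)                              // the "limit 50" part
                .collect(Collectors.toList());
    }
}
```

Each chunk issues an independent, short-lived query, so nothing stays open on the connection between chunks, which is what sidesteps the streaming-result-set restriction.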
Thanks Yoann!
Yes, for the time being I think we can just try with smaller partitions and see how that goes.
Thanks for the test case and the report. This was addressed in the 7.0.0.CR1 release: https://hibernate.atlassian.net/projects/HSEARCH/versions/32189/tab/release-report-all-issues. Closing the PR.