elasticsearch
semantic_text ingestion inference integration test
Created an IT for bulk ingestion using semantic_text.
The IT mixes bulk operations (index, update, upsert) on a single index and then checks the number of documents in the index. It relies heavily on randomness to get test coverage.
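For readers who want a picture of the bookkeeping such a test needs, here is a minimal in-memory sketch. It is not the code from this PR: the class `BulkOpCountSketch`, the `Op` enum, and the rule that a plain update never creates a document are all assumptions made for illustration, and no Elasticsearch API is called.

```java
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

// Hypothetical illustration only; names and rules here are assumptions, not the PR's test code.
public class BulkOpCountSketch {
    enum Op { INDEX, UPDATE, UPSERT }

    /** Applies randomly chosen operations to an in-memory id set and returns the expected doc count. */
    static int expectedDocCount(long seed, int numOps, int idSpace) {
        Random random = new Random(seed); // fixed seed keeps a failing run reproducible
        Set<Integer> ids = new HashSet<>();
        Op[] ops = Op.values();
        for (int i = 0; i < numOps; i++) {
            int id = random.nextInt(idSpace);
            Op op = ops[random.nextInt(ops.length)];
            if (op == Op.INDEX || op == Op.UPSERT) {
                ids.add(id); // assumed: both create the document when it is absent
            }
            // assumed: a plain UPDATE only touches an existing document, so the count is unchanged
        }
        return ids.size();
    }

    public static void main(String[] args) {
        System.out.println("expected docs: " + expectedDocCount(42L, 100, 20));
    }
}
```

The expected count computed this way would then be compared against the document count reported by the index after the bulk request completes.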
@elasticmachine run elasticsearch-ci/part-1
Pinging @elastic/es-search (Team:Search)
Have you tested this multiple times locally? I ran it about 100 times and got a couple of failures.
org.elasticsearch.xpack.inference.action.filter.ShardBulkInferenceActionFilterIT > testBulkOperations {seed=[B370AC9CED1A493:A83B5D3F6BB59398]} FAILED
java.lang.AssertionError: Failed to index document 240: org.elasticsearch.index.mapper.DocumentParsingException: [14:1] failed to parse field [dense_field] of type [semantic_text] in document with id '240'. Preview of field's value: 'null'
at __randomizedtesting.SeedInfo.seed([B370AC9CED1A493:A83B5D3F6BB59398]:0)
at org.elasticsearch.test.ESTestCase.fail(ESTestCase.java:2175)
at org.elasticsearch.xpack.inference.action.filter.ShardBulkInferenceActionFilterIT.testBulkOperations(ShardBulkInferenceActionFilterIT.java:112)
Caused by:
org.elasticsearch.index.mapper.DocumentParsingException: [14:1] failed to parse field [dense_field] of type [semantic_text] in document with id '240'. Preview of field's value: 'null'
at app//org.elasticsearch.index.mapper.FieldMapper.rethrowAsDocumentParsingException(FieldMapper.java:233)
at app//org.elasticsearch.index.mapper.FieldMapper.parse(FieldMapper.java:186)
at app//org.elasticsearch.index.mapper.DocumentParser.parseObjectOrField(DocumentParser.java:417)
at app//org.elasticsearch.index.mapper.DocumentParser.doParseObject(DocumentParser.java:483)
at app//org.elasticsearch.index.mapper.DocumentParser.parseObject(DocumentParser.java:471)
at app//org.elasticsearch.index.mapper.DocumentParser.innerParseObject(DocumentParser.java:338)
at app//org.elasticsearch.index.mapper.DocumentParser.parseObjectOrNested(DocumentParser.java:299)
at app//org.elasticsearch.index.mapper.DocumentParser.internalParseDocument(DocumentParser.java:139)
at app//org.elasticsearch.index.mapper.DocumentParser.parseDocument(DocumentParser.java:86)
at app//org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:92)
at app//org.elasticsearch.index.shard.IndexShard.prepareIndex(IndexShard.java:1038)
at app//org.elasticsearch.index.shard.IndexShard.applyIndexOperation(IndexShard.java:979)
at app//org.elasticsearch.index.shard.IndexShard.applyIndexOperationOnPrimary(IndexShard.java:923)
at app//org.elasticsearch.action.bulk.TransportShardBulkAction.executeBulkItemRequest(TransportShardBulkAction.java:374)
at app//org.elasticsearch.action.bulk.TransportShardBulkAction$2.doRun(TransportShardBulkAction.java:230)
at app//org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26)
at app//org.elasticsearch.action.bulk.TransportShardBulkAction.performOnPrimary(TransportShardBulkAction.java:300)
at app//org.elasticsearch.action.bulk.TransportShardBulkAction.dispatchedShardOperationOnPrimary(TransportShardBulkAction.java:151)
at app//org.elasticsearch.action.bulk.TransportShardBulkAction.dispatchedShardOperationOnPrimary(TransportShardBulkAction.java:79)
at app//org.elasticsearch.action.support.replication.TransportWriteAction$1.doRun(TransportWriteAction.java:217)
at app//org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26)
at app//org.elasticsearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:33)
at app//org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:984)
at app//org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26)
at java.base@21/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
at java.base@21/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
at java.base@21/java.lang.Thread.run(Thread.java:1583)
Caused by:
java.lang.IllegalArgumentException: The [cosine] similarity does not support vectors with zero magnitude. Preview of invalid vector: [0.0]
at org.elasticsearch.index.mapper.vectors.DenseVectorFieldMapper$ElementType$2.checkVectorMagnitude(DenseVectorFieldMapper.java:569)
at org.elasticsearch.index.mapper.vectors.DenseVectorFieldMapper$ElementType$2.parseKnnVectorAndIndex(DenseVectorFieldMapper.java:593)
at org.elasticsearch.index.mapper.vectors.DenseVectorFieldMapper.parseKnnVectorAndIndex(DenseVectorFieldMapper.java:1463)
at org.elasticsearch.index.mapper.vectors.DenseVectorFieldMapper.parse(DenseVectorFieldMapper.java:1456)
at org.elasticsearch.xpack.inference.mapper.SemanticTextFieldMapper.parseCreateField(SemanticTextFieldMapper.java:245)
at org.elasticsearch.index.mapper.FieldMapper.parse(FieldMapper.java:184)
... 25 more
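The innermost cause above is that cosine similarity is undefined for a vector of magnitude zero, so a generated embedding of [0.0] is rejected. How the test produces its embeddings is not shown in this thread, so the following is only a sketch under that caveat: `NonZeroVectorSketch`, `magnitude`, and `randomNonZeroVector` are hypothetical names, and this is one possible guard rather than the fix applied in the PR.

```java
import java.util.Random;

// Hypothetical illustration only; not Elasticsearch code and not the PR's fix.
public class NonZeroVectorSketch {
    /** Euclidean norm of the vector; cosine similarity divides by this value. */
    static double magnitude(float[] v) {
        double sum = 0.0;
        for (float x : v) {
            sum += (double) x * x;
        }
        return Math.sqrt(sum);
    }

    /** Generates a random vector and forces one non-zero component if every component came out as zero. */
    static float[] randomNonZeroVector(Random random, int dims) {
        float[] v = new float[dims];
        for (int i = 0; i < dims; i++) {
            v[i] = random.nextFloat();
        }
        if (magnitude(v) == 0.0) {
            v[0] = 1.0f; // guard: avoid the zero-magnitude case rejected for cosine similarity
        }
        return v;
    }

    public static void main(String[] args) {
        float[] v = randomNonZeroVector(new Random(), 1);
        System.out.println("magnitude: " + magnitude(v));
    }
}
```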
> Have you tested this multiple times locally? I ran it about 100 times and got a couple of failures.
I did, using the @Repeat annotation, and didn't catch those. I was able to reproduce as well :( Thanks!