ml-commons
[BUG] mapper_parsing_exception: failed to parse field [embeddingVector] of type [knn_vector] in document with id 'xxx'. Preview of field's value: 'NaN'
What is the bug? A clear and concise description of the bug.
In OpenSearch 2.12.0:
When I use the Bulk operation of the Java client with IndexOperation to create or update documents, I get "Preview of field's value: 'NaN'" exceptions when the vectors are calculated on GPU nodes. The errors still occur when writing from a single thread. However, writing the failed document again succeeds.
When the cluster uses CPU to calculate the vectors, the problem does not occur. I therefore suspect that vector calculation on the GPU is unstable, but I cannot confirm this.
Here is my Java code and detailed exception information:
Java Code:
private void sendOpenSearch(List<EntityDoc> docList) {
    try {
        // Build one index operation per document, keyed by the document id.
        List<BulkOperation> operationList = new ArrayList<>(docList.size());
        for (EntityDoc doc : docList) {
            BulkOperation operation = new BulkOperation.Builder()
                    .index(new IndexOperation.Builder<>()
                            .index(opensearchProperty.getRefreshIndex())
                            .id(doc.getId())
                            .document(doc)
                            .build())
                    .build();
            operationList.add(operation);
        }
        BulkRequest bulkRequest = new BulkRequest.Builder()
                .index(opensearchProperty.getRefreshIndex())
                .operations(operationList)
                .build();
        BulkResponse response = openSearchClient.bulk(bulkRequest);
        // A bulk request can partially fail, so inspect each item.
        if (response.errors()) {
            response.items().forEach(item -> {
                if (null != item.error() && null != item.error().causedBy()) {
                    // One placeholder per argument; the original format string dropped the reason.
                    log.error("Bulk item failed, id: {}, reason: {}", item.id(), item.error().causedBy().reason());
                }
            });
        }
    } catch (IOException e) {
        log.error("OpenSearch IO Exception", e);
    }
}
Exception:
2024-04-01 09:31:51,454 [org.springframework.amqp.rabbit.RabbitListenerEndpointContainer#3-2] [] ERROR c.c.c.t.a.c.OpenSearchTalentConsumer - Failed to save data to OpenSearch
org.opensearch.client.opensearch._types.OpenSearchException: Request failed: [mapper_parsing_exception] failed to parse field [embeddingVector] of type [knn_vector] in document with id 'xxx'. Preview of field's value: 'NaN'
at org.opensearch.client.transport.rest_client.RestClientTransport.getHighLevelResponse(RestClientTransport.java:270)
at org.opensearch.client.transport.rest_client.RestClientTransport.performRequest(RestClientTransport.java:143)
at org.opensearch.client.opensearch.OpenSearchClient.update(OpenSearchClient.java:1578)
at com.ci.application.consumer.OpenSearchTalentConsumer.reSendOpensearch(OpenSearchTalentConsumer.java:97)
at com.ci.application.consumer.OpenSearchTalentConsumer.sendOpensearch(OpenSearchTalentConsumer.java:85)
at com.ci.application.consumer.OpenSearchTalentConsumer.consume(OpenSearchTalentConsumer.java:55)
at jdk.internal.reflect.GeneratedMethodAccessor1421.invoke(Unknown Source)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at org.springframework.messaging.handler.invocation.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:171)
at org.springframework.messaging.handler.invocation.InvocableHandlerMethod.invoke(InvocableHandlerMethod.java:120)
at org.springframework.amqp.rabbit.listener.adapter.HandlerAdapter.invoke(HandlerAdapter.java:49)
at org.springframework.amqp.rabbit.listener.adapter.MessagingMessageListenerAdapter.invokeHandler(MessagingMessageListenerAdapter.java:190)
at org.springframework.amqp.rabbit.listener.adapter.MessagingMessageListenerAdapter.onMessage(MessagingMessageListenerAdapter.java:127)
at org.springframework.amqp.rabbit.listener.AbstractMessageListenerContainer.doInvokeListener(AbstractMessageListenerContainer.java:1552)
at org.springframework.amqp.rabbit.listener.AbstractMessageListenerContainer.actualInvokeListener(AbstractMessageListenerContainer.java:1478)
at jdk.internal.reflect.GeneratedMethodAccessor949.invoke(Unknown Source)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:343)
at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:198)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
at org.springframework.retry.interceptor.RetryOperationsInterceptor$1.doWithRetry(RetryOperationsInterceptor.java:91)
at org.springframework.retry.support.RetryTemplate.doExecute(RetryTemplate.java:287)
at org.springframework.retry.support.RetryTemplate.execute(RetryTemplate.java:180)
at org.springframework.retry.interceptor.RetryOperationsInterceptor.invoke(RetryOperationsInterceptor.java:115)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:212)
at org.springframework.amqp.rabbit.listener.$Proxy354.invokeListener(Unknown Source)
at org.springframework.amqp.rabbit.listener.AbstractMessageListenerContainer.invokeListener(AbstractMessageListenerContainer.java:1466)
at org.springframework.amqp.rabbit.listener.AbstractMessageListenerContainer.doExecuteListener(AbstractMessageListenerContainer.java:1461)
at org.springframework.amqp.rabbit.listener.AbstractMessageListenerContainer.executeListener(AbstractMessageListenerContainer.java:1410)
at org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer.doReceiveAndExecute(SimpleMessageListenerContainer.java:870)
at org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer.receiveAndExecute(SimpleMessageListenerContainer.java:854)
at org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer.access$1600(SimpleMessageListenerContainer.java:78)
at org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer$AsyncMessageProcessingConsumer.mainLoop(SimpleMessageListenerContainer.java:1137)
at org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer$AsyncMessageProcessingConsumer.run(SimpleMessageListenerContainer.java:1043)
at java.base/java.lang.Thread.run(Thread.java:829)
How can one reproduce the bug? Steps to reproduce the behavior:
- Go to '...'
- Click on '....'
- Scroll down to '....'
- See error
What is the expected behavior? A clear and concise description of what you expected to happen.
What is your host/environment?
- OS: Linux CentOS 7.9
- Version: 2.12.0
- Plugins: ml_commons
Do you have any screenshots? If applicable, add screenshots to help explain your problem.
Do you have any additional context? Add any other context about the problem.
Are you using any ml-commons feature to generate this embedding? Can you give more details on how to reproduce this issue?
If you aren't using any models through ml-commons, maybe we can move this issue to the k-NN plugin?
I used my custom model.
What other information do I need to provide?
failed to parse field [embeddingVector] of type [knn_vector] in document with id 'xxx'. Preview of field's value: 'NaN'
From the error, it looks like you are trying to save 'NaN' to the knn_vector field?
No, my embeddingContent actually contains data, not NaN. However, the vector value I got back was NaN, which caused the indexing error. Without any other modifications, after I increased the number of client retries, the same data could be saved normally.