
Successful Bulk API call produces invalid response

Open coding-bunny opened this issue 7 months ago • 0 comments

NEST/Elasticsearch.Net version: 7.17.5

Elasticsearch version: 8.10.4

.NET runtime version: .NET 5

Operating system version: Windows 11 / Linux

Description of the problem including expected versus actual behavior: We use the Bulk API with update actions to upsert documents: create them if they do not exist, update them if they do. Making the call manually, for example with Postman, works:

POST {{url}}/{{index}}/_bulk
{"update":{"_id":"hu-person-511D0930","routing":"hu-company-ceg0502000478"}}
{"doc_as_upsert":true,"doc":{"companyUniqueId":"hu-company-ceg0502000478","personUniqueId":"hu-person-511D0930","name":"Balázs Porgányi","roles":[{"code":"EU_ESCO_cs_5141-1-1"}],"departments":[],"email":{"sources":[{"code":"hu-ceginfo","lastCheckedAt":"2023-04-11T18:27:00+00:00"}]},"joinType":{"name":"person","parent":"hu-company-ceg0502000478"}}}

The request above returns a valid 200 response and the data is updated.

Steps to reproduce: The following code, however, does not work:

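// One update action per document; Doc(d).DocAsUpsert() sends the document
// as a partial update and creates it if it does not exist yet.
var response = await _client.BulkAsync(bulkDescriptor => bulkDescriptor
        .Index(Indices.Index(_options.CurrentIndexName(_code)))
        .UpdateMany(_documents, (bu, d) => bu.Doc(d).DocAsUpsert()),
    cancellationToken).ConfigureAwait(false);

// Errors is true when at least one item in the bulk response failed.
if (response.Errors)
{
    throw new PusherException("Batch operation failed", response.ItemsWithErrors);
}

_documents.Clear();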

Expected behavior: I expect the client to behave the same as calling the API endpoint directly.

Provide DebugInformation (if relevant):

Successful (200) low level call on POST: /hu-search-1.0.0/_bulk?pretty=true&error_trace=true
# Audit trail of this API call:
 - [1] HealthyResponse: Node: https://7a2df7e122f14ead8216be4220ec3aa1.westeurope.azure.elastic-cloud.com/ Took: 00:00:00.0407435
# Request:
{"update":{"_id":"hu-eshop-8790","routing":"hu-company-ceg1309215125"}}
{"doc_as_upsert":true,"doc":{"companyUniqueId":"hu-company-ceg1309215125","eShopUniqueId":"hu-eshop-8790","url":"http://fenyek.hu","name":"fenyek.hu","establishedAt":"0001-01-01T00:00:00Z","eShopMetrics":{"trafficEstimate":15.0,"countryRank":538461,"paidKeywords":0},"technologies":["eshop-tech","unas-117o"],"paymentOptions":[],"deliveryOptions":["delivery-provider","gls-3699"],"sourceCodes":[],"joinType":{"name":"eshop","parent":"hu-company-ceg1309215125"}}}

# Response:
{
  "errors" : true,
  "took" : 5,
  "items" : [
    {
      "update" : {
        "_index" : "hu-search-1.0.0",
        "_type" : "_doc",
        "_id" : "hu-eshop-8790",
        "status" : 400,
        "error" : {
          "type" : "document_parsing_exception",
          "reason" : "[1:432] failed to parse field [joinType] of type [text] in document with id 'hu-eshop-8790'. Preview of field's value: '{parent=hu-company-ceg1309215125, name=eshop}'",
          "caused_by" : {
            "type" : "illegal_state_exception",
            "reason" : "Can't get text on a START_OBJECT at 1:381"
          }
        }
      }
    }
  ]
}

# TCP states:
  TimeWait: 55
  Established: 147
  CloseWait: 8

# ThreadPool statistics:
  Worker: 
    Busy: 0
    Free: 32767
    Min: 32
    Max: 32767
  IOCP: 
    Busy: 1
    Free: 999
    Min: 32
    Max: 1000

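The debug output above shows that the client does receive an HTTP 200 response. However, the response body has "errors" : true and the single update item is rejected with status 400 (document_parsing_exception on the joinType field), so the update is not applied. It seems that something goes wrong inside the client that prevents our update process from working correctly.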
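For reading output like the above: the Bulk API answers with HTTP 200 whenever the request as a whole was processed, and reports per-item failures in the body through the "errors" flag and each item's "status" and "error" fields. The following is only a minimal sketch of pulling those item-level failures out of a raw bulk response body with System.Text.Json. The class and method names (BulkResponseInspector, FailedItems) are made up for illustration and are not part of NEST or Elasticsearch.Net.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical helper, not part of NEST: lists the failed items of a raw _bulk response body.
public static class BulkResponseInspector
{
    // Returns one line per failed item: "<operation> <id>: HTTP <status> <type>: <reason>".
    public static List<string> FailedItems(string bulkResponseJson)
    {
        var failures = new List<string>();
        using var doc = JsonDocument.Parse(bulkResponseJson);
        var root = doc.RootElement;

        // "errors" is false when every item succeeded, so there is nothing to report.
        if (!root.GetProperty("errors").GetBoolean())
        {
            return failures;
        }

        foreach (var item in root.GetProperty("items").EnumerateArray())
        {
            // Each item is an object with a single key naming the operation (index, update, ...).
            foreach (var op in item.EnumerateObject())
            {
                // Successful items carry no "error" object.
                if (!op.Value.TryGetProperty("error", out var error))
                {
                    continue;
                }

                var id = op.Value.GetProperty("_id").GetString();
                var status = op.Value.GetProperty("status").GetInt32();
                var type = error.GetProperty("type").GetString();
                var reason = error.GetProperty("reason").GetString();
                failures.Add($"{op.Name} {id}: HTTP {status} {type}: {reason}");
            }
        }

        return failures;
    }
}
```

With the NEST client the same per-item details should be reachable without parsing JSON by hand, through response.ItemsWithErrors (already used in the code above) and each item's Id, Status and Error properties.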

The connection is established like this:

var connectionSettings = new ConnectionSettings(elasticSearchUrl)
    .BasicAuthentication(options.Username, options.Password)
    // Required for compatibility between the v7 client and the v8 cluster.
    .EnableApiVersioningHeader()
    .EnableDebugMode();

coding-bunny avatar Nov 20 '23 09:11 coding-bunny