elasticsearch-net icon indicating copy to clipboard operation
elasticsearch-net copied to clipboard

Bulk Update serializer does not serialize fields with null value

Open Henr1k80 opened this issue 5 years ago • 7 comments

NEST/Elasticsearch.Net version: 6.4.0 Elasticsearch version: 6.4.3 Description of the problem including expected versus actual behavior: Null value object and string fields are not serialized in the bulk update request, making it impossible to remove a previously set value, either string or object in both source field and index.

I have intercepted the request in Fiddler and seen that the field was left out.

I modified the request payload, including the field and setting the value to null and the field was of course set to null in source and the previously indexed value removed.

Example: A product exists in elastic eg.:

{
name: "Cap",
tag: "New",
campaign:{ text: "Halloween"},
stock: 15
}

The system for maintaining text is another than the one for stock, so partial updates are used for text and stock updates. An editor removes the New-tag and the campaign. If this is sent to elastic directly as a partial update, it works:

{
name: "Cap",
tag: null,
campaign: null
}

But if the POCO equivalent is sent in NEST Bulk UpdateMany, the null string and the null object is not serialized, resulting in this partial update that does not remove the tag or campaign.

{
name: "Cap"
}

Steps to reproduce:

  1. Create a partial model with some string and object fields
  2. Create a document with the values set
  3. Try to unset the values with a Bulk UpdateMany partial update.

The serializer seems to be the same for .Doc and .Upsert, so using .DocAsUpsert makes no difference.

Henr1k80 avatar Nov 13 '18 18:11 Henr1k80

I guess it makes sense to not send null value fields when creating the document, as it saves some space in the source field. But null value fields must be sent when updating as it is not given whether the field in the index contains a value or not.

Henr1k80 avatar Nov 14 '18 07:11 Henr1k80

I have tried to use the LowLevel client with PostData.MultiJson and that serializer has the same behaviour, I guess that is whats being used underneath:

            IEnumerable<object> operations = GetOperations();
            PostData postData = PostData.MultiJson(operations);
            BulkResponse bulkResponse = await ElasticClient.LowLevel.BulkAsync<BulkResponse>(postData);

Henr1k80 avatar Nov 14 '18 11:11 Henr1k80

Maybe some default/nullvalue attributes could be taking in to cosideration when making a fix for this. https://github.com/elastic/elasticsearch-net/issues/3336 They could be used to get the current broken behaivour if there is no need for partial updates.

Henr1k80 avatar Nov 26 '18 16:11 Henr1k80

Can you acknowledge this is an issue? Are you looking in to fixing it in the new serializer? If not, I suppose you should add it to a list of known bugs or "features" and how to handle it. It is annoying to discover this late in a project, have to fix this by serializing manually and discovering that the "easy" manual serialization is not so easy because the elastic serializer has some "features" in regards to standard JsonNet serialization in name casing.

Henr1k80 avatar Jan 15 '19 08:01 Henr1k80

Hi @Henr1k80 appologies for taking forever to respond to this ticket. Definitely fell through the cracks on our end.

The recommended way to deal with null updates is to either have a separate POCO for it with a different nullvalue handling. Or use painless scripts to set these to null.

Supporting different contexts for the same POCO is not something were likely to support as it makes pluggin a different serializer near enough impossible. Which ties into the same reason I was not keen on your suggestion in #3336.

Mpdreamz avatar Jan 15 '19 08:01 Mpdreamz

Hi @Mpdreamz, I think that recommendation should be added as a note here: https://www.elastic.co/guide/en/elasticsearch/client/net-api/current/nest-getting-started.html#_indexing_2 I suspect most people can unset a value by simply setting it as null.

I know that fixing the inconsistent default value handling is a breaking change. I only suggested it to make it possible to keep the inconsistent default value handling for backward compatibility if you should choose to introduce consistent default value handling.

Henr1k80 avatar Jan 15 '19 08:01 Henr1k80

@Mpdreamz then what is the intended way of type safe updating multiple fields to null without reindexing the whole document? I know i script can do that. But I can not write typesafe scripts. If I change my document and forget about that script, my code fails silently. That could not be the intend for such a simple thing like update a field to null.

lanwin avatar Sep 17 '19 09:09 lanwin

@stevejgordon why is this closed as completed?

lanwin avatar Dec 15 '22 08:12 lanwin

@lanwin I should have selected "Closed as not planned". 6.x is unsupported, and for 7.x, we don't intend to change the behaviour.

stevejgordon avatar Dec 15 '22 08:12 stevejgordon

@stevejgordon can you place say something why this behavior is not a problem for the ES team? The current behavior is mostly unexpected for me and I am sure for most others C# devs too. I understand why its not possible to change the default (cause it would be a braking change) but it would be possible to add a property or method to change that behavior.

lanwin avatar Jan 26 '23 17:01 lanwin

@lanwin This issue relates to 6.x, which is out of support. For 7.x it's possible to switch to the Json.NET source serializer or a custom serializer and control the source serialization more explicitly, with attributes for null handling for example. That's the correct solution there.

stevejgordon avatar Jan 26 '23 20:01 stevejgordon