elasticsearch icon indicating copy to clipboard operation
elasticsearch copied to clipboard

Using reindex api with slices in creating child documents have different routing id from parent id

Open shribigb opened this issue 7 years ago • 8 comments

Elasticsearch version (bin/elasticsearch --version): 6.1.3

JVM version (java -version): 1.8-56

OS version (uname -a if on a Unix-like system): Ubuntu 16 LTS

Description of the problem including expected versus actual behavior:

When creating child documents using reindex api with slicing and painless scripting, I noticed that value for routing id was different from the parent id. When I re-tried without slicing I noticed expected behavior. To isolate the issue, I created single node single shard index.

Steps to reproduce:

  1. I first indexed all the parent docuements. I made following call using reindex API
POST _reindex?slices=9&requests_per_second=-1&wait_for_completion=true
{
  "source": {
     "index": "index-cutover-v2",
     "type": "files",
     "query": {"terms": {"filesets": [31712]}}
  },  
  "script": {
    "source": "if (ctx._type == 'files') {ctx._source.file_flag_join = params.file_flag_join; ctx._id=ctx._source.id+'file'; ctx._type='doc' } else {ctx.op = 'noop'}",
    "lang": "painless",
    "params": {
      "file_flag_join" : {
        "name": "file"
      }
    }
  },
  "dest": {
    "index": "index-cutover-v5"
  }
}
  1. Once the reindexing was done I used _refresh API to make data available for searching.
  2. After step 2, I indexed child documents using following api call
POST _reindex?slices=9&requests_per_second=-1&wait_for_completion=true
{
  "source": {
     "index": "index-cutover-v2",
     "type": "flags",
     "query": {"terms": {"filesets": [31712]}}
  },  
  "script": {
    "source": "if (ctx._type == 'flags') {ctx._type='doc';ctx._id=ctx._source.id+'flag';ctx._routing=ctx._source.file+'file';ctx._source.file_flag_join = params.file_flag_join;ctx._source.file_flag_join.parent = ctx._routing;} else {ctx.op = 'noop'}",
    "lang": "painless",
    "params": {
      "file_flag_join" : {
        "name": "flag"
      }
    }
  },
  "dest": {
    "index": "index-cutover-v5"
  }
}
"
  1. After step 3 was finished I reran _refresh API.
  2. I tried to pull single flag document and I received following.
{
	"_index": "index-cutover-v5",
	"_type": "doc",
	"_id": "1186231450flag",
	"_score": 2,
	"_routing": "110486171file",
	"_source": {
		"severity": 4,
		"reason": "",
		"file_flag_join": {
			"parent": "110486274file",
			"name": "flag"
		},
		"file": 110486171,
		"offset": 10111689,
		"filesets": [
			29825,
			31712
		],
		"id": 1186231450
	}
}

If you notice, _routing and file_flag_join.parent have different values.

If I remove slicing and try again I dont see this behavior and I see consistency in the routing and parent values.

shribigb avatar Jan 31 '18 20:01 shribigb

The issue is caused by the inner map file_flag_join in the params of the script. If this inner map is copied in the context document and then modified to add new fields, this also modifies the original params. It works when slicing is off because you always override the new field with its new value but when multiple slices execute the same script they modify this inner map concurrently which leads to a race condition. The solution is to clone the map in the source like this: ctx._source.file_flag_join = new HashMap(params.file_flag_join)

We could force a deep copy of the params before execution or make the inner map immutable to avoid this trappy behavior, @nik9000 WDYT ?

jimczi avatar Feb 01 '18 11:02 jimczi

I think it'd be breaking to make those maps immutable but it is pretty tempting. It feels cleaner than making a copy every time though.

nik9000 avatar Feb 01 '18 21:02 nik9000

@nik9000 what should we do here? this feels like serious bug?

bleskes avatar May 22 '18 09:05 bleskes

Either of the things I suggested should work to be honest. It might be better to make a script context for reindex and see if that solves the problem. It might. And if it doesn't we can build on it from there.

nik9000 avatar May 23 '18 16:05 nik9000

Pinging @elastic/es-distributed

elasticmachine avatar Apr 12 '19 07:04 elasticmachine

@henningandersen and I discussed this issue during a triage session and we think this issue should be moved to the script label as the issue seems to be related to how the script is set up using a non thread-safe data structure in the script (maybe this should be forbidden?)

fcofdez avatar Aug 22 '22 09:08 fcofdez

Pinging @elastic/es-core-infra (Team:Core/Infra)

elasticsearchmachine avatar Aug 22 '22 09:08 elasticsearchmachine

This is an on-going issue with params, we should provide a way for users to access params in a safe way. Since we have metadata() and field(<path>) as replacements for ctx, perhaps a fields-like params API would be worthwhile.

stu-elastic avatar Sep 21 '22 15:09 stu-elastic