elasticsearch-reindexing
Indexing just stops
Hi, I'm using ES 1.4.2 with a cluster of 2 data nodes, 6 shards, and about 1M docs in an index I'm trying to reindex. I invoke this reindexing plugin and it runs for a while, then stops. There is no output in the log at all. It does not stop at the same place each time: sometimes it manages 72,000 docs, sometimes 200k docs, but it never gets through the entire index. A subsequent GET _reindex yields { "acknowledged": true, "names": [] }
I have also noticed that if I invoke this plugin on a cluster that has shard allocation disabled, and I target an as-yet nonexistent index, ES creates the new index properly, but the reindexing plugin cannot write anything until shard allocation is enabled again (of course). After that it still fails, though: it just sits there doing nothing at all.
Hope these issues can be identified and fixed. This plugin would be very useful if I could get it working properly.
Cheers
Does reindexing work on a cluster that has shard allocation enabled if you do not change the status for shard allocation when invoking reindexing?
Could you check a debug log for elasticsearch?
No. As I said it does not work even if I don't change the allocation status. Also as I said there are zero entries in the log. It just stops.
Could you provide steps to reproduce it? I do not have any problems with reindexing...
I wish I could. All I do is start reindexing with POST from/_reindex/to
It starts. Then it stops sometime later as though it finished properly.
My documents are quite large if that makes any difference.
So I resolved this. After checking the source, I added scroll=10m&size=200 to the end of my reindex URL. The scroll time was the likely cause, as the defaults are 1000 documents and a 1 minute max scroll time.
There are some places where errors can occur that will never be logged, and nobody will be alerted to them. I also had to change a few things in order to run the tests locally.
I'll make the changes available when I get a chance, if you're interested.
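For anyone who wants to script the workaround above, here is a minimal sketch of building the reindex URL with the scroll and size parameters. The helper name reindex_url and the base URL http://localhost:9200 are assumptions for illustration; only the from/_reindex/to path and the scroll and size parameters come from this thread.

```python
from urllib.parse import quote, urlencode


def reindex_url(base_url, source_index, dest_index, scroll="10m", size=200):
    """Build the plugin's reindex URL with explicit scroll and size values.

    reindex_url is a hypothetical helper; the path layout
    <source>/_reindex/<dest> and the two parameters are taken from the thread.
    """
    query = urlencode({"scroll": scroll, "size": size})
    return "{}/{}/_reindex/{}?{}".format(
        base_url.rstrip("/"),
        quote(source_index, safe=""),
        quote(dest_index, safe=""),
        query,
    )


if __name__ == "__main__":
    # Assumed local node address; send a POST to the printed URL to start the job.
    print(reindex_url("http://localhost:9200", "from", "to"))
```

A smaller size with a longer scroll gives each batch more time before the scroll context expires, which may matter more when documents are large.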
Hello, I suffer from the same effect, but adding ?scroll=10m&size=200 didn't make it work.
Hi. I think this might be related to what we are also seeing on our production servers. For example, we found this in our Elasticsearch log.
Here is the full traceback:
[2015-12-15 14:53:25,814][ERROR][action.bulk ] [Surge] unexpected error while replicating for action [indices:data/write/bulk[s]]. shard [[hep_v2][4]].
org.codelibs.elasticsearch.reindex.exception.ReindexingException: failure in bulk execution:
[95]: index [hep_v2], type [record], id [1398802], message [MergeMappingException[Merge failed with failures {[mapper [publication_info.recid] of different type, current_type [string], merged_type [long]]}]]
at org.codelibs.elasticsearch.reindex.service.ReindexingService$ReindexingListener$2.onResponse(ReindexingService.java:214)
at org.codelibs.elasticsearch.reindex.service.ReindexingService$ReindexingListener$2.onResponse(ReindexingService.java:209)
at org.elasticsearch.action.bulk.TransportBulkAction$2.finishHim(TransportBulkAction.java:358)
at org.elasticsearch.action.bulk.TransportBulkAction$2.onResponse(TransportBulkAction.java:330)
at org.elasticsearch.action.bulk.TransportBulkAction$2.onResponse(TransportBulkAction.java:319)
at org.elasticsearch.action.support.replication.TransportReplicationAction$ReplicationPhase.doFinish(TransportReplicationAction.java:983)
at org.elasticsearch.action.support.replication.TransportReplicationAction$ReplicationPhase.doRun(TransportReplicationAction.java:836)
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryPhase.finishAndMoveToReplication(TransportReplicationAction.java:530)
at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryPhase.performOnPrimary(TransportReplicationAction.java:608)
at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryPhase$1.doRun(TransportReplicationAction.java:452)
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
But meanwhile, querying the status at _reindex/<name> simply returned:
{
"acknowledged": true,
"name": "6bedbfdd-228b-4822-bfcf-a437ff394a76",
"found": true
}
(This is with ES 2.1.0)
In our case the error was simply due to the fact that we initialized the new index with a mapping that was slightly incompatible with the previous index (which was using dynamic mapping for certain fields).
Could it be that @derjohn's error is similar?
Nevertheless, I think it would be great if the error state were not only logged but also reported in the JSON response to _reindex/<name>. That way clients could take responsibility for dealing with the failure.
Currently we have not found a way to distinguish between a very long reindexing and a failed one.
cc: @jalavik
Merge failed with failures {[mapper [publication_info.recid] of different type, current_type [string], merged_type [long]]}]]
Mapping handling became strict in ES 2.x. The "recid" property must have the same type (either string or long) in every _type.
Yep. Indeed we fixed it locally for our own specific problem. The only issue with elasticsearch-reindexing, though, was that the error was not reported via the REST interface. It is currently only logged in the Elasticsearch logs.
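A rough way to catch this kind of conflict before starting a reindex is to compare field types between the source and destination mappings. The sketch below assumes the mappings have already been fetched and parsed into dicts shaped like {type_name: {"properties": {...}}}. The function names and the sample data are hypothetical, chosen to mirror the publication_info.recid case from the log.

```python
def flatten_properties(properties, prefix=""):
    """Yield (dotted_path, field_type) pairs from a mapping 'properties' dict."""
    for name, spec in properties.items():
        path = prefix + name
        if "type" in spec:
            yield path, spec["type"]
        # Object fields carry their sub-fields under a nested 'properties' key.
        if "properties" in spec:
            for item in flatten_properties(spec["properties"], path + "."):
                yield item


def find_type_conflicts(source_mappings, dest_mappings):
    """Return sorted (type_name, field, source_type, dest_type) tuples that differ."""
    conflicts = []
    for type_name, source_mapping in source_mappings.items():
        dest_mapping = dest_mappings.get(type_name, {})
        dest_fields = dict(flatten_properties(dest_mapping.get("properties", {})))
        for path, source_type in flatten_properties(source_mapping.get("properties", {})):
            dest_type = dest_fields.get(path)
            if dest_type is not None and dest_type != source_type:
                conflicts.append((type_name, path, source_type, dest_type))
    return sorted(conflicts)


if __name__ == "__main__":
    # Illustrative data only: the old index mapped recid dynamically as long,
    # while the new index declares it as string.
    old = {"record": {"properties": {"publication_info": {"properties": {"recid": {"type": "long"}}}}}}
    new = {"record": {"properties": {"publication_info": {"properties": {"recid": {"type": "string"}}}}}}
    print(find_type_conflicts(old, new))
```

This only compares fields present in both mappings, so fields that exist on one side alone are not flagged.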
Did you run it with wait_for_completion=true?
I can't, because the REST request would time out (the index is several gigabytes in size). So what I am doing is simply polling the /_reindex/<name> handler.
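The polling approach described above might look roughly like the following sketch. It assumes that the "found" field in the /_reindex/<name> response becomes false once the job is gone, which this thread does not confirm. The function names, intervals, and the injected fetch_status callable are all hypothetical.

```python
import json
import time
from urllib.request import urlopen


def fetch_reindex_status(base_url, name):
    """GET /_reindex/<name> and return the parsed JSON body."""
    with urlopen("{}/_reindex/{}".format(base_url.rstrip("/"), name)) as resp:
        return json.load(resp)


def wait_for_reindex(fetch_status, interval=5.0, timeout=3600.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll until the job is no longer reported as found, or until timeout.

    Returns True if the job disappeared, False if the timeout was reached.
    Assumption: 'found' turning false means the job has ended.
    """
    deadline = clock() + timeout
    while True:
        status = fetch_status()
        if not status.get("found", False):
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


if __name__ == "__main__":
    # Assumed node address; the job name is the one shown earlier in this thread.
    job = "6bedbfdd-228b-4822-bfcf-a437ff394a76"
    done = wait_for_reindex(lambda: fetch_reindex_status("http://localhost:9200", job))
    print("job ended" if done else "timed out")
```

Note that the timeout is only a backstop. As reported above, a failed job kept returning "found": true, so polling alone cannot tell a failed reindex from a slow one.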