Elasticquent icon indicating copy to clipboard operation
Elasticquent copied to clipboard

Most recent commit broke bulk indexing

Open andrewtweber opened this issue 6 years ago • 5 comments

Commented here: https://github.com/elasticquent/Elasticquent/pull/148#issuecomment-515542710

The last commit which merged PR #148 broke bulk indexing completely.

I'm pretty sure this is broken. You're calling chunk on a Collection, which does not take a callback argument. Only chunk on Query Builders takes a callback.

Collection method -> only argument is $size Query Builder method -> takes both $size and $callback

Was this tested? The callback never gets entered, because it doesn't accept a callback as a parameter. At least when I've tried Model::addAllToIndex()

I also strongly disagree with implementing this in this way. If somebody wants to index all in one go, they should be able to, and damn the (lack of memory) consequences. If they want to chunk it, they can do that on their own. There's no need to chunk for them

Model::chunk(500, function (ElasticquentCollection $chunk) { $chunk->addToIndex(); });

^ that is what I was doing before, which this PR broke

andrewtweber avatar Jul 29 '19 18:07 andrewtweber

Did you find any solution for above issue? i am also stuck in this suddenly elastic search stops indexing.

riteshkhatri avatar Aug 05 '19 07:08 riteshkhatri

Opened a PR for this, and there is another PR open I just saw.

I basically replied in PR #148 , but I believe there needs to be a chunking when sending data to ES, based on my past experience, because without it, it will just end up failing on popular services like AWS once you reach a certain size of data. It seems dumb not to address this in the package and leave it to users to figure it out on their own when they end up in this situation.

specialtactics avatar Aug 05 '19 11:08 specialtactics

Did you find any solution for above issue? i am also stuck in this suddenly elastic search stops indexing.

@riteshkhatri Simplest quick solution is to specify the exact commit in your composer.json file

{
    "require": {
        "elasticquent/elasticquent": "dev-master#7f93c6c"
    }
}

andrewtweber avatar Aug 05 '19 12:08 andrewtweber

@specialtactics Thanks for the AWS explanation but it's likely that users will run out of memory too, if they try Model::addAllToIndex.

Chunking the query results solves both problems, and I think simply adding this example to the readme is better than trying to chunk it for them.

Model::chunk(500, function (ElasticquentCollection $chunk) {
    $chunk->addToIndex();
});

andrewtweber avatar Aug 05 '19 12:08 andrewtweber

@andrewtweber Although it hasn't happened for us, I can definitely appreciate that being a better solution overall. In our case, we are actually using the addAllToIndex() function of the ElasticquentTrait, so I think the first step would be to add the chunking behaviour to that, and probably reindex() too.

I would be happy to proceed with that if you guys think that's a good change.

specialtactics avatar Aug 05 '19 12:08 specialtactics