scout-extended icon indicating copy to clipboard operation
scout-extended copied to clipboard

Performance issues when indexing records that contain data from relationships

Open fitztrev opened this issue 7 years ago • 4 comments

Summary

When searchableArray contains data from a relationship, the import will perform an extra query for each record in order to get it.

Scenario

A project belongsTo a customer. In search results, I want to display the Customer name under the Project name. The record might look like this:

{
    id: 1001,                    // $this->id
    name: 'Test Project',        // $this->name
    customer: {
        'id': 2002,              // $this->customer->id
        'name': 'ABC Company',   // $this->customer->name
    }
}

Problem

When running scout:import, it would perform N+1 queries to get the Projects and then load the Customer for each. It was very slow and would not work on large data sets.

In reality, I have multiple indexes where the records contain data from 3 different relationships, so the problem is magnified.

My Solution

The solution that I ended up at was to create a custom console command for importing records. It would chunk() through the records in batches of 500, ->load() the relationships, then I used the Algolia PHP client library directly (not Scout) to push the records to Algolia.

One note: I'm aware of but don't use protected $with = ['customer']; on my Project model. I don't always want to load the Customer whenever I have a Project instance.

When I explained this to @nunomaduro he requested I open an issue here.

fitztrev avatar Nov 30 '18 14:11 fitztrev

Thank you so much for reporting this issue. I will study this case soon as possible. Meanwhile I do have a question: Why have you said that would not work on large data sets.?

nunomaduro avatar Nov 30 '18 15:11 nunomaduro

IIRC it was timing out because of the number of queries. I may have been able to just increase the timeout but I still didn't want to send that many queries or have it run for that long. With the eager/lazy loads, it's very fast.

fitztrev avatar Nov 30 '18 15:11 fitztrev

Self note: Make sure I contact @fitztrev on discord.

nunomaduro avatar Dec 10 '18 15:12 nunomaduro