Performance issues when indexing records that contain data from relationships
Summary
When searchableArray contains data from a relationship, the import will perform an extra query for each record in order to get it.
Scenario
A project belongsTo a customer. In search results, I want to display the Customer name under the Project name. The record might look like this:
{
id: 1001, // $this->id
name: 'Test Project', // $this->name
customer: {
'id': 2002, // $this->customer->id
'name': 'ABC Company', // $this->customer->name
}
}
Problem
When running scout:import, it would perform N+1 queries to get the Projects and then load the Customer for each. It was very slow and would not work on large data sets.
In reality, I have multiple indexes where the records contain data from 3 different relationships, so the problem is magnified.
My Solution
The solution that I ended up at was to create a custom console command for importing records. It would chunk() through the records in batches of 500, ->load() the relationships, then I used the Algolia PHP client library directly (not Scout) to push the records to Algolia.
One note: I'm aware of but don't use protected $with = ['customer']; on my Project model. I don't always want to load the Customer whenever I have a Project instance.
When I explained this to @nunomaduro he requested I open an issue here.
Thank you so much for reporting this issue. I will study this case soon as possible. Meanwhile I do have a question: Why have you said that would not work on large data sets.?
IIRC it was timing out because of the number of queries. I may have been able to just increase the timeout but I still didn't want to send that many queries or have it run for that long. With the eager/lazy loads, it's very fast.
Self note: Make sure I contact @fitztrev on discord.