solr-power
Support for indexing serialized arrays
Many plugins store data (e.g. repeatable text boxes) as serialized arrays in the postmeta table. Example:
a:4:{i:0;a:1:{s:11:"description";s:298:"Lorem ipsum dolor sit amet,...
Indexing those fields can lead to unwanted search results, because the serialization control characters and array keys are indexed along with the actual content. One way to solve this would be to use the "solr_build_document" filter to replace the fields in the Solr document before it is sent to the server.
The solution I propose would make this easier by automatically "flattening" serialized arrays. The related pull request can be found under #346.
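To illustrate what "flattening" could mean here, below is a minimal sketch in plain PHP. It is not the code from the pull request; the function name `flatten_serialized_meta` is hypothetical, and the choice to keep only string and numeric leaf values (dropping array keys) is an assumption about the desired behavior.

```php
<?php
/**
 * Hypothetical helper, not taken from the pull request.
 * Turns a PHP-serialized array into a space-separated string of its
 * leaf values; anything else is returned unchanged.
 */
function flatten_serialized_meta( $value ) {
    // Only strings can hold PHP-serialized data.
    if ( ! is_string( $value ) ) {
        return $value;
    }

    // Do not instantiate objects found in stored meta values.
    // The @ suppresses the notice/warning PHP emits for non-serialized input.
    $data = @unserialize( trim( $value ), array( 'allowed_classes' => false ) );
    if ( ! is_array( $data ) ) {
        return $value;
    }

    // Collect leaf values only; array keys never reach the callback's first argument.
    $parts = array();
    array_walk_recursive(
        $data,
        function ( $leaf ) use ( &$parts ) {
            // Assumption: only non-empty strings and numbers are worth indexing.
            if ( ( is_string( $leaf ) || is_int( $leaf ) || is_float( $leaf ) ) && '' !== (string) $leaf ) {
                $parts[] = (string) $leaf;
            }
        }
    );

    return implode( ' ', $parts );
}

// Usage with made-up data shaped like the example above.
$raw = serialize(
    array(
        array( 'description' => 'Lorem ipsum' ),
        array( 'description' => 'dolor sit amet' ),
    )
);
echo flatten_serialized_meta( $raw ), PHP_EOL; // prints "Lorem ipsum dolor sit amet"
```

A helper like this could presumably be called from a "solr_build_document" filter callback for each meta field, but how it would be wired into the plugin is left open here.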
What unexpected, undesirable edge cases could we encounter with this approach?
Also, it'd be worthwhile to see how the other search plugins handle this (ElasticPress and others).
The original content is no longer indexed in the "*_s" field, so anyone who used the content from the Solr results for display purposes and relied on the exact content of that field would find that this no longer works. The original content is, however, still stored in the "*_str" field, so that use case would still be possible.
Ok, I'm amenable to this change.
@ataylorme I'm going to move this out of 2.0.0 because I think it's worth spending more than 30 minutes on to get it right, per the conversation in https://github.com/pantheon-systems/solr-power/pull/346#issuecomment-350814150