FOSElasticaBundle

Ingest-Attachment integration

Open EofChris opened this issue 6 years ago • 1 comments

Greetings,

we are using this bundle with ES 5.6 and the ingest attachment plugin to get rid of the old mapper attachment integration. The pipeline workflow was quite new to me and we put a lot of effort into figuring things out.

Although the ruflin/elastica dependency already supports setting a pipeline on the Document object, this did not seem to work when rebuilding the index using the populate command. I guess this is related to the way this bundle wraps the Elastica library.

We are using the POST_TRANSFORM event to add a custom "file_data" property after setting up an attachment pipeline that takes care of processing the file data. Since the project we are working on has grown considerably over time, we trigger runtime updates manually instead of using the event-based Doctrine approach. In the end, both populating the index from the CLI and updating single objects at runtime come down to bulk requests.

Our first approach was to experiment with the ruflin/elastica low-level API, following the ingest test case, which worked out really well. But just using the TransformEvent to get the document before it is sent to ES and modifying it did not. Calling addPipeline() on the Document does not seem to be recognized. While debugging the code and following the stack trace, we figured out that the query options get passed through a lot of different classes up to the ObjectPersister. Unfortunately, we could not find a way to pass any custom request options to it.

The only way we got things working was to subscribe to the Events::PRE_PERSIST event (which gets fired by the InPlacePagerPersister) in order to modify the options of the used ObjectPersister instance:

public static function getSubscribedEvents()
{
    return [
        TransformEvent::POST_TRANSFORM => 'addCustomProperty',
        Events::PRE_PERSIST => 'onPrePersist',
    ];
}

public function onPrePersist(PrePersistEvent $event)
{
    /** @var ObjectPersister $objectPersister */
    $objectPersister = $event->getObjectPersister();
    $objectPersister->setOptions('pipeline', 'attachment');
}

However, this would require adding a setOptions() method to ObjectPersister or its ObjectPersisterInterface.
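For illustration, a minimal sketch of what such a method could look like. It uses a hypothetical stand-in class (`OptionsAwarePersister`), not the bundle's actual ObjectPersister, and assumes the options are kept in a plain array that already holds a `refresh` entry; both are assumptions made only for this example.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for the bundle's ObjectPersister; only the
// option handling is modelled here, no request is sent anywhere.
final class OptionsAwarePersister
{
    /** @var array<string, mixed> options that would accompany each bulk request */
    private array $options;

    public function __construct(array $options = [])
    {
        $this->options = $options;
    }

    // Mirrors the two-argument call used in onPrePersist():
    // sets or overwrites a single option and leaves the others untouched.
    public function setOptions(string $key, $value): void
    {
        $this->options[$key] = $value;
    }

    public function getOptions(): array
    {
        return $this->options;
    }
}

// Usage: add the pipeline name without losing the assumed "refresh" option.
$persister = new OptionsAwarePersister(['refresh' => true]);
$persister->setOptions('pipeline', 'attachment');
```

Setting one key at a time keeps whatever options were configured earlier, which matters if the persister was constructed with options of its own.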

If there's any better solution, we'd be grateful for any advice. If not, and if there are no objections, we could submit the addition of ObjectPersisterInterface->setOptions() as a pull request.

Thank you for any help you can provide.

EofChris avatar Jul 22 '19 16:07 EofChris

I also needed a way to set options on the ObjectPersister without making any changes to the bundle itself and ended up modifying the service definitions in a compiler pass to achieve this:

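use Symfony\Component\DependencyInjection\Compiler\CompilerPassInterface;
use Symfony\Component\DependencyInjection\ContainerBuilder;

class ElasticaCompilerPass implements CompilerPassInterface
{
    public function process(ContainerBuilder $container): void
    {
        // Replace {INDEX_NAME} and {TYPE_NAME} with the names from your fos_elastica config.
        $definition = $container->getDefinition('fos_elastica.object_persister.{INDEX_NAME}.{TYPE_NAME}');
        // Overwrites the options array passed to the persister's constructor.
        $definition->setArgument('index_4', ['key' => 'value']);
    }
}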

index_4 refers to the constructor argument at index 4 (zero-based, so the fifth argument), which is the options array.
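The pass only takes effect once it is registered with the container. A minimal sketch of one common way to do that, assuming a recent Symfony skeleton kernel that uses MicroKernelTrait and assuming the pass lives in a hypothetical `App\DependencyInjection` namespace:

```php
<?php

namespace App;

use App\DependencyInjection\ElasticaCompilerPass; // hypothetical location of the pass
use Symfony\Bundle\FrameworkBundle\Kernel\MicroKernelTrait;
use Symfony\Component\DependencyInjection\ContainerBuilder;
use Symfony\Component\HttpKernel\Kernel as BaseKernel;

class Kernel extends BaseKernel
{
    use MicroKernelTrait;

    protected function build(ContainerBuilder $container): void
    {
        // Registered passes run while the container is compiled, after the
        // bundle extensions have added their service definitions.
        $container->addCompilerPass(new ElasticaCompilerPass());
    }
}
```

Inside a bundle, the same `addCompilerPass()` call would go into the bundle class's `build()` method instead.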

jaw24 avatar Jan 14 '21 17:01 jaw24