ext-solr icon indicating copy to clipboard operation
ext-solr copied to clipboard

[BUG] preAddModifyDocuments Hook is not executed in Page indexing context

Open dkd-kaehm opened this issue 5 years ago • 7 comments

Describe the bug preAddModifyDocuments is not executed in page indexing context

To Reproduce Steps to reproduce the behavior:

  1. use this hook in some ext_localconf.php $GLOBALS['TYPO3_CONF_VARS']['EXTCONF']['solr']['IndexQueueIndexer']['preAddModifyDocuments'][] = \ACO\AcoBase\Hooks\Solr\Index\TypeFieldToCustomTypeFieldCopier::class;
  2. use following Class
<?php
namespace ....\Hooks\Solr\Index;

use Apache_Solr_Document;
use ApacheSolrForTypo3\Solr\IndexQueue\Item;
use ApacheSolrForTypo3\Solr\IndexQueue\PageIndexerDocumentsModifier;
use TYPO3\CMS\Frontend\Controller\TypoScriptFrontendController;

/**
* Class TypeFieldToTypeStringSFieldCopier
*
* Copies "type" field to "type_stringS" field if "type_stringS" is empty.
*/
class TypeFieldToCustomTypeFieldCopier implements PageIndexerDocumentsModifier
{
  const CUSTOM_TYPE_FIELD_NAME = 'type_stringS';

  /**
   * Modifies the given documents
   *
   * @param Item $item The currently being indexed item.
   * @param int $language The language uid of the documents
   * @param array $documents An array of documents to be indexed
   * @return array An array of modified documents
   */
  public function modifyDocuments(Item $item, $language, array $documents)
  {
      /* @var Apache_Solr_Document $document */
      foreach ($documents as $document) {
          if (is_array($document->getField(self::CUSTOM_TYPE_FIELD_NAME))) {
              continue;
          }
          $originalTypeField = $document->getField('type');
          $document->setField(self::CUSTOM_TYPE_FIELD_NAME, $originalTypeField['value']);
      }
      return $documents;
  }
}

  1. Try to index some page and get type_stringS in pages documents on Solr

Expected behavior This hook must be executed in all indexing contexts and do the modifications on all document types.

Used versions (please complete the following information):

  • TYPO3 Version: 8.7
  • EXT:solr Version: [e.g. 8.1]

Additional context There is another hook for page indexing context available($GLOBALS['TYPO3_CONF_VARS']['EXTCONF']['solr']['Indexer']['indexPagePostProcessPageDocument']), which can be used to do the same thing, but it does not prevent from confusing API.

dkd-kaehm avatar Mar 08 '19 11:03 dkd-kaehm

@dkd-kaehm Can you analyze that further? By reading the code this hook should still be executed.

timohund avatar Mar 12 '19 08:03 timohund

I think @dkd-kaehm is right, the hook preAddModifyDocuments is called in Indexer->indexItem() which isn't called by the PageIndexer. The PageIndexer is based on the Indexer, but has an own implementation for page indexing in indexPage().

dkd-friedrich avatar Mar 12 '19 13:03 dkd-friedrich

@dkd-kaehm @dkd-friedrich But this did not work in an early version either? I guess that was intended by the different keys of the Hooks "Indexer" and "IndexQueueIndexer". If you need that it would be nice if you could create a patch with a test

timohund avatar Mar 12 '19 20:03 timohund

But this did not work in an early version either?

I'm not quite on the subject, but I think that's the case.

For me it looks like the indexPagePostProcessPageDocument hook was implemented supplementary, perhaps to solve an issue with the first hook :-). The name of the required interface (PageIndexerDocumentsModifier) for preAddModifyDocuments and the exception thrown could point into this direction:

if ($documentsModifier instanceof PageIndexerDocumentsModifier) {
    $documents = $documentsModifier->modifyDocuments($item, $language, $documents);
} else {
    throw new \RuntimeException(
        'The class "' . get_class($documentsModifier)
        . '" registered as document modifier in hook
			preAddModifyDocuments must implement interface
			ApacheSolrForTypo3\Solr\IndexQueue\PageIndexerDocumentsModifier',
        1309522677
    );
}

Perhaps we could stepwise reduce the complexity and finally merge this hooks

@dkd-kaehm: What do you think?

dkd-friedrich avatar Mar 13 '19 06:03 dkd-friedrich

I would suggest to deprecate PageIndexerDocumentsModifier and favor preAddModifyDocuments as single thing which can be used to modify Solr Documents before they are send to Apache Solr. One hook must be enough.

An integration test should examine, that all implemented Indexer will call the preAddModifyDocuments. Currently do not see Unit-Test solution for this case.

dkd-kaehm avatar Mar 18 '19 10:03 dkd-kaehm

When implementing the hook preAddModifyDocuments for pages it would be nice to execute it after \ApacheSolrForTypo3\Solr\Typo3PageIndexer::processDocuments().

Background: Currently i'm unable to cleanup the rootline field and reduce its amount of values.

christophlehmann avatar Oct 06 '21 12:10 christophlehmann

Same issue here with hook 'preAddModifyDocuments' on Typo3 10.4.19 and Solr 11.0.3.

$GLOBALS['TYPO3_CONF_VARS']['EXTCONF']['solr']['IndexQueueIndexer']['preAddModifyDocuments'][]
    = \Vendor\Ext\Hooks\Solr\ReferenceIndexModifier::class;

My class is called when solr is indexing records but not pages.

vicluber avatar Oct 07 '21 17:10 vicluber