ext-solr
ext-solr copied to clipboard
[BUG] preAddModifyDocuments Hook is not executed in Page indexing context
Describe the bug preAddModifyDocuments is not executed in page indexing context
To Reproduce Steps to reproduce the behavior:
- use this hook in some ext_localconf.php
$GLOBALS['TYPO3_CONF_VARS']['EXTCONF']['solr']['IndexQueueIndexer']['preAddModifyDocuments'][] = \ACO\AcoBase\Hooks\Solr\Index\TypeFieldToCustomTypeFieldCopier::class;
- use following Class
<?php
namespace ....\Hooks\Solr\Index;
use Apache_Solr_Document;
use ApacheSolrForTypo3\Solr\IndexQueue\Item;
use ApacheSolrForTypo3\Solr\IndexQueue\PageIndexerDocumentsModifier;
use TYPO3\CMS\Frontend\Controller\TypoScriptFrontendController;
/**
* Class TypeFieldToTypeStringSFieldCopier
*
* Copies "type" field to "type_stringS" field if "type_stringS" is empty.
*/
class TypeFieldToCustomTypeFieldCopier implements PageIndexerDocumentsModifier
{
const CUSTOM_TYPE_FIELD_NAME = 'type_stringS';
/**
* Modifies the given documents
*
* @param Item $item The currently being indexed item.
* @param int $language The language uid of the documents
* @param array $documents An array of documents to be indexed
* @return array An array of modified documents
*/
public function modifyDocuments(Item $item, $language, array $documents)
{
/* @var Apache_Solr_Document $document */
foreach ($documents as $document) {
if (is_array($document->getField(self::CUSTOM_TYPE_FIELD_NAME))) {
continue;
}
$originalTypeField = $document->getField('type');
$document->setField(self::CUSTOM_TYPE_FIELD_NAME, $originalTypeField['value']);
}
return $documents;
}
}
- Try to index some page and get
type_stringS
in pages documents on Solr
Expected behavior This hook must be executed in all indexing contexts and do the modifications on all document types.
Used versions (please complete the following information):
- TYPO3 Version: 8.7
- EXT:solr Version: [e.g. 8.1]
Additional context
There is another hook for page indexing context available($GLOBALS['TYPO3_CONF_VARS']['EXTCONF']['solr']['Indexer']['indexPagePostProcessPageDocument']
), which can be used to do the same thing, but it does not prevent from confusing API.
@dkd-kaehm Can you analyze that further? By reading the code this hook should still be executed.
I think @dkd-kaehm is right, the hook preAddModifyDocuments
is called in Indexer->indexItem()
which isn't called by the PageIndexer
. The PageIndexer
is based on the Indexer
, but has an own implementation for page indexing in indexPage()
.
@dkd-kaehm @dkd-friedrich But this did not work in an early version either? I guess that was intended by the different keys of the Hooks "Indexer" and "IndexQueueIndexer". If you need that it would be nice if you could create a patch with a test
But this did not work in an early version either?
I'm not quite on the subject, but I think that's the case.
For me it looks like the indexPagePostProcessPageDocument hook was implemented supplementary, perhaps to solve an issue with the first hook :-). The name of the required interface (PageIndexerDocumentsModifier
) for preAddModifyDocuments
and the exception thrown could point into this direction:
if ($documentsModifier instanceof PageIndexerDocumentsModifier) {
$documents = $documentsModifier->modifyDocuments($item, $language, $documents);
} else {
throw new \RuntimeException(
'The class "' . get_class($documentsModifier)
. '" registered as document modifier in hook
preAddModifyDocuments must implement interface
ApacheSolrForTypo3\Solr\IndexQueue\PageIndexerDocumentsModifier',
1309522677
);
}
Perhaps we could stepwise reduce the complexity and finally merge this hooks
@dkd-kaehm: What do you think?
I would suggest to deprecate PageIndexerDocumentsModifier
and favor preAddModifyDocuments
as single thing which can be used to modify Solr Documents before they are send to Apache Solr. One hook must be enough.
An integration test should examine, that all implemented Indexer will call the preAddModifyDocuments
. Currently do not see Unit-Test solution for this case.
When implementing the hook preAddModifyDocuments for pages it would be nice to execute it after \ApacheSolrForTypo3\Solr\Typo3PageIndexer::processDocuments()
.
Background: Currently i'm unable to cleanup the rootline field and reduce its amount of values.
Same issue here with hook 'preAddModifyDocuments' on Typo3 10.4.19 and Solr 11.0.3.
$GLOBALS['TYPO3_CONF_VARS']['EXTCONF']['solr']['IndexQueueIndexer']['preAddModifyDocuments'][]
= \Vendor\Ext\Hooks\Solr\ReferenceIndexModifier::class;
My class is called when solr is indexing records but not pages.