lucene
FVH BaseFragmentsBuilder does not properly support colored pre/post tags
Description
Given the `BaseFragmentsBuilder` description:
...
/**
* Base FragmentsBuilder implementation that supports colored pre/post tags and multivalued fields.
*
* <p>Uses {@link BoundaryScanner} to determine fragments.
*/
public abstract class BaseFragmentsBuilder implements FragmentsBuilder {
...
We assume that if we input a query and an array of pre and post tags, they will follow the same order, like:
| Query | Pre tag | Post tag |
|---|---|---|
| A B | `<ab>` | `</ab>` |
| C B | `<cb>` | `</cb>` |
| C A | `<ca>` | `</ca>` |
However, the tags are not applied in that order, because the current `BaseFragmentsBuilder` implementation picks them in an almost random order:
protected String getPreTag(String[] preTags, int num) {
  int n = num % preTags.length;
  return preTags[n];
}
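To make the selection rule concrete, here is a minimal standalone sketch of the same modulo logic. The class name `PreTagSelectionSketch` and the sample numbers are assumptions for illustration and are not part of Lucene; only the body of `getPreTag` mirrors the method quoted above.

```java
// Standalone sketch, not Lucene code: only getPreTag mirrors the quoted method.
class PreTagSelectionSketch {

    // Same logic as the quoted getPreTag, made static so it can run in isolation.
    static String getPreTag(String[] preTags, int num) {
        int n = num % preTags.length;
        return preTags[n];
    }

    public static void main(String[] args) {
        String[] preTags = {"<ab>", "<cb>", "<ca>"};
        // The tag depends only on the number passed in, wrapping around
        // once the number reaches the length of the array.
        for (int num = 0; num < 5; num++) {
            System.out.println(num + " -> " + getPreTag(preTags, num));
        }
    }
}
```

The tag a match receives is therefore decided entirely by the number passed in, so the tags line up with the queries only if those numbers are handed out in query order.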
This links back to this issue.
I have already done some initial work to solve this problem where I work, but I would like to have a proper solution for Lucene.
The root cause is in the `FieldQuery` `flatten`, `saveTerms` and `expand` methods. They do need to exist, but they also mess up the order of the pre/post tags. The `termOrPhraseNumber` is used to get the pre tag, and it should follow the order of the queries.
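Until there is a real unit test, the effect can be sketched outside Lucene. Everything in the block below is hypothetical: the class `TagOrderSketch`, the helper `tagsForNumbers`, and in particular the number assignment `{2, 0, 1}`, which is made up to stand in for an out-of-order `termOrPhraseNumber` and is not what `FieldQuery` actually produces.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Standalone sketch, not Lucene code. The numbers passed in are hypothetical
// stand-ins for termOrPhraseNumber.
class TagOrderSketch {

    // Pairs each query with the pre tag chosen by the modulo rule for its number.
    static Map<String, String> tagsForNumbers(List<String> queries, int[] numbers, String[] preTags) {
        Map<String, String> result = new LinkedHashMap<>();
        for (int i = 0; i < queries.size(); i++) {
            result.put(queries.get(i), preTags[numbers[i] % preTags.length]);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> queries = Arrays.asList("A B", "C B", "C A");
        String[] preTags = {"<ab>", "<cb>", "<ca>"};

        // Expected behaviour: numbers follow the order of the queries.
        System.out.println(tagsForNumbers(queries, new int[] {0, 1, 2}, preTags));

        // Hypothetical out-of-order assignment: every query gets the wrong tag.
        System.out.println(tagsForNumbers(queries, new int[] {2, 0, 1}, preTags));
    }
}
```

With numbers in query order, each query gets its own tag. With the made-up shuffled numbers, "A B" ends up with `<ca>`, which is the kind of mismatch described above.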
I will try to add a unit test that properly illustrates this problem as it is kinda complex.
Version and environment details
Lucene 3.0+
Any environment