Update EdgeNGramTokenizer.DEFAULT_MAX_NGRAM_SIZE to be practical
Issue: https://github.com/apache/lucene/issues/13802
- Many libraries built on Lucene (for example, Elasticsearch and OpenSearch) use `NGramTokenizer.DEFAULT_MAX_NGRAM_SIZE` (= `2`) instead of `EdgeNGramTokenizer`'s default (= `1`) when configuring an `EdgeNGramTokenizer`.
- For that reason, it is not practical to keep `DEFAULT_MAX_NGRAM_SIZE` of `EdgeNGramTokenizer` at `1`, so this PR changes it to `2`.
- If this change needs further explanation, I'll add or update the documentation accordingly.
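
To illustrate why a maximum gram size of `1` is of limited use, here is a minimal, dependency-free sketch of edge n-gram generation. It is not Lucene's implementation: the class `EdgeNGramSketch` and the helper `edgeNGrams` are hypothetical names introduced only for this example, and the sketch works on `char` units and assumes `minGram >= 1`.

```java
import java.util.ArrayList;
import java.util.List;

class EdgeNGramSketch {
    // Hypothetical helper (not a Lucene API): returns the prefixes of `text`
    // whose length lies between minGram and maxGram, inclusive.
    static List<String> edgeNGrams(String text, int minGram, int maxGram) {
        List<String> grams = new ArrayList<>();
        // Never ask for a prefix longer than the input itself.
        int upper = Math.min(maxGram, text.length());
        for (int len = minGram; len <= upper; len++) {
            grams.add(text.substring(0, len));
        }
        return grams;
    }

    public static void main(String[] args) {
        // Maximum gram size 1: only the first character is emitted.
        System.out.println(edgeNGrams("lucene", 1, 1)); // [l]
        // Maximum gram size 2: the two-character prefix is emitted as well.
        System.out.println(edgeNGrams("lucene", 1, 2)); // [l, lu]
    }
}
```

With a maximum of `1`, every token collapses to its first character, so prefix matching cannot distinguish words that share an initial letter. A maximum of `2` is still small, but it matches the value that downstream projects reportedly fall back to.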