opensearch-java

Porter Stem Filter not working in some cases

Open nileshshroff opened this issue 1 year ago • 2 comments

What is the bug?

In some cases, a search using stemmed terms returns no results.

For example, the following text is indexed into a field analyzed with porter_stem: "price increases daily for the last month or so and this is not even the breaking news"

Searching for "price increas daili" returns 0 hits.
Searching for "break new" returns 1 hit.

The following steps show how it can be reproduced.

How can one reproduce the bug?

Create an index with a field that uses a porter_stem filter

PUT /index_with_stemissue
{
  "settings": {
    "index": {
      "number_of_shards": 1,
      "number_of_replicas": 1
    },
    "analysis": {
      "analyzer": {
        "stemanalyzer": {
          "type": "custom",
          "tokenizer": "letter",
          "filter": [ "lowercase", "porter_stem" ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "message_text": {
        "type": "text",
        "analyzer": "stemanalyzer",
        "store": true
      }
    }
  }
}

Test the field with _analyze to see the stemmed tokens

GET /index_with_stemissue/_analyze
{
  "field": "message_text",
  "text": "price increases daily for the last month or so and this is not even the breaking news"
}

Add the document to the index

POST /index_with_stemissue/_doc/
{
  "message_text": "price increases daily for the last month or so and this is not even the breaking news"
}

This query_string search, which uses the stemmed words, does not work (no results returned)

GET index_with_stemissue/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "query": "message_text:\"price increas daili\""
          }
        }
      ]
    }
  }
}

However, another example using stemmed words works fine

GET index_with_stemissue/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "query": "message_text:\"break new\""
          }
        }
      ]
    }
  }
}
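One possible explanation, not confirmed anywhere in this thread: query_string analyzes the query text with the field's analyzer, so terms that are already stemmed get stemmed a second time. The Python sketch below implements only step 1a of the original Porter algorithm (plural stripping). It is an illustration, not the Lucene filter that OpenSearch uses, and it omits every other Porter step. It is enough to show that "increas" still ends in a plain "s" and would lose it on a second pass, while the other query terms in the two examples are left alone by this step.

```python
def porter_step_1a(word: str) -> str:
    """Step 1a of the original Porter algorithm: plural suffixes only."""
    if word.endswith("sses"):
        return word[:-2]  # caresses -> caress
    if word.endswith("ies"):
        return word[:-2]  # ponies -> poni
    if word.endswith("ss"):
        return word       # caress -> caress
    if word.endswith("s"):
        return word[:-1]  # cats -> cat
    return word


# Query terms from the two searches in this report (already stemmed by hand).
query_terms = ["price", "increas", "daili", "break", "new"]

# Apply step 1a again, as the analyzer would at search time (assumption).
restemmed = {term: porter_step_1a(term) for term in query_terms}
changed = [term for term, result in restemmed.items() if result != term]

print(restemmed)
print("terms altered by a second pass:", changed)
```

If this is what happens, the phrase query looks for a token "increa" that was never indexed. Running _analyze on the query text "price increas daili" would confirm or rule this out, and searching with the unstemmed words ("price increases daily") would be expected to match.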

What is the expected behavior?

When searching for "price increas daili", 1 result should be returned.

What is your host/environment?

AWS Service OpenSearch version 1.1

nileshshroff avatar Apr 20 '24 14:04 nileshshroff

Is this due to an issue with the opensearch-java client, or should this issue be moved to https://github.com/opensearch-project/OpenSearch?

wbeckler avatar May 06 '24 16:05 wbeckler

@nileshshroff Can you reproduce this with curl? Which version of OpenSearch/client? Otherwise can you please post your java code.

dblock avatar May 07 '24 06:05 dblock
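To check whether the Java client is involved at all, the same requests can be sent straight to the REST API. The following is a minimal Python sketch using only the standard library. The base URL (http://localhost:9200, no authentication) and the helper names are assumptions for illustration; an AWS-hosted domain would need its own endpoint and request signing, which this sketch does not cover.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9200"  # assumption: local cluster without auth
INDEX = "index_with_stemissue"      # index name taken from the report


def phrase_query(phrase: str) -> dict:
    """Build the same bool/query_string body used in the report."""
    return {
        "query": {
            "bool": {
                "must": [
                    {"query_string": {"query": f'message_text:"{phrase}"'}}
                ]
            }
        }
    }


def build_request(method: str, path: str, body: dict) -> urllib.request.Request:
    """Create a JSON request against the cluster (hypothetical helper)."""
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(body).encode("utf-8"),
        method=method,
        headers={"Content-Type": "application/json"},
    )


def send(req: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response."""
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def run() -> None:
    """Print query-time tokens and hit counts for both phrases."""
    for phrase in ("price increas daili", "break new"):
        analyzed = send(build_request(
            "POST", f"/{INDEX}/_analyze",
            {"field": "message_text", "text": phrase},
        ))
        tokens = [t["token"] for t in analyzed["tokens"]]
        result = send(build_request(
            "POST", f"/{INDEX}/_search", phrase_query(phrase)
        ))
        print(phrase, "->", tokens, "hits:", result["hits"]["total"]["value"])
```

Calling run() after creating the index and adding the document prints, for each phrase, the tokens produced at query time and the number of hits. If the counts match the ones reported above, the behavior comes from the server-side analysis rather than from opensearch-java.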