Add analyzer_params config for milvus vectordb

Open: rainsoft opened this issue 9 months ago • 0 comments

Self Checks

[x] I have searched for existing issues, including closed ones.
[x] I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
[x] [FOR CHINESE USERS] Please be sure to submit the issue in English, otherwise it will be closed. Thank you! :)
[x] Please do not modify this template :) and fill in all the required fields.

1. Is this request related to a challenge you're experiencing? Tell me about your story.

Full-text search over Chinese text requires setting analyzer_params on the text field. However, the Milvus vector database configuration currently offers no way to set these parameters. For example, with pymilvus the field would be defined like this:

from pymilvus import MilvusClient, DataType

# Create the schema that the field below is added to
schema = MilvusClient.create_schema()

# Define analyzer parameters
analyzer_params = {
    "type": "chinese"  # Use the built-in Chinese analyzer
}

# Add a text field to the schema and enable the analyzer
schema.add_field(
    field_name="text",                      # Field name
    datatype=DataType.VARCHAR,              # Data type: string (VARCHAR)
    max_length=65535,                       # Maximum length: 65,535 bytes
    enable_analyzer=True,                   # Enable the analyzer
    analyzer_params=analyzer_params         # Analyzer parameters
)
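
One possible shape for the requested option, as a rough sketch only: read the analyzer parameters as a JSON string from a setting and pass them through to the schema.add_field call above. The setting name MILVUS_ANALYZER_PARAMS and both helper functions are hypothetical and chosen purely for illustration; they are not an existing option or API.

```python
import json
from typing import Any, Mapping, Optional


def load_analyzer_params(env: Mapping[str, str]) -> Optional[dict[str, Any]]:
    """Parse analyzer parameters from a hypothetical MILVUS_ANALYZER_PARAMS setting."""
    raw = env.get("MILVUS_ANALYZER_PARAMS", "").strip()
    if not raw:
        # Setting is absent or empty: no analyzer parameters were configured
        return None
    params = json.loads(raw)
    if not isinstance(params, dict):
        raise ValueError("MILVUS_ANALYZER_PARAMS must be a JSON object")
    return params


def analyzer_field_kwargs(env: Mapping[str, str]) -> dict[str, Any]:
    """Build the analyzer-related keyword arguments for schema.add_field."""
    kwargs: dict[str, Any] = {"enable_analyzer": True}
    params = load_analyzer_params(env)
    if params is not None:
        # Only pass analyzer_params when the user configured them explicitly
        kwargs["analyzer_params"] = params
    return kwargs
```

With MILVUS_ANALYZER_PARAMS set to {"type": "chinese"}, the field definition could then end with **analyzer_field_kwargs(os.environ) in place of the hard-coded enable_analyzer and analyzer_params arguments. Leaving the setting empty would keep whatever analyzer is used when no parameters are passed.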

2. Additional context or comments

No response

3. Can you help us with this feature?

[x] I am interested in contributing to this feature.

Mar 26 '25 11:03 rainsoft