Cannot create a text field with "indexed" set to false and "stored" set to true

Open geek-frio opened this issue 1 year ago • 6 comments

Describe the bug: Cannot create a text field with "indexed" set to false while "stored" is true.

Steps to reproduce:

  1. I try to create an index with a doc mapping like this:
doc_mapping:
  field_mappings:
    - name: data_binary
      type: text 
      indexed: false
 ...

I only want to use this text field to store a value, without a tokenizer and without indexing the field.

  2. Quickwit returns an error: "doc_mapping.field_mappings: Error while parsing field data_binary: record, tokenizer, and fieldnorms parameters are allowed only if indexed is true. at line 11 column 5"

After reviewing Quickwit's code, I think the problem may occur during serde deserialization. The serde config of QuickwitTextOptions looks like this:

pub struct QuickwitTextOptions {
    #[serde(default)]
    #[serde(skip_serializing_if = "Option::is_none")]
    pub description: Option<String>,
    #[serde(default = "default_as_true")]
    pub indexed: bool,
    #[serde(default)]
    #[serde(skip_serializing_if = "Option::is_none")]
    pub tokenizer: Option<QuickwitTextTokenizer>,
    #[serde(default)]
    pub record: IndexRecordOption,
    #[serde(default = "default")]
    pub fieldnorms: bool,
    #[serde(default = "default_as_true")]
    pub stored: bool,
    #[serde(default)]
    pub fast: bool,
}

If I don't configure record or fieldnorms, their default values are IndexRecordOption::Basic and false, so this logic prevents me from creating a field that is stored but not indexed.
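
The concern described above is that once defaults have been filled in, a validator can no longer tell an omitted `record` or `fieldnorms` from one the user wrote explicitly. The following is a minimal sketch of one way to keep that distinction, assuming hypothetical types (`RecordOption`, `RawTextOptions`, `TextOptions`, `resolve`) that are not Quickwit's actual code and not necessarily how the issue was fixed. Absent keys are modeled as `None`, the check runs on the raw values, and defaults are applied only afterwards.

```rust
/// Stand-in for tantivy's `IndexRecordOption`, used here for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum RecordOption {
    Basic,
    WithFreqs,
    WithFreqsAndPositions,
}

/// Hypothetical "raw" options as parsed from the config file.
/// `None` means the key was absent from the YAML.
#[derive(Debug, Default)]
struct RawTextOptions {
    indexed: Option<bool>,
    stored: Option<bool>,
    tokenizer: Option<String>,
    record: Option<RecordOption>,
    fieldnorms: Option<bool>,
}

/// Hypothetical resolved options, with defaults applied.
#[derive(Debug, PartialEq)]
struct TextOptions {
    indexed: bool,
    stored: bool,
    tokenizer: Option<String>,
    record: RecordOption,
    fieldnorms: bool,
}

/// Validate on the raw values first, then fill in defaults.
fn resolve(raw: RawTextOptions) -> Result<TextOptions, String> {
    let indexed = raw.indexed.unwrap_or(true);
    // Only parameters the user actually wrote count as "set".
    let sets_index_only_param =
        raw.tokenizer.is_some() || raw.record.is_some() || raw.fieldnorms.is_some();
    if !indexed && sets_index_only_param {
        return Err(
            "record, tokenizer, and fieldnorms parameters are allowed only if indexed is true"
                .to_string(),
        );
    }
    Ok(TextOptions {
        indexed,
        stored: raw.stored.unwrap_or(true),
        tokenizer: raw.tokenizer,
        // Defaults assumed from the description above: Basic and false.
        record: raw.record.unwrap_or(RecordOption::Basic),
        fieldnorms: raw.fieldnorms.unwrap_or(false),
    })
}
```

In this sketch, a field with only `indexed: false` and `stored: true` resolves successfully, while `indexed: false` combined with an explicit `tokenizer: raw` is still rejected.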

Expected behavior: It should be possible to create a text field that is not indexed.

Configuration:

  1. Output of quickwit --version

0.3.1

  2. The index_config.yaml
#
# Index config file for hdfs-logs dataset.
#

version: 0

index_id: segment_new

doc_mapping:
  field_mappings:
    - name: data_binary
      type: text 
      indexed: false
    - name: end_time
      type: i64
      indexed: true
      fast: true
    - name: endpoint_id
      type: text
      tokenizer: raw
      indexed: true
      fast: false
    - name: endpoint_name_match
      type: text
      indexed: true
      tokenizer: default
      fast: false
    - name: endpoint_name
      type: text
      indexed: true
      tokenizer: raw
      fast: false
    - name: is_error
      type: i64
      indexed: true
      #tokenizer: raw
      fast: false
    - name: latency
      type: i64
      indexed: true
      #tokenizer: raw
      fast: true
    - name: segment_id
      type: text
      indexed: true
      tokenizer: raw
      fast: false
    - name: service_id
      type: text
      indexed: true
      tokenizer: raw
      fast: false
    - name: service_instance_id
      type: text
      indexed: true
      tokenizer: raw
      fast: false
    - name: start_time
      type: i64
      indexed: true
      #tokenizer: raw
      fast: true
    - name: statement
      type: text
      indexed: true
      tokenizer: raw
      fast: false
    - name: tags
      type: array<text>
      indexed: true
      tokenizer: raw
      fast: false
      stored: false
    - name: time_bucket
      type: i64
      indexed: true
      #tokenizer: raw
      fast: true
    - name: trace_id
      type: text
      indexed: true
      tokenizer: raw
      fast: false
    - name: version
      type: i64
      indexed: true
      #tokenizer: raw
      fast: false
  tag_fields: []
  store_source: true

indexing_settings:
  timestamp_field: start_time

search_settings:
  default_search_fields: []

geek-frio avatar Sep 30 '22 06:09 geek-frio

@geek-frio did you try a config like this?

doc_mapping:
  field_mappings:
    - name: data_binary
      type: text 
      indexed: false
      stored: true

With such a config, the field should be stored but not indexed. You can see the list of parameters in the docs.

Note that in 0.3.1, fieldnorms is not present.

fmassot avatar Sep 30 '22 08:09 fmassot

  1. Quickwit return an error, says: "doc_mapping.field_mappings: Error while parsing field data_binary: record, tokenizer, and fieldnorms parameters are allowed only if indexed is true. at line 11 column 5"

This error is expected: record, tokenizer, and fieldnorms only make sense if the field is indexed.

fmassot avatar Sep 30 '22 09:09 fmassot

[image] [image]

@fmassot I tested adding the "stored: true" config, and Quickwit returns this error message: "doc_mapping.field_mappings: Error while parsing field data_binary: record, tokenizer, and fieldnorms parameters are allowed only if indexed is true. at line 11 column 5"

geek-frio avatar Sep 30 '22 10:09 geek-frio

Ok, sorry, that's probably a bug. Let me reproduce it and push a fix.

fmassot avatar Sep 30 '22 10:09 fmassot

I'm also hitting this bug and was able to reproduce it. @nigel-andrews, can you push a fix this week?

guilload avatar Oct 05 '22 00:10 guilload

Currently on it.

nigel-andrews avatar Oct 06 '22 09:10 nigel-andrews

@geek-frio we have merged PR #2075, which should fix your issue. Closing this one for now. Thanks for your feedback.

fmassot avatar Oct 11 '22 09:10 fmassot