quickwit
Cannot create a text field with "indexed" set to false while "stored" is true
Describe the bug Cannot create a text field with "indexed" set to false while "stored" is true.
Steps to reproduce (if applicable) Steps to reproduce the behavior:
1. Create an index with a config like this:
doc_mapping:
  field_mappings:
    - name: data_binary
      type: text
      indexed: false
      ...
I only want to use the text field to store a value, without a tokenizer and without indexing the field.
2. Quickwit returns an error that says: "doc_mapping.field_mappings: Error while parsing field data_binary: record, tokenizer, and fieldnorms parameters are allowed only if indexed is true. at line 11 column 5"
After reviewing Quickwit's code, I think the problem may occur in the serde deserialization step. The serde config of QuickwitTextOptions looks like this:
pub struct QuickwitTextOptions {
    #[serde(default)]
    #[serde(skip_serializing_if = "Option::is_none")]
    pub description: Option<String>,
    #[serde(default = "default_as_true")]
    pub indexed: bool,
    #[serde(default)]
    #[serde(skip_serializing_if = "Option::is_none")]
    pub tokenizer: Option<QuickwitTextTokenizer>,
    #[serde(default)]
    pub record: IndexRecordOption,
    #[serde(default = "default")]
    pub fieldnorms: bool,
    #[serde(default = "default_as_true")]
    pub stored: bool,
    #[serde(default)]
    pub fast: bool,
}
If I don't configure record or fieldnorms, their default values are IndexRecordOption::Basic and false, so this logic prevents me from creating a field that is stored but not indexed.
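To illustrate the hypothesis above, here is a minimal sketch of one way such a check can tell "parameter omitted" apart from "parameter set". It is not Quickwit's code and not necessarily what the eventual fix does: RawTextOptions, TextOptions, Record, and validate are hypothetical names, and serde is left out so the example only needs the standard library. The idea is that a plain bool or enum field that has already been filled with its default cannot reveal whether the user wrote it, while an Option can.

```rust
// Hypothetical stand-in for IndexRecordOption; not Quickwit's actual type.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Record {
    Basic,
    Position,
}

// Options as the user wrote them: `None` means "absent from the config".
#[derive(Debug, Default)]
struct RawTextOptions {
    indexed: Option<bool>,
    tokenizer: Option<String>,
    record: Option<Record>,
    fieldnorms: Option<bool>,
    stored: Option<bool>,
}

// Options after validation, with defaults applied.
#[derive(Debug, PartialEq)]
struct TextOptions {
    indexed: bool,
    tokenizer: Option<String>,
    record: Record,
    fieldnorms: bool,
    stored: bool,
}

// Reject index-only parameters on a non-indexed field, but only when the
// user actually wrote them. Defaults are applied after the check.
fn validate(raw: RawTextOptions) -> Result<TextOptions, String> {
    let indexed = raw.indexed.unwrap_or(true);
    let has_index_only_param =
        raw.tokenizer.is_some() || raw.record.is_some() || raw.fieldnorms.is_some();
    if !indexed && has_index_only_param {
        return Err(
            "record, tokenizer, and fieldnorms parameters are allowed only if indexed is true."
                .to_string(),
        );
    }
    Ok(TextOptions {
        indexed,
        tokenizer: raw.tokenizer,
        record: raw.record.unwrap_or(Record::Basic),
        fieldnorms: raw.fieldnorms.unwrap_or(false),
        stored: raw.stored.unwrap_or(true),
    })
}
```

With this shape, a field declared with only indexed: false and stored: true passes validation, while indexed: false combined with an explicit tokenizer is still rejected.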
Expected behavior It should be possible to create a text field that is stored but not indexed.
Configuration:
- Output of quickwit --version: 0.3.1
- The index_config.yaml
#
# Index config file for hdfs-logs dataset.
#
version: 0
index_id: segment_new
doc_mapping:
  field_mappings:
    - name: data_binary
      type: text
      indexed: false
    - name: end_time
      type: i64
      indexed: true
      fast: true
    - name: endpoint_id
      type: text
      tokenizer: raw
      indexed: true
      fast: false
    - name: endpoint_name_match
      type: text
      indexed: true
      tokenizer: default
      fast: false
    - name: endpoint_name
      type: text
      indexed: true
      tokenizer: raw
      fast: false
    - name: is_error
      type: i64
      indexed: true
      #tokenizer: raw
      fast: false
    - name: latency
      type: i64
      indexed: true
      #tokenizer: raw
      fast: true
    - name: segment_id
      type: text
      indexed: true
      tokenizer: raw
      fast: false
    - name: service_id
      type: text
      indexed: true
      tokenizer: raw
      fast: false
    - name: service_instance_id
      type: text
      indexed: true
      tokenizer: raw
      fast: false
    - name: start_time
      type: i64
      indexed: true
      #tokenizer: raw
      fast: true
    - name: statement
      type: text
      indexed: true
      tokenizer: raw
      fast: false
    - name: tags
      type: array<text>
      indexed: true
      tokenizer: raw
      fast: false
      stored: false
    - name: time_bucket
      type: i64
      indexed: true
      #tokenizer: raw
      fast: true
    - name: trace_id
      type: text
      indexed: true
      tokenizer: raw
      fast: false
    - name: version
      type: i64
      indexed: true
      #tokenizer: raw
      fast: false
  tag_fields: []
  store_source: true
indexing_settings:
  timestamp_field: start_time
search_settings:
  default_search_fields: []
@geek-frio did you try such a config?
doc_mapping:
  field_mappings:
    - name: data_binary
      type: text
      indexed: false
      stored: true
With such a config, the field should be stored but not indexed. You can see the list of parameters in the docs.
Note that in 0.3.1, fieldnorms is not present.
- Quickwit return an error, says: "doc_mapping.field_mappings: Error while parsing field data_binary: record, tokenizer, and fieldnorms parameters are allowed only if indexed is true. at line 11 column 5"
This error is normal: record, tokenizer, and fieldnorms only make sense if the field is indexed.
@fmassot I have tested adding the "stored: true" config, and Quickwit returns this error message: "doc_mapping.field_mappings: Error while parsing field data_binary: record, tokenizer, and fieldnorms parameters are allowed only if indexed is true. at line 11 column 5"
Ok, sorry, that's probably a bug, let me reproduce that and push a fix.
I'm also hitting this bug and was able to reproduce it. @nigel-andrews, can you push a fix this week?
Currently on it.
@geek-frio we have merged PR #2075, which should fix your issue. Closing this one for now. Thanks for your feedback.