doc mapper parse error on `otel_logs_v0_7`
Describe the bug
According to the log data spec, the body field should support any type of data.
The doc mapping of the field is specified as json, but it seems to only accept JSON objects, not JSON arrays.
quickwit-1 | 2024-08-26T08:20:37.205Z WARN quickwit_indexing::actors::doc_processor: doc mapper parse error: the document contains an array of values but a single value is expected: "body" index_id="otel-logs-v0_7" source_id="_ingest-api-source"
So log records generated by OTEL's official SDK are not indexed properly.
I used logtape, which sends the body field as an array.
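To make the failing shape concrete, here is a minimal sketch in Python (not Quickwit code). It assumes the OTLP/JSON encoding of `AnyValue` (`stringValue`, `arrayValue`, `kvlistValue`, ...). The helper names `any_value_to_json` and `check_single_value` are hypothetical, the sample body is made up, and the check only imitates the error message from the log above, not the actual doc mapper logic.

```python
# Hypothetical helper: flatten an OTLP/JSON AnyValue into a plain JSON value.
def any_value_to_json(value):
    if "stringValue" in value:
        return value["stringValue"]
    if "boolValue" in value:
        return value["boolValue"]
    if "intValue" in value:
        # OTLP/JSON usually encodes 64-bit integers as strings.
        return int(value["intValue"])
    if "doubleValue" in value:
        return value["doubleValue"]
    if "arrayValue" in value:
        return [any_value_to_json(v) for v in value["arrayValue"].get("values", [])]
    if "kvlistValue" in value:
        return {
            kv["key"]: any_value_to_json(kv["value"])
            for kv in value["kvlistValue"].get("values", [])
        }
    return None


# Hypothetical imitation of the rejection seen in the log above:
# a single-valued field refuses a top-level array.
def check_single_value(field_name, value):
    if isinstance(value, list):
        raise ValueError(
            "the document contains an array of values but a single value "
            f'is expected: "{field_name}"'
        )
    return value


# Made-up log body of the kind an array-producing logger would send.
array_body = {
    "arrayValue": {
        "values": [
            {"stringValue": "Hello, "},
            {"stringValue": "world"},
            {"stringValue": "!"},
        ]
    }
}

# A map-shaped body passes the check, an array-shaped body does not.
object_body = {"kvlistValue": {"values": [{"key": "msg", "value": {"stringValue": "hi"}}]}}
check_single_value("body", any_value_to_json(object_body))
try:
    check_single_value("body", any_value_to_json(array_body))
except ValueError as err:
    print(err)
```

Both bodies are valid under the log data spec, but only the map-shaped one ends up as a JSON object.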
Steps to reproduce (if applicable)
Full setup: https://github.com/cometkim/quickwit-tutorial
Expected behavior
Valid OTEL logs must be able to be indexed.
Maybe by supporting more JSON values, or by changing the field type dynamic.
Configuration:
- Output of quickwit --version: 0.8.2
- The index_config.yaml: the built-in otel_logs_v0_7
cc @trinity-1686a
Is there any update on this? I don't expect this to be an easy fix in Quickwit. If you have a plan for this, I would like to follow it.
Looks like https://github.com/quickwit-oss/tantivy/pull/2383 could fix this?
I think it should work, yes, but there might be something tricky I haven't thought of.
The way I would go about it:
- Fix the doc mapping everywhere the LeafType::Json variant is processed
- Add tests for each type and see if something breaks elsewhere
The current way cardinality is defined would be a bit clumsy: array<json> would overlap functionally with json. I guess in that case we should ideally deprecate array<json>.
Bonus:
- Make it configurable in the doc mapping whether the root should be an object or can also be an array or a primitive type?
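
A rough sketch of how the bonus item could behave, written in Python for brevity rather than Quickwit's Rust. Everything here is an assumption for illustration: `JsonRoot`, `classify_root`, and `validate_json_field` are hypothetical names, not part of Quickwit or tantivy, and the default mirrors the object-only behavior described in this issue.

```python
from enum import Enum


class JsonRoot(Enum):
    """Hypothetical set of root shapes a json field could be allowed to hold."""
    OBJECT = "object"
    ARRAY = "array"
    PRIMITIVE = "primitive"


def classify_root(value):
    """Return which root shape a decoded JSON value has."""
    if isinstance(value, dict):
        return JsonRoot.OBJECT
    if isinstance(value, list):
        return JsonRoot.ARRAY
    # Strings, numbers, booleans and null all count as primitives here.
    return JsonRoot.PRIMITIVE


def validate_json_field(field_name, value, allowed_roots=frozenset({JsonRoot.OBJECT})):
    """Accept the value only if its root shape is allowed for this field."""
    root = classify_root(value)
    if root not in allowed_roots:
        allowed = ", ".join(sorted(r.value for r in allowed_roots))
        raise ValueError(
            f'field "{field_name}" has a {root.value} root, expected one of: {allowed}'
        )
    return value


# Default: only objects are accepted.
validate_json_field("body", {"msg": "hi"})

# Opt-in: a field such as an OTEL log body could allow every root shape.
any_root = frozenset(JsonRoot)
validate_json_field("body", ["Hello, ", "world", "!"], any_root)
validate_json_field("body", "plain text", any_root)
```

With such a setting, a json field that allows array roots would cover what array<json> does for objects, which is the overlap mentioned above.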