Fix the ingest API source
Quickwit automatically creates an ingest-api source to launch an indexing pipeline and to receive requests on the ingest API.
This raises at least two issues, plus some questions and comments.
- It is currently possible to create an inconsistent ingest API source
Just run
cargo r source create --index hdfs-logs --source-config ../ingest_source_config.yaml --config ../config/quickwit.yaml
with
source_id: ingest-api
source_type: ingest-api
params:
  batch_num_bytes_limit: 200000
  queues_dir_path: ./qwdata/queues
  index_id: non-existing-index-id
- A user can create multiple ingest sources on the same index
We end up creating several pipelines that will never be used because they are not linked to the IngestApiService.
Note that Quickwit automatically creates an ingest API pipeline for each index.
- It should be possible to disable the ingest API source.
- It should be possible to configure the ingest API source.
Proposition
Using the ingest API just as a regular source seems to be the right way to do it. Here is what I suggest to clean up the situation:
- in order to have an ingest API, the user should add an ingest-api source.
- we should authorize only one ingest-api source per index
- we should not let the user define an index_id or queues_dir_path in the source params
- we could enforce the source_id .ingest-api for this type of source, as we should only have a unique source of this type per index.
- to remain user friendly, we can automatically add this type of source when the user creates an index. If the user knows what they are doing, we can provide a CLI argument like --no-ingest-api that disables the creation of the ingest API source.
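The rules in this list could be enforced when a source is created. Below is a minimal sketch, not Quickwit's actual code: `SourceConfig`, `SourceError`, and `validate_new_source` are hypothetical names, and params are flattened to strings to keep the example self-contained.

```rust
use std::collections::HashMap;

/// Reserved id for the ingest API source (name taken from the proposal above).
pub const INGEST_API_SOURCE_ID: &str = ".ingest-api";

/// Simplified stand-in for a source config; not Quickwit's actual type.
#[derive(Debug, Clone, PartialEq)]
pub struct SourceConfig {
    pub source_id: String,
    pub source_type: String,
    /// Raw user-supplied params, flattened to strings for this sketch.
    pub params: HashMap<String, String>,
}

/// Hypothetical error type describing why a source was rejected.
#[derive(Debug, PartialEq)]
pub enum SourceError {
    DuplicateIngestApiSource,
    InvalidIngestApiSourceId(String),
    ForbiddenParam(String),
}

/// Checks a new source against the sources already attached to the index.
/// Sources of any other type are accepted without further checks here.
pub fn validate_new_source(
    existing: &[SourceConfig],
    new_source: &SourceConfig,
) -> Result<(), SourceError> {
    if new_source.source_type != "ingest-api" {
        return Ok(());
    }
    // Enforce the reserved source id for this source type.
    if new_source.source_id != INGEST_API_SOURCE_ID {
        return Err(SourceError::InvalidIngestApiSourceId(
            new_source.source_id.clone(),
        ));
    }
    // `index_id` and `queues_dir_path` are owned by Quickwit, not by the user.
    for forbidden in ["index_id", "queues_dir_path"] {
        if new_source.params.contains_key(forbidden) {
            return Err(SourceError::ForbiddenParam(forbidden.to_string()));
        }
    }
    // Allow at most one ingest-api source per index.
    if existing.iter().any(|s| s.source_type == "ingest-api") {
        return Err(SourceError::DuplicateIngestApiSource);
    }
    Ok(())
}
```

With a check like this, the config shown earlier in the issue would be rejected twice over: its source_id is not the reserved one, and it sets index_id and queues_dir_path.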
> in order to have a ingest API, the user should add a ingest-api source.
That seems cumbersome. What about adding an enable_ingest_api parameter somewhere in the config and letting Quickwit handle everything else?
Otherwise I agree: there should be at most one ingest API source, .ingest-api, per index, and it should be managed by Quickwit.
Chatted off GitHub with @fmassot:
- add enable parameter to source params
- create one ingest API source per index by default in metastore
- add defensive code to ensure ingest API source is always unique (type and id) per index
- add CLI command source enable/disable --source <SOURCE ID>
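As a rough illustration of these four points, here is a minimal sketch under stated assumptions. `Source`, `IndexMetadata`, and every method name are hypothetical and do not reflect Quickwit's metastore API. The `enabled` field stands in for the proposed enable parameter.

```rust
/// Reserved id, following the `.ingest-api` name used in this thread.
pub const INGEST_API_SOURCE_ID: &str = ".ingest-api";

/// Simplified stand-in for a source entry stored in the metastore.
#[derive(Debug, Clone, PartialEq)]
pub struct Source {
    pub source_id: String,
    pub source_type: String,
    /// Models the proposed `enable` parameter.
    pub enabled: bool,
}

/// Simplified stand-in for per-index metadata.
#[derive(Debug, Default)]
pub struct IndexMetadata {
    pub sources: Vec<Source>,
}

impl IndexMetadata {
    /// Creates index metadata with the default ingest API source attached.
    pub fn new_with_default_sources() -> Self {
        let mut index = IndexMetadata::default();
        index.ensure_ingest_api_source();
        index
    }

    /// Defensive and idempotent: adds the ingest API source only if no source
    /// with that type or id exists yet. Returns `true` if a source was added.
    pub fn ensure_ingest_api_source(&mut self) -> bool {
        let exists = self
            .sources
            .iter()
            .any(|s| s.source_type == "ingest-api" || s.source_id == INGEST_API_SOURCE_ID);
        if exists {
            return false;
        }
        self.sources.push(Source {
            source_id: INGEST_API_SOURCE_ID.to_string(),
            source_type: "ingest-api".to_string(),
            enabled: true,
        });
        true
    }

    /// Logic a `source enable/disable --source <SOURCE ID>` command could call.
    pub fn set_source_enabled(&mut self, source_id: &str, enabled: bool) -> Result<(), String> {
        match self.sources.iter_mut().find(|s| s.source_id == source_id) {
            Some(source) => {
                source.enabled = enabled;
                Ok(())
            }
            None => Err(format!("source `{}` not found", source_id)),
        }
    }

    /// Sources for which an indexing pipeline should be spawned.
    pub fn enabled_sources(&self) -> Vec<&Source> {
        self.sources.iter().filter(|s| s.enabled).collect()
    }
}
```

Disabling the source keeps its entry in the metadata, so re-enabling it later does not require recreating it, and only enabled sources would get a pipeline.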