
Fix the ingest api source

Open fmassot opened this issue 3 years ago • 2 comments

Quickwit automatically creates an ingest-api source in order to launch an indexing pipeline and receive requests on the ingest API.

This causes at least two issues and raises some questions/comments.

  1. It is currently possible to create an inconsistent ingest API source. Just run cargo r source create --index hdfs-logs --source-config ../ingest_source_config.yaml --config ../config/quickwit.yaml

with

source_id: ingest-api
source_type: ingest-api
params:
  batch_num_bytes_limit: 200000
  queues_dir_path: ./qwdata/queues
  index_id: non-existing-index-id
  2. A user can create multiple ingest sources on the same index.

We end up creating several pipelines that will never be used because they are not linked to the IngestApiService. Note that Quickwit automatically creates an ingest API pipeline for each index.

  1. It should be possible to disable the ingest API source.
  2. It should be possible to configure the ingest API source.

Proposition

Treating the ingest API as just a regular source seems to be the right way to do it. Here is what I propose to clean up the situation:

  • in order to have an ingest API, the user should add an ingest-api source.
  • we should allow only one ingest-api source per index
  • we should not let the user define an index_id or queues_dir_path in the source params
  • we could enforce .ingest-api as the source_id for this type of source, since there should be only one source of this type per index.
  • to remain user friendly, we can automatically add this type of source when the user creates an index. If users know what they are doing, we can provide a CLI argument like --no-ingest-api that disables the creation of the ingest API source.
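
A minimal sketch of how these rules could be checked when a source is added. The `SourceConfig` struct and the `validate_new_source` function are hypothetical and simplified for illustration; they are not Quickwit's actual types or API.

```rust
use std::collections::BTreeMap;

// Reserved ID and type for the ingest API source, as proposed above.
const INGEST_API_SOURCE_ID: &str = ".ingest-api";
const INGEST_API_SOURCE_TYPE: &str = "ingest-api";

/// Hypothetical, simplified source config (not Quickwit's real struct).
#[derive(Debug, Clone)]
pub struct SourceConfig {
    pub source_id: String,
    pub source_type: String,
    pub params: BTreeMap<String, String>,
}

/// Checks a new source against the proposed rules, given the sources
/// that already exist on the index.
pub fn validate_new_source(existing: &[SourceConfig], new: &SourceConfig) -> Result<(), String> {
    // The rules below only concern ingest-api sources.
    if new.source_type != INGEST_API_SOURCE_TYPE {
        return Ok(());
    }
    // Enforce the reserved source ID.
    if new.source_id != INGEST_API_SOURCE_ID {
        return Err(format!(
            "ingest-api source must be named `{}`, got `{}`",
            INGEST_API_SOURCE_ID, new.source_id
        ));
    }
    // These params are derived from the index and node config, not set by the user.
    for forbidden in ["index_id", "queues_dir_path"] {
        if new.params.contains_key(forbidden) {
            return Err(format!("param `{}` must not be set by the user", forbidden));
        }
    }
    // At most one ingest-api source per index.
    if existing.iter().any(|s| s.source_type == INGEST_API_SOURCE_TYPE) {
        return Err("index already has an ingest-api source".to_string());
    }
    Ok(())
}
```

With this check, the source config shown earlier would be rejected twice over: its source_id is not .ingest-api, and it sets index_id and queues_dir_path.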

fmassot Sep 18 '22 15:09

in order to have an ingest API, the user should add an ingest-api source.

That seems cumbersome. What about adding an enable_ingest_api parameter somewhere in the config and letting Quickwit handle everything else?

Otherwise I agree: there should be at most one ingest API source per index, named .ingest-api and managed by Quickwit.

guilload Sep 19 '22 13:09

Chatted off GitHub with @fmassot:

  • add enable parameter to source params
  • create one ingest API source per index by default in metastore
  • add defensive code to ensure ingest API source is always unique (type and id) per index
  • add CLI command source enable/disable --source <SOURCE ID>
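
The plan above could look roughly like the following sketch. `IndexMetadata`, `Source`, `add_source`, and `toggle_source` are hypothetical names chosen for illustration, and the in-memory `Vec` stands in for the metastore; none of this is Quickwit's actual API.

```rust
// Reserved ID and type for the ingest API source (assumed from the discussion).
const INGEST_API_SOURCE_ID: &str = ".ingest-api";
const INGEST_API_SOURCE_TYPE: &str = "ingest-api";

/// Hypothetical, simplified source entry with the proposed `enabled` flag.
#[derive(Debug, Clone, PartialEq)]
pub struct Source {
    pub source_id: String,
    pub source_type: String,
    pub enabled: bool,
}

/// Hypothetical, simplified index metadata as it might be stored in the metastore.
#[derive(Debug)]
pub struct IndexMetadata {
    pub index_id: String,
    pub sources: Vec<Source>,
}

impl IndexMetadata {
    /// Creates an index with exactly one ingest API source, enabled by default.
    pub fn new(index_id: &str) -> Self {
        let ingest_api_source = Source {
            source_id: INGEST_API_SOURCE_ID.to_string(),
            source_type: INGEST_API_SOURCE_TYPE.to_string(),
            enabled: true,
        };
        IndexMetadata {
            index_id: index_id.to_string(),
            sources: vec![ingest_api_source],
        }
    }

    /// Defensive check: source IDs are unique, and so is the ingest-api type.
    pub fn add_source(&mut self, source: Source) -> Result<(), String> {
        if self.sources.iter().any(|s| s.source_id == source.source_id) {
            return Err(format!("source `{}` already exists", source.source_id));
        }
        if source.source_type == INGEST_API_SOURCE_TYPE
            && self.sources.iter().any(|s| s.source_type == INGEST_API_SOURCE_TYPE)
        {
            return Err("index already has an ingest-api source".to_string());
        }
        self.sources.push(source);
        Ok(())
    }

    /// What a `source enable/disable --source <SOURCE ID>` command could call.
    pub fn toggle_source(&mut self, source_id: &str, enabled: bool) -> Result<(), String> {
        let source = self
            .sources
            .iter_mut()
            .find(|s| s.source_id == source_id)
            .ok_or_else(|| format!("source `{}` not found", source_id))?;
        source.enabled = enabled;
        Ok(())
    }
}
```

Disabling the ingest API then amounts to calling `toggle_source(".ingest-api", false)` on the index, and the source itself is never removed, so uniqueness stays easy to enforce.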

guilload Sep 19 '22 14:09