[8.0] Elastic: index creation only at indexing time

Open fstagni opened this issue 1 year ago • 6 comments

BEGINRELEASENOTES

*Monitoring FIX: Elastic: index creation will only be done at indexing time

ENDRELEASENOTES

fstagni avatar Feb 08 '24 15:02 fstagni

What is meant by "at indexing time" here?

Is the idea to switch to index templates? If so, the wiki should be updated as I don't see any automatic creation of the template?

chrisburr avatar Mar 18 '24 13:03 chrisburr

The goal is to not create indices with 0 documents. There is indeed no automatic creation of the template, which would be needed for new installations IIUC. This is something for the documentation rather than for the wiki.

fstagni avatar Mar 18 '24 14:03 fstagni

Won't creating the template be needed for existing installations as well?

chrisburr avatar Mar 18 '24 21:03 chrisburr

I am not sure, TBH. We can try in our test setups

fstagni avatar Mar 19 '24 10:03 fstagni

> I am not sure, TBH. We can try in our test setups

Yesterday I applied this patch to the DIRAC certification setup and simply restarted everything. Today I see some new indices created (with today's date), even for indices that do not have a template.

fstagni avatar Mar 20 '24 09:03 fstagni

I would expect the "schema" (or whatever it's called) of the index to be incorrect?

chrisburr avatar Mar 20 '24 14:03 chrisburr

Index templates are not strictly necessary, even if they have some advantages. I have created https://github.com/DIRACGrid/DIRAC/pull/7652 for that; have a look.

fstagni avatar Jun 06 '24 14:06 fstagni
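
For readers unfamiliar with index templates, here is a minimal sketch of what a composable index template body can look like and how a date-suffixed index name would match its pattern. The prefix `example_monitoring`, the field names, the shard count, and the daily naming scheme are all assumptions made for illustration; they are not taken from DIRAC or from the linked pull request. The dictionary follows the general shape accepted by the `_index_template` endpoint, and nothing here contacts a cluster.

```python
from datetime import date
from fnmatch import fnmatch


def build_index_template(prefix):
    """Return a composable index template body (hypothetical fields)."""
    return {
        # Any index whose name matches this pattern picks up the template.
        "index_patterns": [f"{prefix}-*"],
        "priority": 100,
        "template": {
            "settings": {"number_of_shards": 1},
            "mappings": {
                "properties": {
                    # Explicit types avoid relying on dynamic mapping.
                    "timestamp": {"type": "date"},
                    "Status": {"type": "keyword"},
                }
            },
        },
    }


def daily_index_name(prefix, day):
    """Build a date-suffixed index name (assumed naming scheme)."""
    return f"{prefix}-{day:%Y-%m-%d}"


template = build_index_template("example_monitoring")
name = daily_index_name("example_monitoring", date(2024, 3, 20))

# Shell-style matching is enough to illustrate a trailing-wildcard pattern.
matches = any(fnmatch(name, pattern) for pattern in template["index_patterns"])
print(name, matches)
```

Without such a template, an index created by the first indexed document gets its field types from dynamic mapping, which is presumably why indices can still appear when no template exists.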

I just had a chat with one of the OpenSearch guys at CERN (@spapadop). He mentioned that for data that rolls around, data streams would be the way to go instead of indices. https://opensearch.docs.cern.ch/data_ingestion/#data-streams

The example was the "dataoperation" that I intend to keep around only for a short time, like a month. As far as I understand, the data stream would just let one keep one "file", and OpenSearch would prune old data after the expiration time is reached.

andresailer avatar Jun 24 '24 15:06 andresailer

Yes, using data streams is the plan for us as well (I also mentioned it at the workshop), but in order to use them, index templates first have to be in place.

fstagni avatar Jun 25 '24 15:06 fstagni
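
To illustrate why index templates come first, here is a rough sketch of a template that declares a data stream, together with a document carrying the `@timestamp` field that data streams use by default. The pattern `example-dataoperation*`, the priority, and the `OperationType` field are hypothetical. Retention (pruning after an expiration time) is configured separately on the cluster and is not shown here.

```python
from datetime import datetime, timezone


def build_data_stream_template(pattern):
    """Return an index template body that enables a data stream."""
    return {
        "index_patterns": [pattern],
        # The presence of this key marks matching names as data streams.
        "data_stream": {},
        "priority": 200,
    }


def make_document(payload, when):
    """Copy the payload and attach the timestamp field in UTC."""
    doc = dict(payload)
    doc["@timestamp"] = when.astimezone(timezone.utc).isoformat()
    return doc


ds_template = build_data_stream_template("example-dataoperation*")
doc = make_document(
    {"OperationType": "example"},
    datetime(2024, 6, 25, 15, 0, tzinfo=timezone.utc),
)
print(ds_template)
print(doc)
```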

Included in #7678

fstagni avatar Aug 13 '24 09:08 fstagni