
Properly rotate index if using only RT indexes (ALTER RENAME)

Open usatenko opened this issue 3 years ago • 23 comments

I am having trouble finding a proper way to rotate data in an RT index.
For the sake of speed, I index data into the RT index directly (sending concurrent queries with bulk inserts).
But when I need to reindex the data from scratch, I can't just create a new table "LIKE" the current one, fill it with data, and then replace the old table with the newly filled one.

What is the correct way to do this?

usatenko avatar Apr 09 '21 10:04 usatenko

You are probably looking for something like ALTER TABLE RENAME (as in MySQL and PostgreSQL)? Unfortunately, it's not supported yet. If you elaborate on why it's important in your case, we can consider this a feature request.

sanikolaev avatar Apr 09 '21 10:04 sanikolaev

I don't remember the exact details now, but I'm pretty sure I have had success using 'ATTACH' to move data from one RT index to another.

Each 'shard' of an RT index is effectively a 'plain' index, so just move each shard separately. (Or use OPTIMIZE to merge everything into a single shard, probably right after FLUSH RAMCHUNK!)

I think I had to create a 'fake' index for the shard.

I.e., if the 'source' RT index has a path like

 index reindex {
    path = /var/data/rtindex
    # ... other settings ...
 }

then you can create a new 'fake' plain index for each shard

 index rtindex_shard0 {
    type = plain
    path = /var/data/rtindex.0
    # ... other settings ...
 }

This new fake index then uses the data from that single shard, and you can attach it:

 ATTACH INDEX rtindex_shard0 TO RTINDEX newindex;

(I'm only working from memory here, so there are probably some more details, including faffing around with restarting searchd, as it doesn't like converting an RT shard to a plain index while running.)

I've never tried it, but perhaps you could just rename all the index files to 'rename' an index. It would have to be done while searchd is shut down, and don't forget to update the config file!

barryhunter avatar Apr 09 '21 11:04 barryhunter

ALTER TABLE RENAME is exactly what I am looking for. @sanikolaev, the use case here is the same as the standard way of building an RT index: index a plain index, then attach it to an RT index. The limitation of that method is the time it takes to get the plain index ready, because the records (e.g. from a CSV file) have to be processed sequentially, line by line. When I index the data, I create an RT index and push my records directly into it concurrently, 10 at a time, in batch inserts, so indexing is much faster.

But it is inconvenient, because there is no way to swap my old RT index for the newly prepared one once it is filled and ready. I hope this convinces you to implement ALTER TABLE RENAME. I'm not sure, but from my point of view it should not be anything extraordinary, since RT indexes are stored separately in Manticore's storage anyway.

usatenko avatar Apr 19 '21 01:04 usatenko

@barryhunter, as I remember, I tried this attach option, but without FLUSHing. I will try it, many thanks.

usatenko avatar Apr 19 '21 01:04 usatenko

OK, I have tested the approach with attaching an index. In my case, I do not have any preconfigured settings in the Manticore config.

 mysql> CREATE TABLE a (a text);
 mysql> CREATE TABLE b (a text);
 mysql> FLUSH RAMCHUNK a;
 mysql> FLUSH RAMCHUNK b;
 mysql> ATTACH INDEX a TO RTINDEX b;
 ERROR 1064 (42000): 1st argument to ATTACH must be a plain index

So I have no way to do it with the ATTACH command.

usatenko avatar Jun 03 '21 06:06 usatenko

Can ALTER RENAME be implemented?

usatenko avatar Jun 09 '21 09:06 usatenko

Yes, it can, but it's not in our near-term plans. If it's mission-critical for you, we provide professional services, including custom feature development: https://manticoresearch.com/services

sanikolaev avatar Jun 09 '21 09:06 sanikolaev

OK, we will wait until you consider adding it.

usatenko avatar Jul 12 '21 17:07 usatenko

Fixed in https://github.com/manticoresoftware/manticoresearch/commit/af5812cc0affbb74f31d6f9d4f0ea34f6ec64a0f. ATTACH now supports an RT index, as well as a plain index, as the source. After

ATTACH TABLE rt1 TO TABLE rt2

the rt1 table becomes empty in the daemon and all of its data (disk chunks) is moved into the rt2 table.

tomatolog avatar Jan 11 '24 16:01 tomatolog
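
For anyone landing here with the original rotation question, the RT-to-RT ATTACH described above could be combined into a reindex-and-swap flow roughly like the sketch below. This is only a sketch under assumptions, not something confirmed in this thread: the table names `products` and `products_new` are hypothetical, and it assumes your Manticore version accepts `CREATE TABLE ... LIKE` and the `WITH TRUNCATE` option of ATTACH for an RT source. Check the ATTACH documentation for your version before relying on it.

```sql
-- Hypothetical names: "products" is the live RT table, "products_new" is the staging table.

-- 1. Create an empty staging table with the same schema as the live one.
CREATE TABLE products_new LIKE products;

-- 2. Fill products_new from scratch here, e.g. with concurrent batch INSERTs.

-- 3. Move the staged data into the live table.
--    WITH TRUNCATE (assumed to be available for RT sources) empties the
--    target table before the staged disk chunks are attached.
ATTACH TABLE products_new TO TABLE products WITH TRUNCATE;

-- 4. The staging table is empty after ATTACH and can be dropped.
DROP TABLE products_new;
```

Whether searches against `products` can observe a brief empty window during step 3 is not covered in this thread, so it is worth testing under load.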