manticoresearch
Properly rotate index if using only RT indexes (ALTER RENAME)
I am having trouble finding a proper way to rotate data in an RT index.
For the sake of speed, I index data into the RT index directly (sending concurrent queries with bulk inserts).
But when I need to reindex the data from scratch, I can't just create a new table "LIKE" the current one, fill it with data, and then replace the old table with the new one.
What is the correct way to do this?
You are probably looking for something like ALTER TABLE RENAME (as in MySQL and PostgreSQL)? Unfortunately, it's not supported yet. If you elaborate on why it's important in your case, we can consider this a feature request.
I don't remember the exact details now, but I'm pretty sure I have had success using 'ATTACH' to move data from one RT index to another.
Each 'shard' of an RT index is effectively a 'plain' index, so just move each shard separately (or use OPTIMIZE to merge down to just one shard, probably right after FLUSH RAMCHUNK!).
I think I had to create a 'fake' index for each shard.
i.e. if the 'source' RT index has a path like
index reindex {
path = /var/data/rtindex
then you can create a new 'fake' index for each shard
index rtindex_shard0 {
type = plain
path = /var/data/rtindex.0
This new fake index then uses the data from that single shard, and you can attach it:
ATTACH INDEX rtindex_shard0 TO RTINDEX newindex;
(I'm only working from memory here, so there are probably more details, including faffing around with restarting searchd, as it doesn't like converting an RT shard to a plain index while running.)
I've never tried it, but perhaps you could just rename all the index files to 'rename' an index. That would have to be done while searchd is shut down, and don't forget to update the config file!
ALTER TABLE RENAME is exactly what I am looking for. @sanikolaev well, the use case here is the same as the standard way of getting an RT index: index into a plain index, then attach it to an RT index. The limitation of that method is how long it takes to get the plain index ready, since it requires processing records sequentially, e.g. from a CSV file, line by line. When I index the data, I create an RT index and push my records directly into it concurrently, 10 at a time in batch inserts, so indexing is much faster. But it is a bit inconvenient, as there is no way to swap my old RT index for the newly prepared one as soon as it is filled and ready. I hope this convinces you to implement the ALTER TABLE RENAME feature. I'm not sure, but from my point of view it is nothing extraordinary, since RT indexes are stored separately in the Manticore storage anyway.
@barryhunter as I remember, I tried this attach option, but without FLUSHing. I will try it, many thanks.
OK, I have tested the approach with attaching an index.
Note that in my case I do not have any preconfigured settings in the Manticore config.
mysql> CREATE TABLE a (a text);
mysql> CREATE TABLE b (a text);
mysql> FLUSH RAMCHUNK a;
mysql> FLUSH RAMCHUNK b;
mysql> ATTACH INDEX a TO RTINDEX b;
ERROR 1064 (42000): 1st argument to ATTACH must be a plain index
So I have no option to do it with the ATTACH command.
Can ALTER RENAME be implemented?
Yes, it can. But it's not in our near-term plans. If it's mission-critical for you, we provide professional services, including custom feature development - https://manticoresearch.com/services
OK, we will wait until you consider adding it.
Fixed at https://github.com/manticoresoftware/manticoresearch/commit/af5812cc0affbb74f31d6f9d4f0ea34f6ec64a0f. Now ATTACH supports an RT index as the source, along with a plain index. After
ATTACH TABLE rt1 TO TABLE rt2
the rt1 table becomes empty at the daemon and all its data (disk chunks) is moved into the rt2 table.
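For anyone landing here with the original rotation question, the flow might look roughly like the sketch below. This is not an official recipe: the table names live and staging are hypothetical, the explicit FLUSH RAMCHUNK is a precaution (the fix may already handle the RAM chunk on its own), and it assumes a version that includes the commit above.

```sql
-- Hypothetical names: "live" serves queries, "staging" is the rebuild target.

-- 1. Create an empty table with the same schema and settings as the live one.
CREATE TABLE staging LIKE live;

-- 2. Bulk-insert the fresh data into staging here (concurrent batch INSERTs).

-- 3. Precaution (assumption): push staging's RAM chunk to disk before moving it.
FLUSH RAMCHUNK staging;

-- 4. Empty the live table, then move staging's disk chunks into it.
TRUNCATE TABLE live;
ATTACH TABLE staging TO TABLE live;

-- 5. staging is empty after the attach and can be dropped.
DROP TABLE staging;
```

Note that between TRUNCATE and ATTACH the live table is briefly empty, so queries in that window return nothing. The ATTACH documentation also describes a WITH TRUNCATE option for the target table, which may close that gap, but check whether it applies to RT sources in your version before relying on it.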