
segment gc when ts_cnt > 1

Open vagetablechicken opened this issue 1 year ago • 2 comments

https://github.com/4paradigm/OpenMLDB/blob/21184d56251cd96088d787dfdb32527c84c78467/src/storage/segment.cc#L437-L458

Ref: https://utqcxc5xn1.feishu.cn/docx/FTbtdV25eoZDkjxODpCc44qhnlc . Suppose a table has indexes on the same key but with different ts columns, e.g.

CREATE TABLE talkingdata(
    ip int,app int,device int,os int,channel int,click_time timestamp,attributed_time timestamp,is_attributed int,
    index(key=(ip), ts=click_time, ttl=1s, ttl_type=absolute),
    index(key=(ip), ts=attributed_time),
    index(key=(app,os), ts=click_time)
);

index0 and index1 will be in the same segment, with ts_cnt_==2, so segment gc will trigger GcAllType, which uses the wrong expire time.

When ts_cnt_<=1, ExecuteGc calculates the expire time: https://github.com/4paradigm/OpenMLDB/blob/21184d56251cd96088d787dfdb32527c84c78467/src/storage/segment.cc#L398-L408

But GcAllType doesn't. It does gc with a small time (the raw ttl value, not the expire time; e.g. with ttl=1m the time value falls on 1970-01-01). Normally no row is collected, because every row ts > that small time, so the data never expires. You can check this with show table status.
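To make the difference concrete, here is a minimal sketch. It is not the OpenMLDB code: `ExpireTimeMs` and `CountExpired` are hypothetical helpers, and the sketch assumes millisecond timestamps, that ttl=0 means "never expire", and that a row is collected when its ts is strictly older than the cutoff.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: for an absolute TTL, the gc cutoff should be
// "now - ttl". Assumption of this sketch: ttl == 0 means "never expire".
constexpr uint64_t ExpireTimeMs(uint64_t now_ms, uint64_t ttl_ms) {
    if (ttl_ms == 0 || ttl_ms >= now_ms) return 0;
    return now_ms - ttl_ms;
}

// Hypothetical helper: counts rows strictly older than the cutoff,
// i.e. the rows a gc pass with that cutoff would collect.
inline std::size_t CountExpired(const std::vector<uint64_t>& row_ts_ms,
                                uint64_t cutoff_ms) {
    std::size_t n = 0;
    for (uint64_t ts : row_ts_ms) {
        if (ts < cutoff_ms) ++n;
    }
    return n;
}

// Usage example: one row is 2 minutes old, one is 30 seconds old, ttl is 1m.
inline void CheckExample() {
    const uint64_t now_ms = 1716346800000ULL;  // some time in May 2024
    const uint64_t ttl_ms = 60ULL * 1000;      // ttl=1m
    const std::vector<uint64_t> rows = {now_ms - 120000, now_ms - 30000};

    // Raw ttl used as the cutoff: 60000 ms is 1970-01-01 00:01:00,
    // so no row is older than it and nothing is collected.
    assert(CountExpired(rows, ttl_ms) == 0);

    // Cutoff computed as now - ttl: the 2-minute-old row is collected.
    assert(CountExpired(rows, ExpireTimeMs(now_ms, ttl_ms)) == 1);
}
```

If this reading is right, a fix would presumably have the multi-ts path compute a cutoff per ts column the same way the ts_cnt_<=1 path does before collecting rows.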

vagetablechicken avatar May 22 '24 03:05 vagetablechicken

Hello, I'd like to join the project by working on this. Could you assign me the issue? Thank you

Riccardo231 avatar Dec 26 '24 19:12 Riccardo231

> Hello, I'd like to join the project by working on this. Could you assign me the issue? Thank you

@ricor07 thanks for being willing to contribute~

@Shouren @aceforeverd is this issue unassigned? Could we let @ricor07 take a look at this and work on a fix?

BTW, let me know if I can help.

vagetablechicken avatar Jan 02 '25 04:01 vagetablechicken