Andrew Kryczka

Results: 126 comments by Andrew Kryczka

Thanks for the info. Does TiKV also use an external syncer thread, that is, setting `manual_wal_flush==true` + a loop of `FlushWAL(true /* sync */)`? Did your LOG file contain any...

This is good info. I think 359 missing records should mean the corruption was near the tail. It might be the number of records between the last `FlushWAL(true /* sync...

Hi, yes, it can. When the next WAL sync is requested, though, we fsync in order from the oldest unsynced WAL up to the active WAL.
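The sync ordering described above can be sketched with a small in-memory model. This is only an illustration, not RocksDB code: `WalSet`, `AddWal`, `MarkSynced`, and `SyncAll` are hypothetical names, and the "fsync" is simulated by flipping a flag.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical model of the live WAL files, keyed by WAL number.
// The highest number stands in for the active WAL.
class WalSet {
 public:
  // Register a new, not-yet-synced WAL.
  void AddWal(uint64_t number) {
    assert(wals_.find(number) == wals_.end());
    wals_[number] = false;
  }

  // Mark a single WAL as already synced (e.g. by an earlier sync request).
  void MarkSynced(uint64_t number) { wals_.at(number) = true; }

  // Handle a sync request: visit WALs from oldest to newest and "fsync"
  // each unsynced one, finishing with the active WAL.
  // Returns the WAL numbers in the order they were synced.
  std::vector<uint64_t> SyncAll() {
    std::vector<uint64_t> order;
    for (auto& entry : wals_) {  // std::map iterates in ascending key order
      if (!entry.second) {
        entry.second = true;  // a real implementation would call fsync here
        order.push_back(entry.first);
      }
    }
    return order;
  }

 private:
  std::map<uint64_t, bool> wals_;  // WAL number -> synced?
};
```

In this model, an older WAL can stay unsynced while a newer one is active, and the next `SyncAll()` call closes that gap oldest-first.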

Yes, you are right. As you described it, there is a valid scenario where point-in-time recovery encounters "Corruption: checksum mismatch" and does not proceed to recover the most recent log....

Replied there. Feel free to follow up there if I missed something.

> some WAL files' size in manifest does not match the actual one---the actual size is bigger than the one in manifest.

Do you have an example error message? I...

The short answer is, no, what you're checking is not guaranteed. Longer answer: the synced size recorded in the MANIFEST is just a lower bound on the latest synced offset....
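Because the recorded size is only a lower bound, an external check should test for "at least", not "equal". A minimal sketch, assuming the caller has already read both sizes; `CheckWalSize` and `WalSizeCheck` are hypothetical names, not RocksDB APIs:

```cpp
#include <cstdint>

// Hypothetical result of comparing a WAL file against its MANIFEST entry.
enum class WalSizeCheck {
  kOk,        // file holds at least the bytes recorded as synced
  kTooSmall,  // file is shorter than the recorded synced size
};

// The MANIFEST synced size is treated as a lower bound, so a larger
// actual file size is expected and is not reported as an error.
inline WalSizeCheck CheckWalSize(uint64_t manifest_synced_size,
                                 uint64_t actual_file_size) {
  return actual_file_size >= manifest_synced_size ? WalSizeCheck::kOk
                                                  : WalSizeCheck::kTooSmall;
}
```

Under this assumption, only a file shorter than its recorded synced size would be worth flagging.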

Yes, I agree the MANIFEST synced size should eventually match the actual synced size (with that fix).

> What do you mean by special casing for inactive WALs

Currently this...

There isn't a specific error message for missing records because we don't have a way to detect that in general. (Extra detail about how we're thinking about doing it) We...

> There isn't a specific error message for missing records because we don't have a way to detect that in general

There are some ways that can catch recovery not...