Marc Handalian

Results 71 comments of Marc Handalian

I'm able to see this consistently when running benchmarks, but it's more difficult getting the primary into this state within an IT. I think at a minimum we should add...

OK, I figured out a bit more of what's going on here. I think there are two things lacking. 1. This should not fail the replica - the replica should gracefully retry....

I'm thinking we could do something like this on the primary to return the on-disk infos if the generation is higher than what's in memory - ```java @Override public GatedCloseable...
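
The comment's own snippet is cut off above, so here is only a minimal, self-contained sketch of the comparison it describes: prefer the on-disk infos when their generation is higher than the in-memory copy. `SegmentInfosChooser`, `Snapshot`, and `choose` are hypothetical names made up for illustration; this is not the OpenSearch or Lucene API, and the real change would work with the engine's actual `SegmentInfos` handling.

```java
import java.util.Objects;

// Illustrative only: hypothetical types, not OpenSearch or Lucene classes.
final class SegmentInfosChooser {

    /** Hypothetical stand-in for a segment-infos snapshot with a generation number. */
    static final class Snapshot {
        final long generation;
        final String source;

        Snapshot(long generation, String source) {
            this.generation = generation;
            this.source = source;
        }
    }

    private SegmentInfosChooser() {}

    /**
     * Returns the on-disk snapshot only when its generation is strictly higher
     * than the in-memory one; otherwise returns the in-memory snapshot.
     */
    static Snapshot choose(Snapshot inMemory, Snapshot onDisk) {
        Objects.requireNonNull(inMemory, "inMemory");
        if (onDisk != null && onDisk.generation > inMemory.generation) {
            return onDisk;
        }
        return inMemory;
    }

    public static void main(String[] args) {
        Snapshot picked = choose(new Snapshot(3, "memory"), new Snapshot(5, "disk"));
        System.out.println(picked.source + " @ generation " + picked.generation);
    }
}
```

With equal generations the sketch keeps the in-memory copy, on the assumption that it is at least as fresh as the last commit; whether that holds in the actual engine is exactly what the thread is debating.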

> @mch2: This bug seems to be interesting. Is the in-memory `SegmentInfos` falling behind the on-disk copy an expected state? I think this is happening when the primary commits without...

@dreamer-89 I've drafted a PR that could solve this by only sending segments over the wire and ignoring commit points - https://github.com/opensearch-project/OpenSearch/pull/4366. I think this is ok because we send...

Hi @hydrogen666, thanks for the question! In the scenario you describe, once the tlog recovery completes on A, it would publish a new checkpoint notification to replicas. The published checkpoint...

Thanks all for your thoughts here. I like the idea of using both 2 and 3. I don't think implementing 3 will be all that hard because each replica already...

Reviving this issue as the first phase of segrep is nearly merged into main. Tracked with #2355. When that issue is complete we will have a basic implementation with primary...

Checklist for me as I go through this...
- [x] commit SegmentInfos on the replica storing checkpoints.
- [x] Write a test asserting the engine type is swapped during failover....

Brainstorming an implementation of this within our current segrep architecture. This is just high level, would need to POC this to see how we could refactor & make this fit....