Marc Handalian comments

Results: 41 comments by Marc Handalian

[Segment replication] Remove unnecessary fsync calls

@Poojita-Raj So the failed tests in SegmentReplicationIndexShardTests are because we are using LuceneTestCase's `newFSDirectory`, which on close runs a [checkIndex](https://github.com/apache/lucene/blob/main/lucene/test-framework/src/java/org/apache/lucene/tests/store/MockDirectoryWrapper.java#L909) that will run various tests to identify corruption. We aren't...

[Segment replication] Remove unnecessary fsync calls

I think we can actually look at disabling fsyncs entirely on replicas if we are triggering a commit/fsync when the shard closes. That would make this implementation much simpler by...

[Segment Replication] Peer recovery checkpoint publication invariants

@Rishikesh1159 I don't think you need to change the visibility of the replication tracker. `SegmentReplicationTargetService` invokes `IndexShard#shouldProcessCheckpoint`; could you add a check there?

Feature flags should be defined in opensearch.yml

@tan31989 Thanks for your interest in this issue.

> Hi,
>
> I am giving my thoughts on this issue. Ideally, feature flags should be allowed to toggle, but with...

[Segment Replication] Primary promotion on shard failing during node removal in RoutingNodes#failShard

I think 2 is the best option given we want this as a best effort. I also wouldn't be worried about it delaying shard promotion right now, we can set...

[Segment Replication] Review Cross Cluster Replication compatibility with Segment Replication Enabled

> I think the issue is that when we stop the replication for CCR, we close and reopen the index to reload the Engine. The index close is being handled...

Support shard promotion with Segment Replication.

I've added a commit to this ensuring cancelling primary allocation succeeds and that the replica is promoted & primary recreated as a replica. In testing that I found we were...

[Segment Replication BUG] Replica shard fails during segment replication during indexing / bulk indexing calls

@dreamer-89 Yeah, this should not be failing the replica; it would catch up to the new checkpoint after the current replication event completes. I think this is happening because we...

[Segment Replication BUG] Replica shard fails during segment replication during indexing / bulk indexing calls

I have opened #4182 to cover moving this to allocationID instead of node. I have not been able to repro after applying this change, but I think we should leave this open...

[Segment Replication BUG] Replica shard fails during segment replication during indexing / bulk indexing calls

Closing this one because I haven't seen it since; please reopen if needed.