daos
DAOS-11796 control: Consistently resolve MS replica addresses
In environments where an access_points hostname can resolve to multiple IP addresses in a nondeterministic order, MS peers may fail to recognize each other. This patch works around the problem by pinning each replica to the lowest IP address in the set of addresses its hostname resolves to.
Signed-off-by: Michael MacDonald [email protected]
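For illustration, the pinning approach described above might look roughly like the following Go sketch. This is not the code from the patch: the function names `pickLowest` and `resolveReplica` are hypothetical, and the sketch assumes that "lowest" means the numerically smallest address when compared byte by byte in 16-byte form.

```go
package main

import (
	"bytes"
	"fmt"
	"net"
	"sort"
)

// pickLowest (hypothetical helper) returns the numerically lowest IP among
// the given address strings. Addresses are compared as 16-byte values, so
// "10.0.0.2" sorts before "10.0.0.10", unlike a plain string comparison.
func pickLowest(addrs []string) (string, error) {
	ips := make([]net.IP, 0, len(addrs))
	for _, a := range addrs {
		ip := net.ParseIP(a)
		if ip == nil {
			return "", fmt.Errorf("invalid IP address %q", a)
		}
		ips = append(ips, ip.To16())
	}
	if len(ips) == 0 {
		return "", fmt.Errorf("no addresses to choose from")
	}
	sort.Slice(ips, func(i, j int) bool {
		return bytes.Compare(ips[i], ips[j]) < 0
	})
	return ips[0].String(), nil
}

// resolveReplica (hypothetical helper) resolves a replica hostname and pins
// it to one address, so every peer that sees the same set of addresses
// arrives at the same choice regardless of the order the resolver returns.
func resolveReplica(host string) (string, error) {
	addrs, err := net.LookupHost(host)
	if err != nil {
		return "", fmt.Errorf("resolving %s: %w", host, err)
	}
	return pickLowest(addrs)
}
```

Calling `resolveReplica` with each access_points hostname would give every MS peer the same address for a given replica, provided all peers see the same set of addresses for that hostname. The sketch rejects addresses that `net.ParseIP` cannot parse (for example, IPv6 addresses with a zone suffix) rather than guessing how they should be ordered.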
Bug-tracker data: Ticket title is 'daos container create failed: "DER_NONEXIST(-1005): The specified entity does not exist"' Status is 'In Review' Labels: 'HPE_dep,tds,triaged' https://daosio.atlassian.net/browse/DAOS-11796
Test stage NLT completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-10535/1/execution/node/831/log
Is there a master version of this PR, too? Or did that already land?
No, not yet. I was initially thinking that this was just a workaround for the 2.2.x series, and I wanted to get feedback on the approach from Aurora testing. I may go ahead and land it on master while I consider a more intrusive change for the 2.4+ series. I haven't yet decided whether or not it's an issue that the raft stuff only knows about a single IP address for a given hostname. Maybe it's actually fine?
> I haven't yet decided whether or not it's an issue that the raft stuff only knows about a single IP address for a given hostname. Maybe it's actually fine?
When I was looking into similar issues previously, I had gotten the idea into my head that we'd probably want raft to know about all the IPs associated with a hostname so we could match against any of them. But with this solution I don't think that would be necessary.