[dns]: test if dns lb actually works
Run the DNS load balancer against all supported languages and SDK versions to make sure that it works as intended.
Document the findings in dns/README.md.
To be done after https://github.com/scylladb/alternator-load-balancing/issues/14
It's easy to test that the DNS server is working by running "dig" against it, for example dig @localhost somename.com, but as you noted it's more difficult to confirm that it "actually" works "as intended". The SDK uses this DNS through several layers: Amazon's SDK code, Python's HTTP, URL and socket libraries, and Linux's glibc and resolver. We need to confirm that after going through all of these, it actually does what we hope it achieves. In particular, we should check:
- whether the low TTL is honored.
- how multiple A records in a response are handled (as #14 adds).
- how multiple threads or processes on the same machine cache or don't cache the DNS response.
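As a starting point for the checks listed above, here is a minimal sketch of a probe that resolves the same name repeatedly through the operating system's resolver and records what each lookup returned. It only exercises Python's socket module and the C library underneath it, not the AWS SDK or HTTP layers, so it is a lower bound on the caching we might see. The function names, the hostname `alternator.example` and port 8000 are all assumptions made for illustration.

```python
import socket
import time


def resolve_ipv4(hostname, port=8000):
    """Return the sorted IPv4 addresses the OS resolver reports for hostname."""
    infos = socket.getaddrinfo(
        hostname, port, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


def probe(hostname, rounds=10, delay=1.0, resolver=resolve_ipv4, sleep=time.sleep):
    """Resolve hostname `rounds` times, pausing `delay` seconds between lookups."""
    answers = []
    for i in range(rounds):
        answers.append(resolver(hostname))
        if i < rounds - 1:
            sleep(delay)
    return answers


def summarize(answers):
    """Condense a list of lookup results into a few numbers worth reporting."""
    distinct = sorted({addr for answer in answers for addr in answer})
    # How often two consecutive lookups disagreed; zero suggests caching somewhere.
    changes = sum(1 for prev, cur in zip(answers, answers[1:]) if prev != cur)
    return {
        "distinct_addresses": distinct,
        "max_records_per_answer": max((len(a) for a in answers), default=0),
        "answer_changes": changes,
    }
```

Calling `summarize(probe("alternator.example", rounds=20, delay=2.0))` with a delay longer than the configured TTL should show whether answers change between lookups. Running several copies in parallel threads or processes would cover the third item in the list.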
One of the specific things I want to verify in this issue is whether a DNS server that returns all nodes (such as dns-loadbalancer-rr.py added for #14) has any advantages over a simpler one that just returns one node (such as dns-loadbalancer.py). Specifically, when DNS responses might be cached in one of the many layers an SDK uses (name server, operating system, C library, high-level language library, HTTP library, AWS SDK, etc.), I want to see whether the one-node-returning DNS server is more vulnerable to caching (where different connections, processes or even client machines all use the same Scylla node) than a server returning the list of all live nodes.
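To compare the two servers we need some number that says how unevenly requests were spread. A minimal sketch, assuming the test can record which node served each request (how it records that is left open here, and the function names are hypothetical):

```python
from collections import Counter


def node_shares(observed_nodes):
    """Map each node to the fraction of observed requests it served."""
    counts = Counter(observed_nodes)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {node: n / total for node, n in counts.items()}


def busiest_share(observed_nodes):
    """Fraction of requests that went to the single most used node."""
    shares = node_shares(observed_nodes)
    return max(shares.values(), default=0.0)
```

With N live nodes, a `busiest_share` close to 1/N means the load was spread evenly, while a value close to 1.0 means some layer pinned everything to one node. Running the same workload against both DNS servers and comparing this value would answer the question above.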
Importantly, it's not OK to override the DNS used by a test by monkey-patching SDK code, because we might monkey-patch the wrong code in the wrong layer. Rather, we need to force the SDK to use our DNS server using the established operating-system way to choose a DNS server for the entire test application. This is normally /etc/resolv.conf - but it would be really sad if this test needed to mess around with the real /etc/resolv.conf. Maybe we need to run the test in a chroot jail or container or something, but perhaps a cleaner way is to use a bind mount to shadow only the /etc/resolv.conf file and nothing else.
We are using the docker --dns flag exactly like that in SCT:
https://github.com/scylladb/scylla-cluster-tests/blob/052e07b4b188e2998ed933d5f4c3faabb76a1b57/sdcm/ycsb_thread.py#L266
And the Java cache disablement: https://github.com/scylladb/scylla-cluster-tests/blame/052e07b4b188e2998ed933d5f4c3faabb76a1b57/docker/ycsb/Dockerfile#L30
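For the tests in this repository, the same container approach could look roughly like the sketch below. It builds a `docker run` command that points the whole container at the test DNS server through the `--dns` flag, so nothing inside the SDK has to be patched. The image name, DNS address and test script are placeholders, and the helper name is hypothetical.

```python
import shlex


def docker_run_with_dns(image, dns_ip, command):
    """Build a `docker run` argument list that uses dns_ip as the container's DNS server."""
    return ["docker", "run", "--rm", "--dns", dns_ip, image, *command]


def as_shell(argv):
    """Render the argument list as a copy-pasteable shell command."""
    return shlex.join(argv)
```

For example, `as_shell(docker_run_with_dns("python:3", "172.17.0.2", ["python3", "probe.py"]))` gives a command line that can be run by hand, and the list itself can be passed to `subprocess.run` from a test harness.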