weave
weave copied to clipboard
dns-remove fails if container is not running
This is not a problem anybody should ever run into. If weave fails to remove entries for dead containers, we have a problem of much greater magnitude.
But there's also no reason for a remove operation to not be idempotent. Either it's there and it removes the entry, or it isn't there and there's nothing to do.
But there's also no reason for a remove operation to not be idempotent.
Sure. I didn't close this issue ;)
Are all the other remove variants idempotent?
Noting that this came up in the context that WeaveDNS did hold (through some other issue) entries for a dead container, so it was frustrating that dns-remove refused to remove them.
We do not know whether the container in question was gone completely or just not running.
Are we saying that weave dns-remove [<ip_address> ...] <container_id> [-h <fqdn>] should remove all DNS records pertaining to <container_id> when <container_id> isn't running or doesn't exist, regardless of the other options specified? Or should it only remove the records matching the args, even though no DNS records should be existing at all for a non-running/gone container?
And what should we do if the user specifies a container name but that name does not exist? Should we do nothing and a) succeed, or b) fail? What if it's a partial container id? In theory we could attempt to weed out DNS entries matching that.
Also, should the same logic apply to weave detach?
IMO this is an insane amount of fiddly work to put into coping with a situation that should not exist.
strawman:
- it should only remove the records matching the args, even though no DNS records should be existing at all for a non-running/gone container
- dns records should be matched by container id, short-id, or name.
re: weave detach probably, yes.
"should not" != "does not"
dns records should be matched by container id, short-id, or name.
We don't record container names in DNS.
"should not" != "does not"
I still think this is complete overkill. Don't forget that the user already has a way to clear any garbage: bounce the router. That will not only address this very particular problem but a range of other ills.
I got the same problem with my dns entries
dtk-0 192.166.0.1 64063d7ed392 3a:ca:39:55:88:70
es-data-1 192.164.179.54 cb54a9a230f8 1a:ef:98:09:64:c5
es-data-1 192.172.51.54 d7fc0e5da735 2e:6a:39:aa:a5:11
es-data-2 192.166.51.53 77d2ee8d4882 da:1f:2b:60:11:02
es-data-3 192.172.51.57 c39a3b3cba68 2e:6a:39:aa:a5:11
es-data-4 192.171.153.156 4a454e3348c9 7a:36:bb:b9:17:23
jupyter 192.171.179.58 b68fcdfed791 46:06:c5:de:a7:46
kafka-1 192.171.179.56 d2c2dda6494d 46:06:c5:de:a7:46
kafka-1 192.171.179.57 d2c2dda6494d 46:06:c5:de:a7:46
kafka-2 192.172.76.208 52a3bfe2e71b 7a:81:96:fb:3e:a1
kafka-2 192.172.76.209 52a3bfe2e71b 7a:81:96:fb:3e:a1
kafka-3 192.166.25.157 91b39285489d de:58:9b:22:2d:b4
kafka-3 192.166.25.158 91b39285489d de:58:9b:22:2d:b4
nosql-0 192.166.0.2 78580407e182 3a:ca:39:55:88:70
redis 192.171.153.157 4fb09654acea 7a:36:bb:b9:17:23
spark-worker-6 192.172.51.58 ca35af387291 2e:6a:39:aa:a5:11
traverser 192.171.153.158 1e00ca4189f4 7a:36:bb:b9:17:23
trifecta 192.166.0.3 c92069d9fcb5 3a:ca:39:55:88:70
zeppelin 192.164.179.56 564eb27826b2 1a:ef:98:09:64:c5
zoo-1 192.171.179.55 9ceb98df24e2 46:06:c5:de:a7:46
zoo-2 192.172.76.207 2e6b1b51ccac 7a:81:96:fb:3e:a1
zoo-3 192.165.230.104 993873d447c8 e6:1c:c0:92:67:5e
You can see that there are several containers with replicate names that do not exists. It would be useful to remove DNS entries without checking if the container is running. Probably it is possible to add a -f parameter to the command.
If you give to me a pointer to the class to check I will do a pull request.
Hi @fabiofumarola, it would be even better if we can get to the bottom of why you have stale entries and fix that.
Could you please run:
docker kill -s QUIT weave
docker logs weave
and post the result at https://github.com/weaveworks/weave/issues/2023
We think the problem that gave rise to this was fixed in 1.4.6 (via #2023).
I suggest this issue should make weave dns-remove idempotent in the same way as we made weave rmpeer idempotent, i.e. it should report on what it did. I suggest listing the name, IP and container, in the same way as weave status dns does (no need to list the peer since that is always our peer, since weave dns-remove only ever acts on our records).
I don't think we should be attempting to work around DNS breakages here.
Man, its suuuuuper disappointing that weave dns-remove does not do what it says it does. Please rename to weave remove-running-container-record so that people dont expect this tool to do what the function currently labeled as being able to do.
I cant believe Im about to have to flush routing for hundreds of services when all that needs to happen is the removal of a single DNS record.
This is the kind of pain that's made weave an unacceptable service discovery mechanism for our business needs.
Would you like to purchase business support?
Actually, yeah I kinda do. I'll reach out.