Upstream Resolution Broken in Non-Routed Network - VirtualBox on OSX & Ubuntu

Open joelio opened this issue 10 years ago • 11 comments

Hello, we are having issues with upstream resolution in a private network.

I have explicitly set the upstream DNS to our internal, contactable server, yet it does not seem to take effect. I can't resolve any external DNS names.

If I put the system on a routable internet connection then I get the correct resolution for both internal and external addresses.

I have a feeling that the upstream is not being set correctly and is reverting to this part of the code:

:upstream_servers => [[:udp, '8.8.8.8', 53], [:tcp, '8.8.8.8', 53]],

We can't contact those Google DNS servers internally, of course...

Vagrant 1.6.3 landrush (0.14.1)

For completeness' sake, I've tested against our internal 14.04 images and the vanilla Ubuntu-provided ones.

vagrant@alpha:~$ cat /etc/resolv.conf 
# Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
#     DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN
nameserver 10.0.2.3
vagrant@alpha:~$ host google.com
;; connection timed out; no servers could be reached
vagrant@alpha:~$ host alpha.vagrant.dev 
alpha.vagrant.dev has address 10.0.199.10
;; connection timed out; no servers could be reached
;; connection timed out; no servers could be reached

joelio avatar Aug 12 '14 11:08 joelio

I can confirm that monkeypatching ~/.vagrant.d/gems/gems/landrush-0.14.1/lib/landrush/config.rb with our internal DNS servers makes it work fine.

joelio avatar Aug 12 '14 12:08 joelio

I have tried various permutations of

config.landrush.upstream '{internal dns ip}', 53, :udp

config.landrush.upstream '{internal dns ip}', 53, :tcp

config.landrush.upstream '{internal dns ip}'

etc..

I'm wondering if the UDP and/or TCP defaults are taking precedence over my explicitly set upstreams. (Btw, leveraging the host's own configured DNS resolvers would be a superior solution, as this would generally be consistent across different environments.)

joelio avatar Aug 12 '14 12:08 joelio

I can't see where the upstream config is actually instantiated; it looks like the DEFAULT block is what's called from the finalize! method. Is this config actually still relevant? Seems a bit broken to me...
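
For illustration, the behaviour expected here can be sketched as a toy config class: defaults should only apply when no upstream was set explicitly. This is a hypothetical stand-in (ToyDnsConfig and its methods are made-up names), not landrush's actual config.rb. Only the default upstream list is taken from the snippet quoted earlier, and 10.1.1.1 is a placeholder for an internal DNS server.

```ruby
# Hypothetical sketch of "explicit upstreams override defaults".
# This is NOT landrush's real implementation.
class ToyDnsConfig
  # Default list as quoted earlier in this thread.
  DEFAULT_UPSTREAMS = [[:udp, '8.8.8.8', 53], [:tcp, '8.8.8.8', 53]].freeze

  def initialize
    @upstreams = nil # nil means "the user never called upstream"
  end

  # Accepts the call forms tried above:
  #   upstream ip / upstream ip, port / upstream ip, port, protocol
  def upstream(ip, port = 53, protocol = nil)
    @upstreams ||= []
    if protocol
      @upstreams << [protocol, ip, port]
    else
      # No protocol given: register the server for both UDP and TCP.
      @upstreams << [:udp, ip, port] << [:tcp, ip, port]
    end
  end

  # Fall back to the defaults only if nothing was configured.
  def finalize!
    @upstreams = DEFAULT_UPSTREAMS.dup if @upstreams.nil?
    self
  end

  def upstream_servers
    @upstreams
  end
end

config = ToyDnsConfig.new
config.upstream '10.1.1.1' # placeholder internal DNS IP
config.finalize!
p config.upstream_servers
```

If the real plugin behaved like this sketch, the Google servers would never be consulted once an upstream is set, which is not what the symptoms above suggest.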

joelio avatar Aug 12 '14 12:08 joelio

Thanks for the detailed reporting! And sorry for the trouble with the config... it definitely seems broken from your investigation.

I'll dig into this and see if I can reproduce.

phinze avatar Aug 13 '14 14:08 phinze

No problem, give me a poke if you need any testing doing my side. There could be slight subtleties in our DNS topology, but I'm pretty certain it's just falling back to Google's DNS servers.

joelio avatar Aug 13 '14 15:08 joelio

So I'm able to set OpenDNS servers using the config option like so:

Vagrant.configure("2") do |config|
  config.vm.box = "hashicorp/precise64"
  config.vm.hostname = "issue78.vagrant.dev"

  config.landrush.enabled = true
  config.landrush.upstream '208.67.222.222'
end

(The various multi-arg forms of calling upstream seem to work fine as well.)

> ssh issue78.vagrant.dev
[issue78] > curl internetbadguys.com
# yields the OpenDNS phishing protection page, proving OpenDNS is working

One important (and non-obvious) thing to note is that landrush only runs one daemon for all VMs, and that daemon is not very smart about the upstream config changes. So I'd recommend running a vagrant landrush restart to get the Landrush DNS server running on the host to pick up the new upstream config.

phinze avatar Aug 13 '14 15:08 phinze

Ok, I will. I'm using this in a multi-VM environment. I've got the upstream set at the top (global?) level of the config do block, so afaik it should be set for all VMs if it's a one-daemon-to-rule-them-all thing.
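
A minimal sketch of that layout, for reference. This is not the actual Vagrantfile in use: the box, the upstream IP (10.0.0.53), and the second machine's address are placeholders chosen for illustration; only the alpha hostname and its 10.0.199.10 address come from the output earlier in the thread.

```ruby
# Sketch only: multi-VM Vagrantfile with landrush configured once at the top level.
Vagrant.configure("2") do |config|
  config.vm.box = "hashicorp/precise64" # placeholder box

  # Top-level landrush settings, shared by every machine defined below.
  config.landrush.enabled = true
  config.landrush.upstream '10.0.0.53' # placeholder for the internal DNS server

  config.vm.define "alpha" do |alpha|
    alpha.vm.hostname = "alpha.vagrant.dev"
    alpha.vm.network "private_network", ip: "10.0.199.10"
  end

  config.vm.define "proxy" do |proxy|
    proxy.vm.hostname = "proxy.vagrant.dev"
    proxy.vm.network "private_network", ip: "10.0.199.11" # assumed address
  end
end
```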

I've tried judicious restarts too, to no avail unfortunately. Thanks so much for confirming that the upstream should work though, it gives me something to go at...

joelio avatar Aug 13 '14 15:08 joelio

Yes, that config works, so it has to be a permutation of the multi-VM and private network additions I've got in my manifest. Thanks again, I'll report back when I find what the incompatibility is.

joelio avatar Aug 13 '14 16:08 joelio

Seems to be working great after a good bunch of restarts. I owe you a beverage!

joelio avatar Aug 13 '14 16:08 joelio

Hmm, this seems to be rearing its head again on the Linux clients. I've purged landrush, removed config, restarted etc... etc.. etc...

No changes to the working Vagrantfile (fine on OSX)

It's still not accepting the upstream. If I patch the defaults in config.rb, it works.

build@proxy:~$ cat /etc/resolv.conf 
# Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
#     DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN
nameserver 10.0.2.3
build@proxy:~$ host google.com
Host google.com not found: 2(SERVFAIL)

I'm also a bit perplexed as to why the restarts were needed before, as the process is killed anyway when the boxes are halted.

Is there any further diagnosis I can do? This is a blocker to us rolling this out to our devs.
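
One check that might help narrow this down is to query the internal DNS server directly from the host, bypassing landrush, to confirm the upstream itself answers. Below is a minimal sketch using Ruby's standard resolv library. The resolves_via? helper and the script name in the usage note are made up for illustration, and the server address is whatever internal DNS IP applies.

```ruby
require 'resolv'

# Returns true if `name` resolves when asking `server` directly, false otherwise.
def resolves_via?(server, name, port: 53, timeout: 2)
  Resolv::DNS.open(nameserver_port: [[server, port]]) do |dns|
    dns.timeouts = timeout # seconds; a single attempt
    dns.getaddress(name)   # raises Resolv::ResolvError if nothing answers or the name is unknown
    true
  end
rescue Resolv::ResolvError, Timeout::Error, SystemCallError
  false
end

# Command-line use: pass the DNS server IP and the name to look up.
if __FILE__ == $PROGRAM_NAME && ARGV.size == 2
  server, name = ARGV
  status = resolves_via?(server, name) ? 'resolved' : 'FAILED'
  puts "#{name} via #{server}: #{status}"
end
```

Saved as, say, check_upstream.rb, it could be run as: ruby check_upstream.rb {internal dns ip} google.com. If the lookup succeeds on the host but still fails inside the VM, that would point at the landrush daemon's upstream setting rather than at the DNS server itself.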

joelio avatar Aug 27 '14 16:08 joelio

Here's my Vagrantfile for completeness... https://gist.github.com/joelio/85a3dbb5eb9f337de296

joelio avatar Aug 27 '14 17:08 joelio