Local DNS zones and cached responses aren't served after the network is lost

Open EugeneOne1 opened this issue 3 years ago • 26 comments

Prerequisites

  • [X] I have checked the Wiki and Discussions and found no answer

  • [X] I have searched other issues and found no duplicates

  • [X] I want to report a bug and not ask a question

Operating system type

Linux, Other (please mention the version in the description)

CPU architecture

64-bit ARM

Installation

Docker

Setup

On one machine

AdGuard Home version

v0.107.9

Description

This is a continuation of the thread started in #2657. The problem first occurred in v0.104.3 and has already been fixed a couple of times, but it is still being reported. We can't reproduce the issue on our machines. If you've faced it, please consider providing the following information:

  • the setup details (the OS, CPU architecture, installation type);
  • the environment details (other DNS servers, DHCP server);
  • the "General settings", "Cache" and "Encryption" configuration parts (any other details on AdGuard Home's configuration are appreciated);
  • the verbose log with the network loss moment captured.

The last two pieces of information (optionally anonymized) could be sent to [email protected] with this issue's number in the subject.

EugeneOne1 avatar Aug 08 '22 14:08 EugeneOne1

Please take a look at this, @handcoding, @conradseba, @abdalians, @s1lviu, @dinosoup1. I've mentioned you since you reported the issue in #2657. Could you please also help us with the investigation? Thanks.

EugeneOne1 avatar Aug 08 '22 14:08 EugeneOne1

Same issue here, since forever. My setup: version v0.108.0-b.11, installed as a package on pfSense 22.05, FreeBSD 12.3 (arm64). I'm using DoH, my firewall encapsulates all traffic through OpenVPN, no encryption is enabled facing the internal networks, there is no DHCP on AdGuard Home, and no IPv6.

I really hope this is solved soon, since I'm suffering from this many times a day, every day (my Vodafone provider is the worst I've ever had).

Thank you!!

conradseba avatar Aug 08 '22 14:08 conradseba

@EugeneOne1 we just need the debug logs, right?

abdalians avatar Aug 08 '22 21:08 abdalians

@conradseba if your WAN drops that often, could you please capture the logs as requested in the other ticket? That would save me from taking down the network for log capture. :)

abdalians avatar Aug 08 '22 22:08 abdalians

@abdalians, that's right, we call it "verbose".

EugeneOne1 avatar Aug 09 '22 09:08 EugeneOne1

Apologies for the delay on this. I am finally in this broken state again and am trying to collect as much information as I can. I will post shortly.

abdalians avatar Sep 02 '22 22:09 abdalians

adguard_logs_02Sep2022.tar.gz

To reiterate the point, this only happens when my primary internet (cable) fails over to my secondary internet (DSL).

Please see the investigation file attached.

  • AdGuard Home is running and listening on port 53.
  • Resolution: turning off the AdGuard parental control web service / AdGuard browsing security web service makes the queries work again.

This holds until the primary internet connection is restored; after that, the AdGuard parental control web service / AdGuard browsing security web service can be enabled again and AdGuard Home keeps working.

adguard_investigation.txt

abdalians avatar Sep 02 '22 22:09 abdalians

@EugeneOne1 I haven’t personally run into this issue since the fix for #4317 landed on the main trunk. (But that’s just me.)

handcoding avatar Sep 05 '22 19:09 handcoding

Aha! I have the same issue and I posted about it just now:

https://github.com/AdguardTeam/AdGuardHome/discussions/4969

What is the progress for this? My unifi network uses the FQDN of my unifi controller. When my Internet connection drops (it just did two days ago and it was out for 45 freaking hours!), I lose control over my local network because of AGH!

kevindd992002 avatar Sep 29 '22 15:09 kevindd992002

@EugeneOne1 do you need more information on the ticket? It still says it needs investigation and needs to be reproduced reliably. I can reproduce this every single time without fail. Also, the milestone was set to 107.16, which is out now. Does that mean we have a potential fix?

abdalians avatar Oct 09 '22 16:10 abdalians

Version: v0.107.16 still impacted by this.

abdalians avatar Oct 24 '22 00:10 abdalians

Version: v0.107.17 still impacted by this.

ve6rah avatar Dec 28 '22 06:12 ve6rah

Any updates on the matter? I stopped using it for now..

nonoMain avatar Jan 15 '23 00:01 nonoMain

@abdalians, hello again, and apologies for the late response. It actually seems that AdGuard Home still serves local DNS zones, resolving the requests with the appropriate local data; at least, I can see some answered plain PTR requests for local addresses. All the other requests are indeed being dropped due to the Safe Browsing services failure, which even prevents them from being answered from the cache. We have a feature request (#2857) about improving the implementation of the Safe Browsing / Parental Control services, but for now a failure of these services terminates the request processing.

Could you please check a few special cases:

  • Add a $dnsrewrite entry with some improbable domain name to your custom filtering rules, something like:

    ||not-a-real.domain^$dnsrewrite=NOERROR;A;1.2.3.4
    

    Then, after the network is lost, try to request it. It should be resolved properly regardless of the Safe Browsing services' state;

  • Try to request some domain from the /etc/hosts file; it should be resolved as well.

AFAIK, AdGuard Home isn't responsible for any other local data in your setup (DHCP seems to be disabled, and the only local resolver is loopback, so RDNS also has no additional info), so if the above requests are answered, the problem is the reachability of the Safe Browsing services.
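
For anyone who wants to script the two checks above rather than run them by hand, here is a minimal Python sketch. It is not part of AdGuard Home: the helper names (build_query, read_header, probe) are made up for illustration, and it assumes AdGuard Home answers plain DNS over UDP on port 53 at whatever address you pass in. It only reads the response code and answer count from the reply header, which is enough to tell "answered" from "dropped".

```python
import socket
import struct


def build_query(name: str, txid: int = 0x1234, qtype: int = 1) -> bytes:
    """Build a minimal DNS query: one question, class IN, recursion desired."""
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # qtype 1 is an A record; the trailing 1 is class IN.
    return header + qname + struct.pack("!HH", qtype, 1)


def read_header(response: bytes) -> tuple[int, int]:
    """Return (rcode, answer_count) from the 12-byte DNS response header."""
    if len(response) < 12:
        raise ValueError("response is shorter than a DNS header")
    _txid, flags, _qd, ancount, _ns, _ar = struct.unpack("!HHHHHH", response[:12])
    return flags & 0x000F, ancount


def probe(server: str, name: str, timeout: float = 2.0) -> str:
    """Send one A query over UDP and describe what came back, if anything."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_query(name), (server, 53))
            data, _ = sock.recvfrom(512)
        except socket.timeout:
            return f"{name}: no reply within {timeout}s"
        except OSError as exc:
            return f"{name}: socket error: {exc}"
    rcode, answers = read_header(data)
    return f"{name}: rcode={rcode}, answers={answers}"
```

Calling something like print(probe("127.0.0.1", "not-a-real.domain")) once while the WAN is up and again after it drops shows whether the rewrite is still answered. The address here is only a placeholder for your AdGuard Home instance; the same call with a name from /etc/hosts covers the second check.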

EugeneOne1 avatar Jan 25 '23 15:01 EugeneOne1

the problem is Safe Browsing services reachability.

I think I have to refute that, I don't use "safe browsing" on my setup, and yet, after my internet connection went down, I lost the ability to resolve local hosts. I'm talking specifically about hosts in the DNS rewrites section of my config.

I was quite surprised that running my own DNS I would lose the ability to resolve hosts on my own internal network!

ve6rah avatar Jan 26 '23 01:01 ve6rah

@ve6rah, that is weird if the local network is ok. Are you able to reproduce it? If yes, could you please also capture a verbose log for us? This would be really helpful since we still can't reproduce it on our machines.

EugeneOne1 avatar Jan 26 '23 13:01 EugeneOne1

I noticed the same thing, and the issue seems to depend on whether "Use AdGuard browsing security web service" is enabled. I recreated this by blocking the internet for one of my AdGuard Home VMs. With "Use AdGuard browsing security web service" enabled, local lookups are not performed; when I disable it, everything works without a problem.

Attached is the verbose log file when "Use AdGuard browsing security web service" is enabled. adgh-browsing_security_enabled.log

namob avatar Mar 15 '23 11:03 namob

@EugeneOne1: I have my own local domain, domain.com, served by BIND inside the local network. Since AdGuard Home is the primary resolver for all DNS clients in the network, I had a rule to send domain.com to the BIND DNS server:

[/domain.com/]192.168.10.5 (https://github.com/AdguardTeam/AdGuardHome/wiki/Configuration#upstreams-for-domains);

When the internet drops (fails over to the secondary Internet connection), AdGuard Home simply stops responding to any DNS queries. Even the local BIND name resolution ceases to function.

I do have a workaround implemented for this now:

  • BIND: listening on 127.0.0.1;
  • AdGuard Home: listening on the LAN IP (192.168.10.5 in my case);
  • for ALL DNS requests, I point AdGuard Home to 127.0.0.1 as the upstream;
  • BIND, in turn, uses my chosen upstream DNS providers.

** The caveat in my setup is that I have dual WAN, so while my internet is not actually down, just failed over to my secondary connection, AdGuard Home refuses to resolve anything, including the local domains.

abdalians avatar Apr 17 '23 22:04 abdalians

Still an issue... New Adguard Home user and as soon as WAN goes down, none of the DNS rewrites work anymore.

Nslookup shows the rewrite is working, as long as WAN is up.

sammyke007 avatar May 11 '23 20:05 sammyke007

Still happening to me as well

fuomag9 avatar Dec 09 '23 14:12 fuomag9

In my case, they (the browsing security and parental control options) were all disabled.

fuomag9 avatar Dec 09 '23 14:12 fuomag9

@abdalians, @sammyke007, @fuomag9, @james-1987, could you please capture the verbose log for us? Unfortunately, we still can't reproduce it. It would also be helpful to see the exact moment the network went down, if that can be triggered manually. Note that the safe browsing and parental control features should be disabled, as they actually break the resolution under these circumstances.

The logs could be sent to [email protected].

EugeneOne1 avatar Dec 12 '23 13:12 EugeneOne1

For me it was fixed by using Unbound as upstream DNS for my internal network:

Upstream DNS settings:

    https://dns10.quad9.net/dns-query
    [/in-addr.arpa/]192.168.1.1:5553
    [/ip6.arpa/]192.168.1.1:5553
    [/localdom/]192.168.1.1:5553

and Private reverse DNS servers:

    192.168.1.1:5553
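
To make it clearer why this helps, here is a rough Python model of how domain-specific upstream lines like the ones above route a query. This is a simplified sketch, not AdGuard Home's actual code: parse_upstreams and pick_upstream are hypothetical names, only the plain [/domain/]upstream form is handled, and "the longest matching suffix wins" is my assumption about the intended behaviour. The point is that names under localdom and the reverse zones go to the local resolver on 192.168.1.1:5553, which stays reachable when the WAN is down.

```python
def parse_upstreams(lines):
    """Split config lines into (default upstreams, {domain: upstream})."""
    defaults, per_domain = [], {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        if line.startswith("[/"):
            # "[/a/b/]upstream" -> domains "a/b", upstream "upstream"
            domains, _, upstream = line[2:].partition("/]")
            for domain in domains.split("/"):
                per_domain[domain.lower()] = upstream
        else:
            defaults.append(line)
    return defaults, per_domain


def pick_upstream(name, defaults, per_domain):
    """Return the upstreams for a name; the longest matching suffix wins."""
    labels = name.rstrip(".").lower().split(".")
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in per_domain:
            return [per_domain[suffix]]
    return defaults


CONFIG = [
    "https://dns10.quad9.net/dns-query",
    "[/in-addr.arpa/]192.168.1.1:5553",
    "[/ip6.arpa/]192.168.1.1:5553",
    "[/localdom/]192.168.1.1:5553",
]
```

With this model, pick_upstream("nas.localdom", *parse_upstreams(CONFIG)) selects the local resolver, while a public name such as example.org falls through to the Quad9 DoH upstream (nas.localdom is just an example host name).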

sammyke007 avatar Dec 12 '23 17:12 sammyke007

My home internet is currently down. Wasn't able to access my network via local DNS. If I disabled AGH protection, local DNS works. My solution was to add @@||mydomain.tld^ to the custom filtering rules. Immediately started resolving again.
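
In case the rule syntax is unfamiliar: @@ marks an exception (allowlist) rule, and ||mydomain.tld^ covers the domain and its subdomains. A small Python sketch of just that matching follows; parse_rule and is_allowlisted are hypothetical helpers, only rules shaped like [@@]||domain^ are handled, and this says nothing about how AdGuard Home itself processes the rule. Why the exception restores resolution while the WAN is down is not confirmed in this thread; one plausible guess is that allowlisted names skip the checks that need the internet.

```python
def parse_rule(rule):
    """Return (is_exception, domain) for rules shaped like [@@]||domain^."""
    is_exception = rule.startswith("@@")
    body = rule[2:] if is_exception else rule
    if not (body.startswith("||") and body.endswith("^")):
        raise ValueError(f"unsupported rule: {rule}")
    return is_exception, body[2:-1].lower()


def is_allowlisted(host, rules):
    """True if any exception rule covers the host or one of its parent domains."""
    host = host.rstrip(".").lower()
    for rule in rules:
        is_exception, domain = parse_rule(rule)
        if is_exception and (host == domain or host.endswith("." + domain)):
            return True
    return False
```

For example, is_allowlisted("nas.mydomain.tld", ["@@||mydomain.tld^"]) is true, while a look-alike such as notmydomain.tld is not covered.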

themanbornwithin avatar Dec 21 '23 14:12 themanbornwithin