AdGuardHome Unstable work as DoH behind Nginx reverse server with Keenetic

Open savely-krasovsky opened this issue 3 years ago • 19 comments

Prerequisites

[x] I am running the latest version
[x] I checked the documentation and found no answer
[x] I checked to make sure that this issue has not already been filed

Issue Details

I am running AdGuard Home on a personal VPS behind an Nginx reverse proxy and use it from my Keenetic router. I find it unstable when using DoH: I get query timeouts, Keenetic's DNS DoH proxy shuts down randomly, some hosts cannot be resolved at all (even with protection turned off, of course), etc. I understand that this could be related to Keenetic itself, but Keenetic works great with the Cloudflare and Google public DoH servers with zero problems.

Version of AdGuard Home server:
- v0.106.3
How did you install AdGuard Home:
- GitHub releases
How did you setup DNS configuration:
- Router (DoH)
If it's a router or IoT, please write device model:
- VPS with 2 cores and 4GB of RAM
CPU architecture:
- x86
Operating system and version:
- Debian 10

Expected Behavior

Works as well as any other public DNS server.

Actual Behavior

Unstable behaviour: timeouts and other problems that lead to the DoH proxy crashing on the Keenetic side.

Screenshots

A Linux machine inside the Keenetic LAN that uses its DNS. It always gets stuck like this. With Cloudflare/Google DoH, on the other hand, it works like a charm:

Keenetic's https-dns-module restarts every time with some error:

Additional Information

Nginx configuration:

server {
        listen 80;
        listen [::]:80;
        server_name adguard.example.com;

        return 301 https://adguard.example.com$request_uri;
}

server {
        listen 443 ssl;
        listen [::]:443 ssl;
        server_name adguard.example.com;

        access_log      /var/log/nginx/adguard.access.log;
        error_log       /var/log/nginx/adguard.error.log;

        ssl_trusted_certificate /etc/letsencrypt/live/example.com/chain.pem;
        ssl_certificate         /etc/letsencrypt/live/example.com/fullchain.pem;
        ssl_certificate_key     /etc/letsencrypt/live/example.com/privkey.pem;

        gzip off;

        location / {
                add_header X-Robots-Tag 'noindex';

                proxy_set_header Host $host;
                proxy_set_header X-Real-IP $remote_addr;
                proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
                proxy_set_header X-Forwarded-Proto $scheme;

                proxy_pass http://127.0.0.1:3000;
        }

        location /robots.txt {
                return 200 "User-agent: *\nDisallow: /\n";
        }
}

I can also provide access to the server itself so you can test it yourselves.

Jun 09 '21 14:06 savely-krasovsky

@L11R & @ainar-g, I too have been in contact with Keenetic support about this issue: https://yadi.sk/i/R_YimY5nPRWakg

But, as I understand it, this applies to any server in any configuration and is not an error. Anyway, I'm not sure about that 🤔

Jun 10 '21 07:06 ammnt

Hello and thank you for your report. Could you please add the following information:

What upstreams do you use? Does the issue persist if you use other upstreams?
Can you please [configure] AGH to collect verbose logs and send them to us at [email protected] with the subject line “AdGuard Home issue 3250”?

Thanks!

Jun 10 '21 14:06 ainar-g

@ainar-g I use these:

  upstream_dns:
  - 1.1.1.1
  - 1.0.0.1
  - 8.8.8.8
  - 8.8.4.4

I've enabled verbose logs, but how much of them do you need? The problem appears only after a few days of running AGH (some sort of leak? buffer overload? idk)

Jun 10 '21 15:06 savely-krasovsky

Thanks for the info! It would be best to get the logs from the day when the problems start. If you can pinpoint the exact hour, it would be nice to have logs for one hour before and after that. Thanks!

Jun 10 '21 15:06 ainar-g

@ainar-g I've captured some logs. How can I send them to you privately?

Jun 18 '21 07:06 savely-krasovsky

@L11R, we have an e-mail for those: [email protected].

Jun 18 '21 08:06 EugeneOne1

@L11R, we've recently committed some fixes for the DoH implementation. Could you also try the latest betas, like v0.107.0-b.4? They seem to fix issues for a lot of people who use DoH.

Jul 08 '21 11:07 ainar-g

@ainar-g I was already going to write here about those improvements! I installed b3 and b4 almost a week ago and so far I haven't seen any issues; I will continue to observe.

Jul 08 '21 11:07 savely-krasovsky

Oh, I got this loop on Keenetic again.

I sent you the latest logs.

After restarting AdGuard Home (systemctl restart AdGuardHome.service), Keenetic started resolving domains again without needing a restart itself.

Jul 12 '21 08:07 savely-krasovsky

@L11R, thanks, we've received the logs, although I cannot currently tell you when we'll be able to properly scan through it. In your personal estimate, has the issue at least become less frequent after our latest fix, or is it still as frequent as it was before?

Jul 15 '21 12:07 ainar-g

It is less frequent, for sure. Since the latest incident it has kept working.

Jul 15 '21 12:07 savely-krasovsky

@ainar-g I found out that the domain login.live.com cannot be resolved in my setup (again: Keenetic embedded DNS server -> external AdGuard Home server -> upstream DNS servers).

A home PC behind Keenetic's DNS just gets timeouts, while on the ADH side everything seems OK (at least the logs report a successful query):

PS C:\Users\Savely> nslookup login.live.com
Server:  UnKnown
Address:  192.168.1.1

DNS request timed out.
    timeout was 2 seconds.
DNS request timed out.
    timeout was 2 seconds.
DNS request timed out.
    timeout was 2 seconds.
DNS request timed out.
    timeout was 2 seconds.

To me it seems that Keenetic (and maybe other DNS servers) cannot handle such big answers. I tried to compare the DNS answers from AdGuard Home with those returned directly by something like Cloudflare DoH. Result: the response from ADH is much larger.

After decoding I found that ADH returns 146 answer entries. For the same request it can sometimes return fewer records, or even more. For comparison, cloudflare-dns.com returns only 12 records every time.

You can test it yourself with this DNS message: q80BAAABAAAAAAAABWxvZ2luBGxpdmUDY29tAAABAAE
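The DNS message above is a base64url-encoded wire-format query, the form a DoH GET request carries in its dns parameter (RFC 8484). Here is a minimal Python sketch for decoding it, using only the standard library. The function name decode_dns_query is illustrative and is not part of AdGuard Home or Keenetic:

```python
import base64
import struct


def decode_dns_query(b64url: str) -> dict:
    """Decode a base64url DNS query (hypothetical helper, stdlib only)."""
    # DoH drops the '=' padding, so restore it before decoding.
    raw = base64.urlsafe_b64decode(b64url + "=" * (-len(b64url) % 4))

    # Fixed 12-byte header: ID, flags, then the four section counts.
    msg_id, flags, qdcount, ancount, _nscount, _arcount = struct.unpack("!6H", raw[:12])

    # The question name is a run of length-prefixed labels ending in a zero byte.
    labels, pos = [], 12
    while raw[pos] != 0:
        length = raw[pos]
        labels.append(raw[pos + 1 : pos + 1 + length].decode("ascii"))
        pos += 1 + length

    # QTYPE and QCLASS follow the terminating zero byte.
    qtype, qclass = struct.unpack("!2H", raw[pos + 1 : pos + 5])

    return {
        "id": msg_id,
        "recursion_desired": bool(flags & 0x0100),
        "questions": qdcount,
        "answers": ancount,
        "name": ".".join(labels),
        "qtype": qtype,
        "qclass": qclass,
        "size": len(raw),
    }


query = decode_dns_query("q80BAAABAAAAAAAABWxvZ2luBGxpdmUDY29tAAABAAE")
print(query)
```

For the message above this gives a single question for login.live.com with QTYPE 1 (A) and QCLASS 1 (IN), so both servers are being asked exactly the same thing.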

I'm completely new to DNS, so I may have said something stupid, but I'm tired of this setup not working properly as a daily driver :(

Aug 11 '21 22:08 savely-krasovsky

Same problem with www.outlook.com (q80BAAABAAAAAAAAA3d3dwdvdXRsb29rA2NvbQAAAQAB). Also a huge difference in response size.
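To repeat the size comparison for other domains, the test messages can be generated rather than copied. Below is a rough Python sketch (standard library only) that builds such a query, encodes it for a DoH GET request, and summarizes a raw response by size and answer count. All function names are hypothetical, and the endpoint URL passed to fetch_doh is left to the caller; nothing here is taken from AdGuard Home's or Keenetic's code:

```python
import base64
import struct
import urllib.request


def build_query(name: str, qtype: int = 1, msg_id: int = 0xABCD) -> bytes:
    """Build a wire-format DNS query with one question and the RD flag set."""
    header = struct.pack("!6H", msg_id, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!2H", qtype, 1)  # class 1 = IN


def to_doh_param(message: bytes) -> str:
    """base64url-encode a DNS message without padding, as DoH GET expects."""
    return base64.urlsafe_b64encode(message).rstrip(b"=").decode("ascii")


def summarize_response(raw: bytes) -> dict:
    """Report the size of a raw DNS response and a few header fields."""
    _id, flags, _qd, ancount, _ns, _ar = struct.unpack("!6H", raw[:12])
    return {
        "size": len(raw),
        "answers": ancount,
        "truncated": bool(flags & 0x0200),  # TC bit
        "rcode": flags & 0x000F,
    }


def fetch_doh(endpoint: str, name: str, timeout: float = 5.0) -> dict:
    """Send a DoH GET request to `endpoint` (assumed to speak RFC 8484)."""
    url = f"{endpoint}?dns={to_doh_param(build_query(name))}"
    req = urllib.request.Request(url, headers={"Accept": "application/dns-message"})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return summarize_response(resp.read())


print(to_doh_param(build_query("www.outlook.com")))
```

With the default message ID 0xABCD, the printed value matches the www.outlook.com message quoted above. Calling fetch_doh once against the public resolver and once against your own server's DoH endpoint (for AdGuard Home this is typically the /dns-query path, but check your setup) shows the difference in size and answer count side by side.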

Aug 11 '21 22:08 savely-krasovsky

@L11R I cannot reproduce it with 1.1.1.1 or 8.8.8.8, but there's definitely something going on with the way CNAMEs are resolved somewhere in the chain. Looking at the logs you've sent, I can see that the large responses often come from the 127.0.0.2:53 upstream. Do the large responses that you're receiving now also come from that upstream? Can you try using one of the well-known upstreams and also set the cache size to zero to exclude any bad cached results?

Aug 12 '21 13:08 ainar-g

Hm, I will try.

Aug 12 '21 13:08 savely-krasovsky

@ainar-g I changed it to 1.1.1.1 and 1.0.0.1 when you pointed it out 4 days ago. I don't remember when I set it to the local resolver... The problem with timeouts is gone. But I am still getting random NXDOMAIN responses from Keenetic, while the ADH logs seem to be fine. The Keenetic logs are also clean now. The behavior has definitely changed after some beta update.

I notice that ADH's average processing time tends to increase over time. Currently it's 43. Yesterday it was 29. Today it's already hard to use the PC as usual. I have to refresh pages at least every 10 minutes to get them to work (or to load properly with all assets).

Aug 16 '21 22:08 savely-krasovsky

Have you reset the cache size to a non-zero value after setting the proper upstreams? Because if not, AGH is literally pinging the upstreams every time you make a request. If yes, try increasing it so that the cache is used more effectively. You could also try enabling the recently added optimistic caching mode.

Aug 17 '21 10:08 ainar-g

@ainar-g no, I kept it at zero. But anyway, it's strange, isn't it? Today I had literally 10 minutes of DNS not working at all. ~~Totally clean logs on the Keenetic side~~ I am again randomly getting: Service: "DoT "System" UDP-to-TCP proxy #0": unexpectedly stopped. I will send the ADH logs by email.

Aug 17 '21 11:08 savely-krasovsky

@L11R Hi! Sorry for such a long silence. Is this issue still relevant?

Sep 02 '22 11:09 Birbber
