mdns icon indicating copy to clipboard operation
mdns copied to clipboard

Sending the question back with the answer

Open atomirex opened this issue 1 year ago • 5 comments

Description

This includes the question that provoked the answer in the response containing the answer.

This enables Android clients to resolve hostname.local domains published with this package.

Reference issue

This is necessary for certain mdns clients (the Android dnsresolver, primarily) to pick up the answer to their question. This was established by inspecting what Bonjour was doing on a Mac which allowed it to be recognised. (The Mac will also do things like bundle A and AAAA responses for ipv4 and ipv6, but that didn't prove to be necessary to resolve my problem).

This is more for reference of the bug and change made than a serious request this be merged.

atomirex avatar Nov 24 '24 16:11 atomirex

Codecov Report

:x: Patch coverage is 79.86111% with 29 lines in your changes missing coverage. Please review. :white_check_mark: Project coverage is 66.23%. Comparing base (7a7184c) to head (339008c). :warning: Report is 1 commits behind head on master.

Files with missing lines Patch % Lines
conn.go 79.86% 21 Missing and 8 partials :warning:
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #210      +/-   ##
==========================================
+ Coverage   65.95%   66.23%   +0.27%     
==========================================
  Files           1        1              
  Lines         702      690      -12     
==========================================
- Hits          463      457       -6     
+ Misses        162      161       -1     
+ Partials       77       72       -5     
Flag Coverage Δ
go 66.23% <79.86%> (+0.27%) :arrow_up:

Flags with carried forward coverage won't be shown. Click here to find out more.

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

:rocket: New features to boost your workflow:
  • :snowflake: Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

codecov[bot] avatar Nov 26 '24 05:11 codecov[bot]

@atomirex I am fine with merging this! If you can fix the imports and update the tests the more things we are compatible with the better!

Thanks for looking into this.

Sean-Der avatar Nov 26 '24 05:11 Sean-Der

I will now investigate those test failures. It looks plausibly like the problem motivating further changes on the other fork - because the answer now includes a question some safeguards may be needed in cases of loopback.

It's curious the same test locally is fine.

atomirex avatar Nov 26 '24 14:11 atomirex

Not being a DNS expert I decided to look into if this is the right thing to do.

It turns out the DNS RFC that applies (1035 and following) did not explicitly call out this behaviour (i.e. there isn't a MUST or even SHOULD on the subject), but it was implied, assumed and used in the various examples.

The result is apparently most DNS servers do follow this convention, but clients need to handle the case where they don't.

So I am going to need to adjust the code here on the client side, including adding tests for both cases (where the response has the Question and where it does not). This should also help smooth over situations where people update code using pion/mdns non atomically as it were.

atomirex avatar Dec 20 '24 15:12 atomirex

Reopening this for review with some pre-emptive remarks.

Firstly, I expected this to be fairly simple, but it proved not to be, and the result is I think this is an awkward halfway house.

The underlying problem is there are/were some incorrect assumptions in the existing code:

  1. Often Answers are accompanied by echoes of the Questions that provoked them (resolved here)
  2. Some clients expect the echoes of the Question or will not recognise the Answer (resolved here)
  3. Answering for A and AAAA within a single message is expected in response to a message with both incoming Questions (not resolved)

The way dnsmessage.Parser was used was problematic because it made assumptions about being able to identify how to handle a message while parsing it in order, which is not the case, so it has been replaced by unpacking everything. (My operating assumption here is if that is in any way a hot path then you have other problems).

Additionally the testing is becoming difficult.

That longer timeout bothers me, and I couldn't establish any reasonable cause for why it is necessary. I verified that it is needed on master too, and it is, so this problem is not introduced by these changes.

It would be nice to be able to separate the message response generation logic from any actual network calls for testing, but this would be a more substantial refactoring than I was planning on taking on as a first PR here. This would help eliminate the ugly new Config field, but also enable separating logical execution from problems which may be caused by the local network environment.

I have a fork at https://github.com/atomirex/mdns which solves item 3 from the above list, and supports per interface IPs for hosts (I have a use case for this), but in doing so changes the interface. (Curiously that now also has the timeout test problem). It might be worth bringing over more of that code, but it is far less optimised than this has been to date, for example I am not sure this would fit in over here: https://github.com/atomirex/mdns/blob/c03a99ec05d1a2a7e7e5feabc13e21c0eec8599c/conn.go#L1025

I am curious for any input as to how to proceed. Thanks!

atomirex avatar Dec 23 '24 18:12 atomirex

I'm working on a project using pion/mdns and am hitting this same issue with Android clients. I've tested this PR and it does fix the problem! If there's more work to be done on this PR I'm happy to pitch in as I really need Android to be supported by my application.

jgkawell avatar Oct 11 '25 23:10 jgkawell

@atomirex @jgkawell yes we can get this merged once it's updated, sorry that it got ghosted by mistake!

JoTurk avatar Oct 12 '25 02:10 JoTurk

@JoeTurki Is the only thing that needs updating a resolution to the merge conflicts?

@atomirex Are you able to get this PR in a mergeable state? If not, just let me know and I'll open one of my own with these changes just based off of the current master.

jgkawell avatar Nov 11 '25 05:11 jgkawell

@jgkawell Want me to add you to the repo? You should be able to push directly!

Maintainers are allowed to edit this pull request. is on

Sean-Der avatar Nov 11 '25 22:11 Sean-Der

@JoeTurki Is the only thing that needs updating a resolution to the merge conflicts?

I think yes, as sean said we can add you to the org so that you can fix it in this PR, also if you want to cherry pick the commit and make a new PR with the fixes with @atomirex commit this also works.

JoTurk avatar Nov 11 '25 22:11 JoTurk

@Sean-Der yeah go ahead and add me to the repo. That way we keep the history all in one place.

jgkawell avatar Nov 12 '25 03:11 jgkawell

@jgkawell i sent you invite.

JoTurk avatar Nov 12 '25 03:11 JoTurk

@JoeTurki Pushed up the conflict resolution. Tests are passing but I still want to run through a manual test with an Android device to make sure things work end-to-end.

The diff looks really bad but it's mostly just whitespace. I suggest turning off whitespace in the diff when reviewing.

jgkawell avatar Nov 12 '25 05:11 jgkawell

other than the comments, I would like to add few tests around that with new clients. but I can do that myself, thank you again. and welcome to the org :)

JoTurk avatar Nov 12 '25 11:11 JoTurk

great, I'll take a look at those. Might not be until this weekend though

jgkawell avatar Nov 13 '25 15:11 jgkawell

@JoeTurki Your comments made me take a closer look at the standard and I realized the initial design of this PR seemed to be a little out of alignment with it. Specifically in Section 6, paragraph 4:

Multicast DNS responses MUST NOT contain any questions in the Question Section. Any questions in the Question Section of a received Multicast DNS response MUST be silently ignored. Multicast DNS queriers receiving Multicast DNS responses do not care what question elicited the response; they care only that the information in the response is true and accurate.

The way this PR was initially written, there was a config value (DoNotEchoQueryWithAnswer) which defaulted to true and made all questions be returned in the Question Section of the response. This seems to directly violate the above part off the spec.

In my latest commits, I removed the config value and instead only return the question within the response if the request is unicast. This adheres to Section 6.7, paragraph 1:

In this case, the Multicast DNS responder MUST send a UDP response directly back to the querier, via unicast, to the query packet's source IP address and port. This unicast response MUST be a conventional unicast response as would be generated by a conventional Unicast DNS server; for example, it MUST repeat the query ID and the question given in the query message.

I removed the tests that were originally added in this PR that tested the new config value since I've now removed it but I added a TODO that we might want to add a test that forces a unicast query to check everything is working for those types of requests. I've run some manual tests with Windows, Linux, and Android clients and everything works with this method. I haven't tried any Apple devices due to lack of access to them. Android clients do in fact use unicast queries instead of multicast which explains why they were broken before.

jgkawell avatar Nov 16 '25 04:11 jgkawell

@jgkawell This looks great, thank you so much, did you test if it solves the original issue with android?

JoTurk avatar Nov 16 '25 16:11 JoTurk

@JoeTurki Yes! I tested it specifically with Android and the issue is resolved.

jgkawell avatar Nov 16 '25 20:11 jgkawell

@jgkawell @atomirex thank you a lot, merged and tagged as v2.1.0.

JoTurk avatar Nov 16 '25 22:11 JoTurk