tftpy
Lost Packet handling
When tftpy runs as a server and a data packet is lost, all further requests for the lost packet are just logged as duplicate ACKs by tftpy. If I add
```python
self.sendDAT()
```

right after

```python
elif pkt.blocknumber < self.context.next_block:
    log.debug("Received duplicate ACK for block %d"
              % pkt.blocknumber)
    self.context.metrics.add_dup(pkt)
```
the transfer continues properly after packet loss. I don't know whether it is a bug in tftpy or in my TFTP client (the U-Boot bootloader), but the problem does not occur with other TFTP servers.
Regards, Thomas
Hmm. So we received the ACK already and sent the DAT, but perhaps the ACK from the client was lost.
The reason we don't send a DAT in response to a duplicate ACK is the Sorcerer's Apprentice syndrome. So how can I safely send a DAT there without introducing that problem?
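For context, here is a rough sketch (names are mine, not tftpy's actual code) of why blindly answering every duplicate ACK with a DAT is dangerous: once a single ACK is duplicated in the network, every ACK copy provokes a DAT and every DAT provokes an ACK, so the duplication is self-sustaining and traffic doubles for the rest of the transfer.

```python
# Hypothetical illustration of the Sorcerer's Apprentice syndrome.
# If the sender answers *every* ACK copy with a fresh DAT, and the
# receiver answers every DAT with an ACK, one duplicated ACK doubles
# the packet count for all remaining blocks.

def traffic_after_duplication(remaining_blocks: int, ack_copies: int = 2):
    """Count the DAT/ACK packets exchanged for the rest of a transfer
    once one ACK has been duplicated into `ack_copies` copies."""
    # Each ACK copy provokes a DAT, and each DAT provokes an ACK,
    # so the duplication never dies out on its own.
    dats = remaining_blocks * ack_copies
    acks = remaining_blocks * ack_copies
    return dats, acks
```

With 10 blocks left, a single duplicated ACK means roughly 20 DATs and 20 ACKs instead of 10 of each; that is exactly the amplification a safe fix has to avoid.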
I've proposed a pull request for this: "Implement retransmission of packet on peer request". https://github.com/msoulier/tftpy/pull/58
The reported issue is actually due to an incomplete implementation of the protocol per RFC 1350. In summary, duplicate receipt of the peer's previous packet is a request to resend our own previous packet, as per the protocol.
The pull request handles this.
I've tested it with other clients, such as the U-Boot TFTP client, and it fixes the issue.
FWIW, I believe I have seen this same problem when doing U-Boot tftp transfers. Have not tested the PR but could probably give it a whirl to see if it increases robustness when on a congested network.
Indeed, the U-Boot TFTP client implements a timeout policy while waiting for peer packets, and when a timeout occurs it asks for the expected packet to be resent, as described in the protocol.
This unimplemented tftpy server behavior (packet retransmission) will not show up when testing the tftpy server with tftpy's own client (tftpy_client), as tftpy_client is simpler and just waits for the peer's packet forever.
More precisely, as soon as the client timeout is lower than the tftpy server timeout, the client will keep sending requests to resend the previous packet. Because the server sees traffic on the socket (for packets it wrongly drops), it never reaches its own timeout and never resends the packet by itself. In the end the client never receives any more packets from the server and the transfer is completely stuck.
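A minimal model of that stall condition (a hypothetical function, not part of tftpy): the server deadlocks exactly when duplicate traffic refreshes its timer and the client retries faster than the server's timeout.

```python
# Hypothetical model of the stall described above: the server's
# retransmission timeout can never fire when (a) any incoming traffic,
# including duplicate ACKs, refreshes its timer, and (b) the client
# retries faster than the server's timeout expires.

def server_ever_times_out(client_retry_interval: float,
                          server_timeout: float,
                          resets_on_any_traffic: bool) -> bool:
    """Return True if the server's retransmission timeout can ever fire."""
    if resets_on_any_traffic:
        # Each client retry lands before the server timer expires,
        # so the timer is refreshed forever and the lost DAT is
        # never resent.
        return client_retry_interval >= server_timeout
    # If duplicate traffic does not refresh the timer, it always expires.
    return True
```

For example, a client retrying every second against a server with a five-second timeout never lets the server's timer fire, which matches the "completely stuck" behavior reported above.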
See: #67 commit: Improve lost DAT handling
This fixes the problem without re-introducing the Sorcerer's Apprentice syndrome. The solution is to not update the timeout counter when a duplicate ACK packet is received, so that the server's timeout waits for "progress", not just "any traffic".
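A sketch of that idea (hypothetical class and method names, not tftpy's actual code): a duplicate ACK is noted but deliberately does not refresh the activity timer, so a genuinely stalled transfer still reaches the server's own timeout, at which point the server resends its previous DAT. Only the expected ACK ("progress") resets the timer.

```python
import time

class DupAckSketch:
    """Hypothetical sketch of the approach in #67: track progress,
    not traffic, when deciding whether the transfer has stalled."""

    def __init__(self, timeout: float = 5.0):
        self.timeout = timeout
        self.next_block = 1
        self.last_progress = time.monotonic()

    def on_ack(self, blocknumber: int) -> str:
        if blocknumber == self.next_block:
            # Real progress: advance and reset the timer.
            self.next_block += 1
            self.last_progress = time.monotonic()
            return "send next DAT"
        if blocknumber < self.next_block:
            # Duplicate ACK: note it, but deliberately leave
            # last_progress untouched so timed_out() can still fire.
            return "log duplicate ACK"
        return "ignore out-of-range ACK"

    def timed_out(self, now: float) -> bool:
        # When this fires, the server resends its previous DAT; since
        # it resends at most once per timeout period, the Sorcerer's
        # Apprentice amplification cannot occur.
        return now - self.last_progress > self.timeout
```

The key design point is that retransmission is driven by the server's own timer rather than by incoming duplicates, which caps the resend rate no matter how many duplicate ACKs arrive.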
Hi @msoulier, is there any chance to get the fix by @navlrac merged?
Thanks!
Hi,
It seems that I have the U-Boot issue described by @guillon. Does tftpy 0.8.1 fix this problem?
Hi,
Here is an attempt to resolve the PR #67 merge conflicts: PR #121.
@msoulier and @navlrac, please check that I didn't break anything, and if I didn't, please merge ASAP.
Thanks
@msoulier please help us get this issue fixed, it is causing quite a lot of trouble in our CI pipeline. If the fix by @BuhtigithuB is fine, please merge it.
@PiQuer, my attempt to resolve the conflict wasn't successful and I didn't have time to retry. If you have some time, you can try it yourself, starting from @navlrac's PR #67.
With this cherry-pick: https://github.com/BuhtigithuB/tftpy/tree/navlrac-windowsize-rfc-7440-support-only
Tests pass... so the cherry-pick succeeded... Although when I add navlrac's tests for the RFC 7440 feature, three tests start to fail that are not even navlrac's tests... I am a bit puzzled...
I think I didn't fetch upstream before attempting the cherry-pick :( so I am not up to date... I will try to update my repo and rebase...
I fetched from the GitHub UI and it didn't end in a merge conflict :sweat_smile:
Can you take a look?
I will push a clone of this branch including the failing navlrac tests so you can also have a look at them.
Here is the branch with navlrac's tests: https://github.com/BuhtigithuB/tftpy/tree/navlrac-windowsize-rfc-7440-support-only-with-failling-tests
Here you have all the commits except two that seem outdated: https://github.com/BuhtigithuB/tftpy/tree/navlrac-windowsize-all-cherry-picks
@msoulier: Waiting for your input to make a proper PR that can be merged.
UP!
Let me know if I can help you any further...
UP!
@msoulier do you intend to include navlrac's code in tftpy? Is it going to solve the instability issue identified with U-Boot? Can we expect progress here? I am happy to help more if I can, but please give us some feedback...
Frankly I need to set aside some time to reproduce this problem in a test case, and then evaluate the fix. I've been very busy.
I will try to find some time.
I can help with testing and printouts if needed... I will try the resolved-merge-conflict version with our setup, to see if it fixes the instability issue that we currently have with tftpy and U-Boot...
Thanks for the feedback.
I need to take a little time to see if I can automate a test for this. I think we do something similar already, so I think so.
Mike