r8152

rtd1296 stability issue

Open bb-qq opened this issue 1 year ago • 56 comments

This issue summarizes the topic of the driver not working on the rtd1296 platform.

There are many reports of unstable operation in products using rtd1296. The typical symptoms reported are as follows:

  • Driver installation succeeds without problems.
  • NAS works stably when receiving traffic
  • Connection is dropped when the NAS is sending traffic
  • May operate stably when linked at 1 Gbps
    • ethtool -s ethX speed 1000 duplex full 
  • The exact same symptoms occur with both r8152 and aqc111 drivers
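The 1 Gbps workaround in the list above can be applied and then verified with a small helper. This is only a sketch: the function names are made up for illustration, the interface name is whatever your adapter shows up as, and it assumes `ethtool <iface>` prints a `Speed:` line as it does on common Linux builds.

```shell
#!/bin/sh
# Sketch only: function names are illustrative, not part of the driver package.

# Extract the negotiated speed (e.g. "1000Mb/s") from `ethtool <iface>` output on stdin.
link_speed() {
    awk -F': ' '/^[[:space:]]*Speed:/ { print $2; exit }'
}

# Force the link to 1 Gbps full duplex, then print the speed the NIC reports.
force_gigabit() {
    iface="$1"
    ethtool -s "$iface" speed 1000 duplex full || return 1
    sleep 5   # give the link a moment to renegotiate
    ethtool "$iface" | link_speed
}
```

Run as root, for example `force_gigabit eth2`. If the printed speed is still `2500Mb/s`, the setting did not take effect.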

There are also no reports of stable operation.

When the connection drops, something appears to go wrong at the USB level. This suggests that the rtd1296 SoC may have a software or hardware issue with its xHCI host controller.

I am looking for a workaround for this problem, but so far I have not found one. (I am also considering providing a standard usb-cdc driver separately.) I will report here if the investigation makes any progress.

Affected Products

  • DS420j, DS220j
  • RS819
  • DS418(no plus), DS218(no plus), DS218play, DS118

bb-qq avatar Dec 04 '22 02:12 bb-qq

In my case the problem occurs only when transferring large files from the NAS to a PC. Conversely, I can navigate the NAS directory structure for hours without the problem occurring.

One side note: after installation I immediately changed the MTU size to 9000 in the NAS LAN configuration. After I found the problem, I tried to reset MTU to 1500 (default), also unchecking the manual setting, but after saving this setting it still remains enabled with the value of 9000, at least as shown in the GUI. No way to reset. But it may be only a GUI issue.

Anyway, because of the evidence that only large transfers cause the problem, it might have something to do with the MTU?

andrus2049 avatar Dec 04 '22 08:12 andrus2049

Anyway, because of the evidence that only large transfers cause the problem, it might have something to do with the MTU?

MTU may have something to do with this stability issue, but there are reports of problems occurring even with MTU values of 1500.

It might possibly relate to the hardware-assisted functions of the transmission on the NIC side.

I would like to know if disabling those features by the following command will make a difference in stability.

ethtool -K eth2 tso off
ethtool -K eth2 gso off
ethtool -K eth2 sg off
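To see whether the commands above actually changed anything, the current state can be read back with the lowercase `-k` option (uppercase `-K` sets, lowercase `-k` shows). A minimal sketch, assuming the usual feature names printed by `ethtool -k`; the helper name is made up for illustration:

```shell
#!/bin/sh
# Sketch only: filters `ethtool -k <iface>` output (read from stdin)
# down to the three features targeted above.
offload_state() {
    grep -E '^(tcp-segmentation-offload|generic-segmentation-offload|scatter-gather):'
}
```

Usage: `ethtool -k eth2 | offload_state`, once before and once after turning the features off.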

bb-qq avatar Dec 10 '22 09:12 bb-qq

ethtool -K eth2 tso off
ethtool -K eth2 gso off
ethtool -K eth2 sg off

Thanks, going to try.

How can I check the current values before issuing these commands?

And are these new values reverted on NAS restart, or are they persistent?

andrus2049 avatar Dec 11 '22 17:12 andrus2049

I tested the connection with the suggested changes. It's a pity, but nothing has changed. DS218 & rtl8156 2.5

NikitaOsotsky avatar Dec 12 '22 09:12 NikitaOsotsky

I also tried ipv6 access https://[fe80::XXXX:XXXX:XXXX:XXXX]:5001/ I tried to download the file and it didn't help either

NikitaOsotsky avatar Dec 12 '22 09:12 NikitaOsotsky

DS920+ 2.16.3-3 DSM7.x (reuploaded) lan rtd1296 (ks-is ks-714) https://ks-is.com/usb-3-1-ethernet-adapter-ks-is-ks-714?tag=2.5G

There are no problems with data transfer. Especially for a couple of hours I drove chia 100gb plots at a speed of 2.5. But there is another problem! When you pull out and put back the adapter, the driver turns off. The same goes for rebooting. Must be manually enabled in the web interface.

rtd1296 is the cpu for entry level synology, but not for DS920+.

cqwangding avatar Dec 14 '22 22:12 cqwangding

ethtool -K eth2 tso off
ethtool -K eth2 gso off
ethtool -K eth2 sg off

Using a DS418 with a TRENDnet TUC-ET2G and the r8152 driver. This managed to get me 2.5 Gbps speeds briefly (and for longer than it would previously hold a connection at all), but the connection ultimately shut down. It does seem like I was getting 2500 Mbps upload and only about 1000 Mbps download.

jebug29 avatar Dec 24 '22 05:12 jebug29

Have there been any updates for the DS418 with the TRENDnet 2.5G USB-C to RJ-45 adapter? I bought this adapter before looking here, expecting it to work, yet I am seeing the driver issues described above. After it failed, I ran the SSH commands and then saw the connection under Network. It showed as connected, yet it had a 169. address. I am also using a TRENDnet 5-port unmanaged 2.5G PoE+ switch with its own AC adapter. After a reboot the connection is completely gone. I had to run the RT App after rebooting, as it did not restart automatically.

After assigning a static IP, I am now showing: 2500mbps Full Duplex 1500 MTU

I will put it to test with a few file transfers small and large tomorrow when I get up.

dlbomber1974 avatar Dec 30 '22 04:12 dlbomber1974

Thanks all, it looks like disabling GSO and TSO didn't make much difference in stability.

These settings will revert after reboot. If they have any effect, please register them in the task scheduler or something so that they are configured at startup.
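A boot-time script for this could look roughly like the sketch below. It assumes the adapter is `eth2` and that the script is run as root at startup (for example from a scheduled task); the variable and function names are illustrative only.

```shell
#!/bin/sh
# Sketch of a startup script; IFACE and FEATURES are assumptions to adapt.
IFACE="${IFACE:-eth2}"
FEATURES="tso gso sg"

# Disable each offload feature. Pass "echo" as the first argument for a dry run
# that only prints the commands instead of executing them.
apply_offload_settings() {
    runner="${1:-}"
    for feature in $FEATURES; do
        $runner ethtool -K "$IFACE" "$feature" off
    done
}
```

Call `apply_offload_settings echo` first to check the generated commands, then `apply_offload_settings` (no argument) at the end of the script to apply them.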

bb-qq avatar Dec 31 '22 04:12 bb-qq

Same here with my 218play. I tried two different adapters with 8152 chipset (none of the recommended adapters yet). After some research, I found on the internet that the error is very common. It can be seen well in the /var/log/kern.log. Unfortunately, I could not find a solution to the problem. The error occurs with large amounts of data. The connection is interrupted for about 45 seconds, the NAS is then also not accessible via ping.

This is what the kern.log looks like:

2023-01-06T20:03:46+01:00 diskstation kernel: [470548.442279] r8152 3-1:1.0 eth1: Tx timeout
2023-01-06T20:03:46+01:00 diskstation kernel: [470548.448949] r8152 3-1:1.0 eth1: Tx status -2
2023-01-06T20:03:46+01:00 diskstation kernel: [470548.453431] r8152 3-1:1.0 eth1: Tx status -2
2023-01-06T20:03:46+01:00 diskstation kernel: [470548.457911] r8152 3-1:1.0 eth1: Tx status -2
2023-01-06T20:03:46+01:00 diskstation kernel: [470548.462397] r8152 3-1:1.0 eth1: Tx status -2
2023-01-06T20:03:48+01:00 diskstation kernel: [470550.434430] r8152 3-1:1.0 eth1: get_registers -108
2023-01-06T20:03:48+01:00 diskstation kernel: [470550.439501] r8152 3-1:1.0 eth1: get_registers -71
2023-01-06T20:03:48+01:00 diskstation kernel: [470550.444479] r8152 3-1:1.0 eth1: get_registers -71
2023-01-06T20:03:48+01:00 diskstation kernel: [470550.449441] r8152 3-1:1.0 eth1: get_registers -71
2023-01-06T20:03:48+01:00 diskstation kernel: [470550.454439] r8152 3-1:1.0 eth1: get_registers -71
2023-01-06T20:03:48+01:00 diskstation kernel: [470550.459401] r8152 3-1:1.0 eth1: get_registers -71
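For reading logs like the one above, the negative numbers are Linux errno values. The sketch below maps the three codes seen here to their names (generic Linux numbering, as used on ARM) and counts timeout events. The function names are made up for illustration, and the short meanings in the comments are the usual USB interpretation, not something confirmed for this specific failure.

```shell
#!/bin/sh
# Sketch only: helper names are illustrative.

# Map a status code from the log to its errno name.
usb_status_name() {
    case "$1" in
        -2)   echo "ENOENT" ;;      # URB was unlinked (cancelled) by the driver
        -71)  echo "EPROTO" ;;      # low-level USB protocol error
        -108) echo "ESHUTDOWN" ;;   # device or host controller no longer available
        *)    echo "unknown" ;;
    esac
}

# Count "Tx timeout" events in a kernel log read from stdin.
count_tx_timeouts() {
    grep -c 'Tx timeout'
}
```

Usage: `usb_status_name -108`, or `count_tx_timeouts < /var/log/kern.log` to see how often the timeout has occurred.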

I operate it at 1 Gbps, not at 2.5 Gbps. So your workaround ("May operate stably when linked at 1 Gbps") has no effect here.

Next week I will get the club 3D USB adapter with 8156 chipset. I will test it and report if the error also occurs.

As written before, there are some articles and forums about this topic, here are some of them, don't know if it could help in our environment:

https://portal.cloudunboxed.net/knowledgebase/55/How-to-fix-Realtek-USB-NIC-TX-timeout-issues.html

https://forum.odroid.com/viewtopic.php?f=212&t=45857

https://bugzilla.kernel.org/show_bug.cgi?id=198931

And by the way: I tested another adapter with the 8169 chipset (together with your 8152 driver). It worked, but with poor performance (about 30 MB/s).

Dayofwonder avatar Jan 06 '23 20:01 Dayofwonder

Coming back to test my DS418 NAS with my TRENDnet 2.5GbE setup, I saw that my connection still showed as connected, but I had no ping and the port was unresponsive. I saw the MAC address but no IP address, even after several reboots, uninstalls, reinstalls, etc. I read in other places that this has occurred, so I had to dust off my old Linux hat and found a short remedy for this. I also noticed that, regardless of setting the MTU to 9000 in the GUI, it still shows up as MTU 1500.

sudo /etc/rc.network restart

My connection back up, IP now showing and pingable.

eth2      Link encap:Ethernet  HWaddr 3C:8C:F8:60:0A:94
          inet addr:192.168.1.201  Bcast:192.168.1.255  Mask:255.255.255.0
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:59735 errors:0 dropped:0 overruns:0 frame:0
          TX packets:39 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:3237395 (3.0 MiB)  TX bytes:9563 (9.3 KiB)
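If the restart has to be repeated often, it could be wrapped in a small check that only restarts networking when a host outside the NAS stops answering. This is a sketch under assumptions: the target address `192.168.1.1` is a guess at a router address, and the variable and function names are illustrative. It only brings the link back and does not address the underlying failure.

```shell
#!/bin/sh
# Sketch only: TARGET, CHECK_CMD and RESTART_CMD are assumptions; override as needed.
TARGET="${TARGET:-192.168.1.1}"
CHECK_CMD="${CHECK_CMD:-ping -c 3 -W 2 $TARGET}"
RESTART_CMD="${RESTART_CMD:-/etc/rc.network restart}"

# Run one connectivity check; restart the network only if the check fails.
watchdog_once() {
    if $CHECK_CMD >/dev/null 2>&1; then
        echo "ok"
    else
        echo "restarting"
        $RESTART_CMD
    fi
}
```

Running `watchdog_once` as root every few minutes from a scheduled task would be one way to use it.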

Now I will go forward with my testing.

dlbomber1974 avatar Jan 08 '23 04:01 dlbomber1974

Coming back to test my DS418 NAS with my TRENDnet 2.5GbE setup, I saw that my connection still showed as connected, but I had no ping and the port was unresponsive. I saw the MAC address but no IP address, even after several reboots, uninstalls, reinstalls, etc. I read in other places that this has occurred, so I had to dust off my old Linux hat and found a short remedy for this. I also noticed that, regardless of setting the MTU to 9000 in the GUI, it still shows up as MTU 1500.

sudo /etc/rc.network restart

Yes, this is ONE way. For me it works to stop and restart the installed driver in the package center ... But this isn't a workaround as long as I won't be able to download any file from the NAS.

Dayofwonder avatar Jan 08 '23 12:01 Dayofwonder

Just tested with Club 3D USB adapter. Test failed. Upload speed is a disaster (worst of all devices).

And downloads don't start at all.

2023-01-10T17:21:49+01:00 diskstation kernel: [806423.856275] r8152 3-1:1.0 eth1: Tx timeout
2023-01-10T17:21:49+01:00 diskstation kernel: [806423.862963] r8152 3-1:1.0 eth1: Tx status -2
2023-01-10T17:21:51+01:00 diskstation kernel: [806425.820037] r8152 3-1:1.0 eth1: get_registers -108
2023-01-10T17:21:51+01:00 diskstation kernel: [806425.825106] r8152 3-1:1.0 eth1: get_registers -71
2023-01-10T17:21:51+01:00 diskstation kernel: [806425.870979] xhci-hcd xhci-hcd.2.auto: URB transfer length is wrong, xHC issue? req. len = 4, act. len = 4294967292
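A side note on the last log line: 4294967292 equals 2^32 - 4, which is what -4 looks like when stored in an unsigned 32-bit field. Whether that is really what happened inside the host controller driver is an assumption, but the arithmetic itself can be checked with a small helper (the function name is made up for illustration):

```shell
#!/bin/sh
# Sketch only: reinterpret an unsigned 32-bit value as a signed 32-bit integer.
as_int32() {
    value="$1"
    if [ "$value" -ge 2147483648 ]; then
        echo $(( $value - 4294967296 ))
    else
        echo "$value"
    fi
}
```

`as_int32 4294967292` prints `-4`, so the reported "actual length" is consistent with a small negative number that wrapped around.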

So: None of my 3 different adapters do the trick.

Dayofwonder avatar Jan 10 '23 16:01 Dayofwonder

For now I use 2 connections: LAN for downloads from the NAS, USB for uploads. I observe quite good performance for uploads, about 80-120 MB/s (well, of course this is not 2.5 Gbps speed, but it is more than before and might be limited by my HDDs) with this adapter: https://www.digitec.ch/de/s1/product/digitus-usb-type-c-gigabit-ethernet-adapter-25g-usb-c-usb-a-usb3130-usb-c-usb-31-netzwerkadapter-16185124 As written before: as soon as I try any download via USB, the USB connection crashes and I have to stop and restart the driver in the Package Center.

Dayofwonder avatar Jan 10 '23 18:01 Dayofwonder

Driver version 2.15.0-10 tested last night on a 418j, and I hit the same issue when downloading a large file (over 500 MB) from the NAS. The download failed and the NAS stopped responding to ping. In my case, reconnecting the LAN cable between the NAS and the router does fix the no-response situation, but the download issue is repeatable. I tried linking the NAS and PC directly with one cable and hit the same issue. With the cable plugged into the NAS's onboard 1G port, everything is OK.

Voidnickyname avatar Jan 20 '23 02:01 Voidnickyname

I tested both adapters from CableMatters and ASUS on a DS220j and got the same behavior: uploads work as expected but downloads make the adapter hang. Tested with iperf3.

[Screenshots: iperf3 results, 2023-01-30]
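For anyone wanting to reproduce this kind of test, both directions can be measured from the PC with iperf3's reverse mode. A minimal sketch, assuming `iperf3 -s` is already running on the NAS; the helper name and the 30-second duration are illustrative choices, not what was used in the screenshots.

```shell
#!/bin/sh
# Sketch only: builds the iperf3 client command for one direction of the test.
# "upload" means PC -> NAS, "download" means NAS -> PC (iperf3 reverse mode, -R).
iperf_cmd() {
    direction="$1"
    nas_ip="$2"
    case "$direction" in
        upload)   echo "iperf3 -c $nas_ip -t 30" ;;
        download) echo "iperf3 -c $nas_ip -t 30 -R" ;;
        *)        echo "usage: iperf_cmd upload|download <nas-ip>" >&2; return 1 ;;
    esac
}
```

The `download` direction is the one where the NAS is sending, which is the case reported to hang in this thread.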

jvalenciag avatar Jan 30 '23 17:01 jvalenciag

Same here with a DS220j. Uploading works fine even with large files. Downloading breaks instantly. Tested with an adapter with the RTL8156B chipset.

javitoalon avatar Feb 05 '23 19:02 javitoalon

The same problem on model DS220j, like many who have already written here. Works unstable. Especially if you give the load on the interface. If it helps in any way, I could help with testing and even provide access to my device, for example, through a mesh network.

alex-arzner-pro avatar Feb 21 '23 21:02 alex-arzner-pro

Did someone follow these instructions named "How to fix Realtek USB NIC TX timeout issues"? https://portal.cloudunboxed.net/knowledgebase/55/How-to-fix-Realtek-USB-NIC-TX-timeout-issues.html

I am not THAT big linux specialist and hesitate to try this workaround ...

Dayofwonder avatar Feb 27 '23 12:02 Dayofwonder

Did someone follow these instructions named "How to fix Realtek USB NIC TX timeout issues"? https://portal.cloudunboxed.net/knowledgebase/55/How-to-fix-Realtek-USB-NIC-TX-timeout-issues.html

I am not THAT big linux specialist and hesitate to try this workaround ...

The kernel is already disabling AutoSuspend USB Power Mode:

$ cat /sys/module/usbcore/parameters/autosuspend
-1
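The value is the autosuspend delay in seconds, and a negative value means autosuspend is disabled. A small sketch for interpreting it (the function name is made up for illustration):

```shell
#!/bin/sh
# Sketch only: interpret the usbcore autosuspend delay (in seconds).
# A negative value disables USB autosuspend.
autosuspend_state() {
    value="$1"
    if [ "$value" -lt 0 ]; then
        echo "disabled"
    else
        echo "enabled after ${value}s idle"
    fi
}
```

Usage: `autosuspend_state "$(cat /sys/module/usbcore/parameters/autosuspend)"`.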

Romeo1984 avatar Feb 27 '23 19:02 Romeo1984

I can give it a whirl this afternoon. I am pretty savvy with Linux. I’ll post feedback.

dlbomber1974 avatar Feb 27 '23 21:02 dlbomber1974

I looked at this. Synology OS does not use Grub. I couldn’t find any articles on how to pass kernel parameters.

Romeo1984 avatar Feb 27 '23 21:02 Romeo1984

I hadn’t had a chance to look at it. Grub is not supported. Is that the only way the article gives?

dlbomber1974 avatar Feb 27 '23 21:02 dlbomber1974

Yes.

Romeo1984 avatar Feb 27 '23 21:02 Romeo1984

I do have a question. Are you guys directing specific traffic to the USB 3 port or did you bind the port?

dlbomber1974 avatar Feb 27 '23 22:02 dlbomber1974

I have not configured anything else. In order to be able to use the higher speed at all, I use the second IP address (USB3) for uploading files (films, pictures) to my NAS from the network, and for any download, i.e. the normal retrieval of files, I use the normal LAN port.

Dayofwonder avatar Feb 28 '23 08:02 Dayofwonder

Potentially stupid question: Has anybody tried running the 2.5GbE adapter from a powered USB hub? I have this running perfectly on my DS720+ this way. I am wondering if this is a power issue somehow. I have heard reports that the DS220j has underpowered USB ports. I would like to hear if somebody has tried it; if not, I might order the same one I am running on my DS720+ and just try it.

Romeo1984 avatar Feb 28 '23 20:02 Romeo1984

I just tried using a powered Dell USB dock with the same result: uploads are fine, but downloads break. The weird thing is that internally you can still ping the 2.5GbE interface, but not from outside. After taking eth1 down and up with ifconfig, it returns to normal.

javitoalon avatar Mar 01 '23 14:03 javitoalon

Confirmed - I too tried a powered USB dock with the same result.

Romeo1984 avatar Mar 02 '23 20:03 Romeo1984

I have seen people reporting that the DS218j works fine with an Asus dongle, both for uploads and downloads. Is the DS220j so different from it? I know the DS218j is Armada38x, but still, it is kind of weird that it works so badly on our DS220j with regular RTL8156B USB adapters.

javitoalon avatar Mar 03 '23 12:03 javitoalon