DS918+ with Sabrent NT-SS5G - Random Crash, DSM unresponsive, Unsafe Shutdown
Description of the problem
Hi, since mid-December I have been experiencing random crashes and disconnects on my DS918+ with my Sabrent NT-SS5G adapter, using the latest driver (v. 1.3.3.0-10). DSM itself becomes totally unresponsive and won't let me stop or restart the driver in Package Center; after 1-2 minutes, the whole NAS suddenly crashes and restarts. The NAS then reports that the system was shut down unsafely and starts Data Scrubbing once booted. This happens every 3-4 days. I tried both the rear and front USB ports of the NAS, but the issue remained.
Description of your products
- NAS: Synology DS918+
- DSM: 7.1.1-42962 Update 3
- Adapter: SABRENT NT-SS5G
- Driver: 1.3.3.0-10 DSM-7.x (reuploaded)
- RAM: 16GB
- Other USB Port used for: (UPS) CP1500EPFCLCD - Cyber Power System, Inc.
Description of your environment
- Connection: From "DS918+" to PC's NIC "Marvell® AQtion AQC107 10Gb Ethernet"
- PC Motherboard: ASUS ROG MAXIMUS XII FORMULA Z490
- PC OS: Windows 11 Pro 22H2
- Ethernet Driver version: 3.1.7.0
- Cable: VENTION 1m CAT 8 Ethernet Cable
- Connection used for: SMB, WinNUT-2.0 (UPS)
The adapter was working fine before December without any issues. Could this have been caused by the latest DSM update (Update 3)? I hope you can help fix this. Thank you!
Do you have any other USB devices connected, and what are the results of lsusb -a?
Hi, I am now using my previous 2.5G CLUB 3D CAC-1420 Adapter with the driver "r8152, 2.16.3-3 DSM7.x (reuploaded)", which works fine without any issues.
Only the Ethernet Adapter and the UPS are connected, nothing else. Please see below the output of lsusb:
|__usb1  1d6b:0002:0404 09 2.00  480MBit/s 0mA 1IF (Linux 4.4.180+ xhci-hcd xHCI Host Controller 0000:00:15.0) hub
  |__1-3  0764:0501:0001 00 2.00   12MBit/s 2mA 1IF (CPS CP1500EPFCLCD CRXLW2000395)
  |__1-4  f400:f400:0100 00 2.00  480MBit/s 200mA 1IF (Synology DiskStation 7F008AFA20E41640)
|__usb2  1d6b:0003:0404 09 3.00 5000MBit/s 0mA 1IF (Linux 4.4.180+ xhci-hcd xHCI Host Controller 0000:00:15.0) hub
  |__2-2  0bda:8156:3000 00 3.20 5000MBit/s 512mA 1IF (Realtek USB 10/100/1G/2.5G LAN 000000001)
Hmmm, from the symptoms it looks like a problem with the NT-SS5G, you might want to connect it to your PC to see if there are any stability issues.
Or you could try the QNA-UC5G1T if you can return NT-SS5G. I am also using a DS918+ and this device is running stable.
Thank you, I followed your advice and ordered the QNA-UC5G1T. Will provide feedback in the next couple of days after testing.
So, I have returned the NT-SS5G and got the QNA-UC5G1T. It's running fine now for 24 hours without crashing. I will monitor this for at least a week and update you again. I have noticed that my max speed is 355-360 MB/s (SMB). If you are using a Windows PC, could you share the Network Adapter settings of your NIC in device manager? I could possibly tweak a little to get the full speed.
Providing iperf3 output: (Only getting a max of 355-360 MB/s (SMB) as mentioned above) OS: Windows 11 Pro 22H2
iperf3 -c 192.168.xx.xx -P 2
Connecting to host 192.168.xx.xx, port 5201
[  4] local 192.168.yy.yy port 61286 connected to 192.168.xx.xx port 5201
[  6] local 192.168.yy.yy port 61287 connected to 192.168.xx.xx port 5201
[ ID] Interval           Transfer     Bandwidth
[  4]   0.00-1.00   sec   186 MBytes  1.56 Gbits/sec
[  6]   0.00-1.00   sec   186 MBytes  1.56 Gbits/sec
[SUM]   0.00-1.00   sec   372 MBytes  3.12 Gbits/sec

[  4]   1.00-2.00   sec   201 MBytes  1.69 Gbits/sec
[  6]   1.00-2.00   sec   200 MBytes  1.68 Gbits/sec
[SUM]   1.00-2.00   sec   401 MBytes  3.37 Gbits/sec

[  4]   2.00-3.00   sec   194 MBytes  1.63 Gbits/sec
[  6]   2.00-3.00   sec   190 MBytes  1.60 Gbits/sec
[SUM]   2.00-3.00   sec   384 MBytes  3.22 Gbits/sec

[  4]   3.00-4.00   sec   208 MBytes  1.74 Gbits/sec
[  6]   3.00-4.00   sec   206 MBytes  1.73 Gbits/sec
[SUM]   3.00-4.00   sec   414 MBytes  3.47 Gbits/sec

[  4]   4.00-5.00   sec   171 MBytes  1.43 Gbits/sec
[  6]   4.00-5.00   sec   170 MBytes  1.43 Gbits/sec
[SUM]   4.00-5.00   sec   340 MBytes  2.86 Gbits/sec

[  4]   5.00-6.00   sec   205 MBytes  1.72 Gbits/sec
[  6]   5.00-6.00   sec   204 MBytes  1.71 Gbits/sec
[SUM]   5.00-6.00   sec   409 MBytes  3.43 Gbits/sec

[  4]   6.00-7.00   sec   195 MBytes  1.64 Gbits/sec
[  6]   6.00-7.00   sec   194 MBytes  1.63 Gbits/sec
[SUM]   6.00-7.00   sec   389 MBytes  3.27 Gbits/sec

[  4]   7.00-8.00   sec   203 MBytes  1.70 Gbits/sec
[  6]   7.00-8.00   sec   202 MBytes  1.70 Gbits/sec
[SUM]   7.00-8.00   sec   406 MBytes  3.40 Gbits/sec

[  4]   8.00-9.00   sec   194 MBytes  1.62 Gbits/sec
[  6]   8.00-9.00   sec   192 MBytes  1.61 Gbits/sec
[SUM]   8.00-9.00   sec   386 MBytes  3.23 Gbits/sec

[  4]   9.00-10.00  sec   209 MBytes  1.75 Gbits/sec
[  6]   9.00-10.00  sec   208 MBytes  1.75 Gbits/sec
[SUM]   9.00-10.00  sec   417 MBytes  3.50 Gbits/sec

[ ID] Interval           Transfer     Bandwidth
[  4]   0.00-10.00  sec  1.92 GBytes  1.65 Gbits/sec  sender
[  4]   0.00-10.00  sec  1.92 GBytes  1.65 Gbits/sec  receiver
[  6]   0.00-10.00  sec  1.91 GBytes  1.64 Gbits/sec  sender
[  6]   0.00-10.00  sec  1.91 GBytes  1.64 Gbits/sec  receiver
[SUM]   0.00-10.00  sec  3.83 GBytes  3.29 Gbits/sec  sender
[SUM]   0.00-10.00  sec  3.83 GBytes  3.29 Gbits/sec  receiver
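For comparing the iperf3 result with the SMB figure, it may help to put both in the same unit. The sketch below converts Gbit/s to MB/s, assuming decimal units (1 Gbit = 1000 Mbit, 8 bits per byte); `gbits_to_mbytes` is a hypothetical helper name, not part of iperf3.

```sh
#!/bin/sh
# Hypothetical helper: convert a rate in Gbit/s (decimal) to MB/s (decimal),
# rounded to the nearest whole number.
gbits_to_mbytes() {
    awk -v g="$1" 'BEGIN { printf "%.0f\n", g * 1000 / 8 }'
}

gbits_to_mbytes 3.29   # the [SUM] rate from the run above
gbits_to_mbytes 5      # nominal 5GbE line rate
```

Under these assumptions the measured 3.29 Gbits/sec is about 411 MB/s, so 355-360 MB/s over SMB is not far below what the link itself delivered in this test. The exact gap depends on whether the SMB figure is reported in decimal or binary megabytes.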
Update: Since my last post, it has disconnected 4 times, and each time I had to manually stop the driver and start it again. The good news: it didn't freeze, crash or restart my NAS. Have you encountered this problem?
I'm experiencing the same issue with my DS920+. I have also returned the NT-SS5G and got the QNA-UC5G1T. Then I even got the recommended SABRENT hub with power adapter, but the issue still persists. One time my NAS restarted by itself, so that was bad. But usually it just loses connection and I need to restart the driver. Most of the time I can restart the driver, but sometimes it's just impossible to do this.
I have installed an older driver version, "1.3.3.0-8 DSM-7.x". It has been working completely fine without a single crash or reboot since 25th February. See if that works for you.
Thanks. I have downgraded to 1.3.3.0-8. I kind of know what to do to make the driver crash so I'll test it out.
Nope, already had 2 improper shutdowns. Downgrading does not fix the issue for me.
Same here, just crashed the whole system, rebooted and started Data Scrubbing. I went back to the 2.5G Adapter now.
OK, now my 2.5G adapter crashes too with the latest "r8152" driver. As I mentioned in my initial post, I believe something got messed up after DSM Update 3.
I would love to hear from @bb-qq regarding this issue ? Is there a way I can help to pinpoint the problem ?
I am wondering how much traffic is flowing through the adapter before it becomes unstable. Heat might be causing the problem.
If you plugged that adapter into a Windows PC and kept the same amount of traffic flowing through it, would it work stably for an extended period of time?
I am also curious as to how much memory you have in your NAS.
The versions of the driver discussed in this thread include changes in kernel parameters related to memory, so it is possible that those changes are causing the problem.
I got 16GB Memory installed (2x 8GB) from Crucial. Traffic does not seem to be an issue for me as the driver randomly crashes even when transferring some photos or multiple documents. Another scenario, when I open Surveillance Station on my PC or backup using Synology Drive, then the driver randomly crashes too. I have tried the adapter on Windows 10 & 11 and copied multiple GB files without any issues, didn't crash.
The changes in Kernel Parameters could be true as the issue started with the DSM Update 3. Is there a fix for it?
I've got 20GB of RAM (4+16). I also don't think it's about the amount of traffic or the temperature, but I can't be 100% sure. For me, crashes happen when I do something with WebDAV and Plex, like streaming from a WebDAV server, but sometimes also when just refreshing the metadata in Plex. The only thing I can say about the temperature is that one time when it crashed I touched the casing of the QNA-UC5G1T and it was just barely warm. Is there a way to check the internal temperature of QNA-UC5G1T? I do have both "Low Power 5G" and "Thermal throttling" set to ON to keep the temperature in check.
> The changes in Kernel Parameters could be true as the issue started with the DSM Update 3. Is there a fix for it?
I was mentioning the changes on the driver's side. (https://github.com/bb-qq/aqc111/issues/96#issuecomment-1461841186) I don't know the details of the changes on the DSM side.
> Is there a way to check the internal temperature of QNA-UC5G1T?
As far as I know, there is no way to know the internal temperature. The only measure I can think of is to place it in a well-ventilated area and see the difference. (I saw a post once that said removing the case and installing a fan stabilized it, but I think it would be risky to go that far.)
I don't have any ideas for investigating the cause, but since your NAS seems to have plenty of memory, could you try doubling the value of target_value in /var/packages/aqc111/scripts/apply-memory-setting? It is unlikely to improve the situation, though.
Thanks for your reply @bb-qq I have now doubled the target value and restarted the NAS. Will test it out and provide feedback.
root@
set -eu
# value before the change
target_value=524288
current_value=$(sysctl -n vm.min_free_kbytes)
# only raise the setting, never lower it
if [ "${current_value}" -lt "${target_value}" ]
then
    sysctl -w vm.min_free_kbytes="${target_value}"
fi
root@
set -eu
# doubled value
target_value=1048576
current_value=$(sysctl -n vm.min_free_kbytes)
# only raise the setting, never lower it
if [ "${current_value}" -lt "${target_value}" ]
then
    sysctl -w vm.min_free_kbytes="${target_value}"
fi
root@
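To put the two target_value settings in perspective: vm.min_free_kbytes is expressed in kilobytes, so the change above moves the reserve from 512 MiB to 1 GiB. A minimal sketch of that arithmetic and of the "only raise, never lower" comparison follows; `kb_to_mib` and `should_raise` are hypothetical helper names and are not part of the aqc111 package.

```sh
#!/bin/sh
# Hypothetical helpers for illustration only.

# Convert a vm.min_free_kbytes value (kB) to MiB using integer division.
kb_to_mib() {
    echo $(( $1 / 1024 ))
}

# Exit status 0 when the current value ($1) is below the target ($2),
# i.e. the same test apply-memory-setting performs before calling sysctl -w.
should_raise() {
    [ "$1" -lt "$2" ]
}

kb_to_mib 524288     # value before the change
kb_to_mib 1048576    # doubled value
```

On the NAS, the live value can be read with `sysctl -n vm.min_free_kbytes` and passed to `should_raise` as the first argument.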
Hi @bb-qq - Whole NAS crashed in the morning. I turned the PC on and opened a file (Excel spreadsheet) via SMB, the adapter itself was cold, not even slightly warm and it crashed the whole NAS and rebooted. Upon boot, it started data scrubbing on the volume. Also want to mention, I ran the Memory Test via Synology Assistant last night and it passed without any errors. No idea what else I can do to troubleshoot.
Since you have the same Synology model, have you not encountered any of these issues yourself? Do you mind me asking what your specs are, i.e. Memory (Official/Unofficial), DSM version, NIC on the PC and the driver version of that. I am wondering whether my PC's NIC driver could be causing these crashes. I am using the latest driver from Marvell (v3.1.7.0).
> Since you have the same Synology model, have you not encountered any of these issues yourself?
A few times a year, when I did not have low power mode enabled on the device, it would stop responding and I had to reload the driver. However, I have never experienced a NAS crash.
> Do you mind me asking what your specs are, i.e. Memory (Official/Unofficial), DSM version, NIC on the PC and the driver version of that.
My environment is as follows:
- Memory: Unofficial
$ sudo dmidecode --type memory
# dmidecode 3.2
Getting SMBIOS data from sysfs.
SMBIOS 3.0.0 present.

Handle 0x0023, DMI type 16, 23 bytes
Physical Memory Array
    Location: System Board Or Motherboard
    Use: System Memory
    Error Correction Type: None
    Maximum Capacity: 16 GB
    Error Information Handle: No Error
    Number Of Devices: 2

Handle 0x0024, DMI type 17, 40 bytes
Memory Device
    Array Handle: 0x0023
    Error Information Handle: No Error
    Total Width: 8 bits
    Data Width: 8 bits
    Size: 8192 MB
    Form Factor: SODIMM
    Set: None
    Locator: ChannelA-DIMM0
    Bank Locator: BANK 0
    Type: DDR3
    Type Detail: Synchronous
    Speed: 1600 MT/s
    Manufacturer: Samsung
    Serial Number: 35701618
    Asset Tag: 9876543210
    Part Number: M471B1G73BH0-YK0
    Rank: Unknown
    Configured Memory Speed: 1600 MT/s
    Minimum Voltage: Unknown
    Maximum Voltage: Unknown
    Configured Voltage: Unknown

Handle 0x0025, DMI type 17, 40 bytes
Memory Device
    Array Handle: 0x0023
    Error Information Handle: No Error
    Total Width: 8 bits
    Data Width: 8 bits
    Size: 8192 MB
    Form Factor: SODIMM
    Set: None
    Locator: ChannelB-DIMM0
    Bank Locator: BANK 1
    Type: DDR3
    Type Detail: Synchronous
    Speed: 1600 MT/s
    Manufacturer: Samsung
    Serial Number: 35701618
    Asset Tag: 9876543210
    Part Number: M471B1G73BH0-YK0
    Rank: Unknown
    Configured Memory Speed: 1600 MT/s
    Minimum Voltage: Unknown
    Maximum Voltage: Unknown
    Configured Voltage: Unknown
- DSM version: 7.1.1-42962 Update 4
$ cat /etc/VERSION
majorversion="7"
minorversion="1"
major="7"
minor="1"
micro="1"
productversion="7.1.1"
buildphase="GM"
buildnumber="42962"
smallfixnumber="4"
nano="4"
base="42962"
builddate="2023/02/01"
buildtime="20:01:57"
- QNA-UC5G1T FW version: 3.1.6 (latest FW on the QNAP website)
- Connected USB port: front port with a stock cable
- PC NIC: AQN-107 (direct connection)
- PC NIC Driver: 2.2.3.0
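When comparing environments like the one listed above, the DSM version string can be assembled directly from the VERSION file, since it consists of shell-style key="value" lines. This is a sketch only; `dsm_version` is a hypothetical helper, and the output format simply mirrors the "7.1.1-42962 Update 4" style used in this thread.

```sh
#!/bin/sh
# Hypothetical helper: print a DSM version string from a VERSION-style file.
dsm_version() {
    # Source the file in a subshell so its variables do not leak into the caller.
    (
        . "$1"
        echo "${productversion}-${buildnumber} Update ${smallfixnumber}"
    )
}

# On the NAS (path as shown in the post above):
#   dsm_version /etc/VERSION
```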
Thank you - the specs look nearly identical to mine. The last option I could try is to update to the DSM 7.2 BETA version and see if that makes any difference. It would be great if you can provide an updated driver that will work with the 7.2 Beta. Thanks
I created drivers for the DSM 7.2 BETA, but I think it is unlikely that the DSM update will improve symptoms. https://github.com/bb-qq/aqc111/releases/tag/1.3.3.0-11
I wish I could at least find the cause of the reboot....
Thank you @bb-qq , appreciated. I have also ordered 2x 4GB Memory, which is the maximum supported Memory as per Intel's website for the INTEL Celeron J3455. Some users claim it won't utilise anything above 8GB or if it tries, the system crashes, so let me find out if this makes any difference. If you require any system outputs/logs from me, please let me know.
bb-qq already said he also has 2x8GB, so I don't think that's it. I'm currently testing something and it's looking good. I'm going to stay with 1.3.3.0-10 while I test my thing. By the way, how full is your system partition (/dev/md0)? You can check with df -h.
@jaqb - Here you go. Looking forward to hearing about your test results. Does this look right?
root@:~# df -h /dev/md0
Filesystem      Size  Used Avail Use% Mounted on
/dev/md0        2.3G  1.9G  365M  84% /
@jaqb - Just wondering, do you use your M.2 SSD as Cache or Volume? I had mine set up as volume for over a year and the aqc111 driver was installed on that volume (volume2) - Upon checking the log files (/var/log/messages), I found quite a few error messages related to volume2.
synostgvolume[840]: fs_btrfs_metadata_usage_query.c:137 Failed to check the btrfs metadata usage of volume [/volume2].
The above message is repeated multiple times. I have now removed volume2 and am using the SSD as a normal cache. I also replaced the 16GB RAM with 2x 4GB. So far it runs stably; even booting/restarting the NAS is much faster than before. Will test and provide feedback.
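To see how often that message appears, something like the following could be used. It is a sketch that matches the exact wording quoted above; `count_btrfs_meta_errors` is a hypothetical helper name, and the log path is the one mentioned in this post.

```sh
#!/bin/sh
# Hypothetical helper: count btrfs metadata-usage failures for one volume.
#   $1 = log file, $2 = volume path such as /volume2
count_btrfs_meta_errors() {
    # -F treats the pattern as a fixed string, so the brackets are literal.
    # grep -c exits non-zero when the count is 0, hence the "|| true".
    grep -c -F "Failed to check the btrfs metadata usage of volume [$2]" "$1" || true
}

# On the NAS:
#   count_btrfs_meta_errors /var/log/messages /volume2
```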
84% used seems about right. I am at 82% now, but I was at 100% a couple of days ago, which caused a lot of weird issues. I had to delete a bunch of logs to get it this low.
I use 2x M.2 SSDs as a read-write cache.
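Since a full system partition caused odd behaviour in at least one case in this thread, it may be worth checking the df output regularly. The sketch below extracts the Use% column as a bare number; `use_percent` and `partition_ok` are hypothetical helpers, and the 90% default threshold is an assumption rather than a documented Synology limit.

```sh
#!/bin/sh
# Hypothetical helper: read df output on stdin and print the Use% value
# of the last data line without the percent sign.
use_percent() {
    awk 'NR > 1 { gsub(/%/, "", $5); pct = $5 } END { print pct }'
}

# Exit status 0 while usage ($1) is below the threshold ($2, default 90).
# The default is an arbitrary choice for this sketch.
partition_ok() {
    [ "$1" -lt "${2:-90}" ]
}

# On the NAS:
#   pct=$(df -h /dev/md0 | use_percent)
#   partition_ok "$pct" || echo "system partition is ${pct}% full"
```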