linstor-gateway
Timeout creating large(?) nfs export on slow (hdd) storage
While trying to create a large export on HDD-backed storage, I get this error report:
linstor-gateway nfs create ls-nfs 1.2.3.4/24 512G -r pve-hdd -f ext4
ERROR REPORT 67E9C422-A21AF-000007
============================================================
Application: LINBIT® LINSTOR
Module: Satellite
Version: 1.30.4
Build ID: bef74a44609cb592c5efad2e707b50e696623c61
Build time: 2025-02-03T15:48:28+00:00
Error time: 2025-03-31 01:40:14
Node: node-4
Thread: DeviceManager
============================================================
Reported error:
===============
Category: LinStorException
Class name: StorageException
Class canonical name: com.linbit.linstor.storage.StorageException
Generated at: Method 'genericExecutor', Source file 'Commands.java', Line #120
Error message: Failed to mfks /dev/drbd1026
Error context:
An error occurred while processing resource 'Node: 'node-4', Rsc: 'ls-nfs''
ErrorContext:
Cause: External command timed out
Details: External command: mkfs.ext4 -q -E nodiscard -E root_owner=65534:65534 /dev/drbd1026
Call backtrace:
Method Native Class:Line number
genericExecutor N com.linbit.linstor.storage.utils.Commands:120
genericExecutor N com.linbit.linstor.storage.utils.Commands:63
genericExecutor N com.linbit.linstor.storage.utils.Commands:51
makeFs N com.linbit.linstor.storage.utils.MkfsUtils:96
makeExt4 N com.linbit.linstor.storage.utils.MkfsUtils:109
makeFileSystemOnMarked N com.linbit.linstor.storage.utils.MkfsUtils:222
condInitialOrSkipSync N com.linbit.linstor.layer.drbd.DrbdLayer:1714
adjustDrbd N com.linbit.linstor.layer.drbd.DrbdLayer:743
processResource N com.linbit.linstor.layer.drbd.DrbdLayer:249
lambda$processResource$4 N com.linbit.linstor.core.devmgr.DeviceHandlerImpl:1368
processGeneric N com.linbit.linstor.core.devmgr.DeviceHandlerImpl:1411
processResource N com.linbit.linstor.core.devmgr.DeviceHandlerImpl:1364
processResources N com.linbit.linstor.core.devmgr.DeviceHandlerImpl:386
dispatchResources N com.linbit.linstor.core.devmgr.DeviceHandlerImpl:228
dispatchResources N com.linbit.linstor.core.devmgr.DeviceManagerImpl:333
phaseDispatchDeviceHandlers N com.linbit.linstor.core.devmgr.DeviceManagerImpl:1148
devMgrLoop N com.linbit.linstor.core.devmgr.DeviceManagerImpl:778
run N com.linbit.linstor.core.devmgr.DeviceManagerImpl:674
run N java.lang.Thread:840
Caused by:
==========
Category: Exception
Class name: ChildProcessTimeoutException
Class canonical name: com.linbit.ChildProcessTimeoutException
Generated at: Method 'waitFor', Source file 'ChildProcessHandler.java', Line #133
Call backtrace:
Method Native Class:Line number
waitFor N com.linbit.extproc.ChildProcessHandler:133
syncProcess N com.linbit.extproc.ExtCmd:160
exec N com.linbit.extproc.ExtCmd:92
genericExecutor N com.linbit.linstor.storage.utils.Commands:79
genericExecutor N com.linbit.linstor.storage.utils.Commands:63
genericExecutor N com.linbit.linstor.storage.utils.Commands:51
makeFs N com.linbit.linstor.storage.utils.MkfsUtils:96
makeExt4 N com.linbit.linstor.storage.utils.MkfsUtils:109
makeFileSystemOnMarked N com.linbit.linstor.storage.utils.MkfsUtils:222
condInitialOrSkipSync N com.linbit.linstor.layer.drbd.DrbdLayer:1714
adjustDrbd N com.linbit.linstor.layer.drbd.DrbdLayer:743
processResource N com.linbit.linstor.layer.drbd.DrbdLayer:249
lambda$processResource$4 N com.linbit.linstor.core.devmgr.DeviceHandlerImpl:1368
processGeneric N com.linbit.linstor.core.devmgr.DeviceHandlerImpl:1411
processResource N com.linbit.linstor.core.devmgr.DeviceHandlerImpl:1364
processResources N com.linbit.linstor.core.devmgr.DeviceHandlerImpl:386
dispatchResources N com.linbit.linstor.core.devmgr.DeviceHandlerImpl:228
dispatchResources N com.linbit.linstor.core.devmgr.DeviceManagerImpl:333
phaseDispatchDeviceHandlers N com.linbit.linstor.core.devmgr.DeviceManagerImpl:1148
devMgrLoop N com.linbit.linstor.core.devmgr.DeviceManagerImpl:778
run N com.linbit.linstor.core.devmgr.DeviceManagerImpl:674
run N java.lang.Thread:840
END OF ERROR REPORT.
However,
linstor-gateway nfs create ls-nfs 1.2.3.4/24 512G -r pve-hdd -f xfs
completes successfully, so I assume it really is a timeout issue.
The timeout on the mkfs command is 45 seconds, which is pretty long for "only" a 512G disk. Some debugging questions to figure this out:
- How long does the second command (`nfs create ... -f xfs`) take?
- What about if you do a `mkfs.ext4` on the raw disks, without LINSTOR and DRBD involved? How long does that take?
- If that completes relatively quickly, try a `mkfs.ext4` with `-E nodiscard` to see if that has an effect.
- What is the network latency and bandwidth between these nodes?
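In case it helps, here is a rough sketch of how the timings asked about above could be collected. It is not part of linstor-gateway or LINSTOR: `time_cmd`, `DEV` and `PEER` are names made up for this example, and the script only runs `mkfs` if you explicitly set `DEV` to a scratch device. Be careful, `mkfs` destroys whatever is on the target.

```shell
#!/bin/sh
# Hypothetical helper, not part of LINSTOR: run a command, report the wall-clock
# time, and compare it to the 45-second mkfs timeout mentioned above.
TIMEOUT_SECS=45

time_cmd() {
    start=$(date +%s)
    "$@"
    status=$?
    end=$(date +%s)
    elapsed=$((end - start))
    echo "elapsed=${elapsed}s exit=${status} cmd=$*"
    if [ "$elapsed" -ge "$TIMEOUT_SECS" ]; then
        echo "note: this would have exceeded the ${TIMEOUT_SECS}s timeout"
    fi
    return "$status"
}

# DESTRUCTIVE: only runs when DEV is set, e.g. DEV=/dev/sdX (placeholder for a
# raw scratch disk with no LINSTOR/DRBD on top). mkfs may ask for confirmation
# if it finds an existing signature, which would skew the timing.
if [ -n "${DEV:-}" ]; then
    # Plain mkfs.ext4 on the raw disk.
    time_cmd mkfs.ext4 -q "$DEV"
    # Same, but with discard disabled.
    time_cmd mkfs.ext4 -q -E nodiscard "$DEV"
    # The exact options taken from the error report.
    time_cmd mkfs.ext4 -q -E nodiscard -E root_owner=65534:65534 "$DEV"
fi

# Optional: round-trip latency to another node, only when PEER is set.
# For bandwidth, something like `iperf3 -c "$PEER"` (with `iperf3 -s` running
# on the peer) would be one option, assuming iperf3 is installed.
if [ -n "${PEER:-}" ]; then
    ping -c 10 "$PEER"
fi
```

Assuming the script is saved as `mkfs-timing.sh` (an arbitrary name), it could be run as `DEV=/dev/sdX PEER=node-2 sh mkfs-timing.sh`, with both values replaced by a real scratch disk and peer node. The resolution is whole seconds, which should be enough to tell whether a run gets anywhere near 45 seconds.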
Thanks!