Generic error message returned after trying to create a filesystem when the MDV is full

Open drckeefe opened this issue 5 years ago • 4 comments

stratis fs create p1 2259-buckets

Execution failed: stratisd failed to perform the operation that you requested. It returned the following information via the D-Bus: ERROR: other os error.

Setup:

  1. Make sure to have enough physical storage (10TB should be enough)
  2. stratis pool create p1 /dev/
  3. Create filesystems until "other os error" is displayed: for fs in $(seq 2 20000); do stratis fs create p1 $fs-buckets; done

drckeefe avatar Nov 20 '19 04:11 drckeefe

I found that using VDO to provide data optimization helps reduce the physical storage needed for this test to about 10GB.

Physical storage is /dev/sdb

  1. vdo create --name=vdo1 --device=/dev/sdb --vdoLogicalSize=20TB
  2. stratis pool create p1 /dev/mapper/vdo1
  3. Create about 2300 filesystems on this pool using the command in the first comment.

Side note: In this setup, fewer filesystems are needed to reproduce the issue because VDO is a 4K block device. Each filesystem has an associated JSON file, for which the MDV allocates a block. The MDV usable size is about 12MB. If a 512-byte block device is used, more JSON files fit in the MDV, and therefore more filesystems can be created per pool.

drckeefe avatar Nov 20 '19 04:11 drckeefe
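
The block arithmetic in the side note above can be sketched as a rough upper bound. This is only an illustration: it assumes "about 12MB" means 12 MiB, that every JSON file consumes exactly one block, and that nothing else uses space on the MDV. The function name `max_metadata_files` is hypothetical and is not part of stratisd.

```rust
/// Rough upper bound on the number of per-filesystem metadata files,
/// assuming each file consumes exactly one block of the MDV.
/// (Hypothetical helper for illustration; not a stratisd function.)
fn max_metadata_files(mdv_usable_bytes: u64, block_size_bytes: u64) -> u64 {
    mdv_usable_bytes / block_size_bytes
}

fn main() {
    // Assumption: "about 12MB" from the comment above, taken as 12 MiB.
    let mdv_usable: u64 = 12 * 1024 * 1024;

    for block_size in [512u64, 4096] {
        println!(
            "{}-byte blocks: at most {} metadata files",
            block_size,
            max_metadata_files(mdv_usable, block_size)
        );
    }
}
```

Under these assumptions the bound is 3072 files with 4K blocks and 24576 with 512-byte blocks. The roughly 2300 filesystems observed on VDO is lower than the 4K bound, presumably because the filesystem on the MDV needs some of that space for its own metadata.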

Version of stratisd and stratis-cli used: 2.0.0

drckeefe avatar Nov 20 '19 15:11 drckeefe

My guess is that we can search the code base for something like io::ErrorKind::Other to see if we are returning this anywhere. If we aren't, we'll have to do some digging to find out where this is being passed up to stratisd and try to detect it and give a better message.

jbaublitz avatar Nov 20 '19 15:11 jbaublitz

@drckeefe I think I found where we're running into issues. It should be one of three places: here, here, or here. My guess would be that it's the create_dir() call, given that it seems more likely to fail than either of the latter two (initialization or removing a file). Maybe we want to enrich all of the error messages, not just the one that's currently a problem, with slightly more information. Even a reference to the MDV or to the operation that failed would be helpful, instead of just why it failed.

jbaublitz avatar Nov 20 '19 15:11 jbaublitz
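
One way the enrichment suggested above could look is sketched below, using only the Rust standard library. This is not the stratisd implementation: the helper names `with_context` and `create_fs_dir`, the message format, and the example paths are all assumptions made for illustration. The idea is to keep the original `io::ErrorKind` while prefixing the message with the operation and path that failed.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Wrap an I/O error so the message names the failed operation and path.
/// The original error kind is preserved so callers can still match on it.
/// (Hypothetical helper; not a stratisd function.)
fn with_context(err: io::Error, operation: &str, path: &Path) -> io::Error {
    io::Error::new(
        err.kind(),
        format!("MDV: {} failed for {}: {}", operation, path.display(), err),
    )
}

/// Create a directory and attach context to any failure.
/// (Hypothetical wrapper around std::fs::create_dir.)
fn create_fs_dir(path: &Path) -> io::Result<()> {
    fs::create_dir(path).map_err(|e| with_context(e, "create_dir", path))
}

fn main() {
    // Illustrative path whose parent is assumed not to exist,
    // so create_dir is expected to fail here.
    let path = Path::new("/nonexistent-parent-for-example/child");
    if let Err(e) = create_fs_dir(path) {
        eprintln!("{}", e);
    }
}
```

With a wrapper like this, a message that would otherwise be only "other os error" would also name the MDV, the operation, and the path involved.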