riptide_cpp icon indicating copy to clipboard operation
riptide_cpp copied to clipboard

SDS read/write routines fail to detect errors

Open jack-pappas opened this issue 5 years ago • 1 comments

A user who had an issue with filesystem corruption mentioned that calling rt.load_sds(...) on a corrupted file succeeded but returned uninitialized/garbage data. The user is running riptide_cpp / riptable on Linux, and confirmed that if they use dd to read the file, the OS and filesystem detect that certain parts of the file (corresponding to corrupted blocks in the filesystem storage) are bad and return error codes.

I had a look at the SDS code, and it appears we're not checking whether calls to pread() or pwrite() succeed (for the Linux/Mac implementations of SDSReadFileChunk / SDSWriteFileChunk). The user made the corrupted file available through a samba share and I'm able to "successfully" read the corrupted file on Windows too. I haven't yet verified whether samba is passing through the error codes or not; I will follow up on this, but we should also verify that code consuming SDSReadFileChunk / SDSWriteFileChunk is checking for error conditions and bubbling them up as expected (on all platforms).

jack-pappas avatar Oct 19 '20 16:10 jack-pappas

The change merged from #11 adds detection of read/write failures on non-Windows platforms.

However, on all platforms, the SDS code should propagate these failures upwards, so we can eventually raise a Python error message and prevent returning anything from rt.load_sds(). (Ideally, we'd also keep track of which arrays were affected by I/O failures so we can provide the end-user with that information too.)

jack-pappas avatar Oct 19 '20 21:10 jack-pappas