
``rt.load_sds`` sometimes returns incorrect data for multi-section files + mask/filter

Open jack-pappas opened this issue 4 years ago • 0 comments

When calling rt.load_sds() with the filter= kwarg, I ran into an issue where it returns an array with the wrong shape and/or incorrect data.

I've been able to narrow the scope of the problem down to:

  • There's a multi-section SDS file (created via rt.Dataset.save(..., append=True)).
  • Some columns are not present in all sections of the file.
  • rt.load_sds(..., stack=True, filter=...) is used to load the data from the file. The include= kwarg also specifies a column subset (perhaps even just a single column), and the subset consists entirely of columns which aren't present in all sections of the file.
  • The mask (boolean array) being supplied to the filter= parameter of the call masks out all rows in at least one section of the file.
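
The conditions above can be modelled without riptable, which may help when building a test case. The following is only a sketch of the trigger conditions, not a reproduction of the bug itself: the helper names, section sizes, and column names are hypothetical, and it assumes the filter= mask is indexed over the concatenated rows of all sections.

```python
import numpy as np

def fully_masked_sections(section_lengths, mask):
    """Return indices of sections in which the mask selects no rows."""
    mask = np.asarray(mask, dtype=bool)
    if mask.size != sum(section_lengths):
        raise ValueError("mask length must equal the total stacked row count")
    # Offsets at which each section ends within the stacked row space.
    bounds = np.cumsum(section_lengths)[:-1]
    return [i for i, part in enumerate(np.split(mask, bounds)) if not part.any()]

def partially_present(include, columns_by_section):
    """True if every included column is missing from at least one section."""
    return all(
        any(col not in cols for cols in columns_by_section)
        for col in include
    )

# Hypothetical layout: three appended sections of 3, 2 and 4 rows,
# where column "b" is absent from the middle section.
section_lengths = [3, 2, 4]
columns_by_section = [{"a", "b"}, {"a"}, {"a", "b"}]
mask = np.array([True, False, True, False, False, True, True, False, True])

empty_sections = fully_masked_sections(section_lengths, mask)
expected_rows = int(mask.sum())  # length a correctly filtered column should have
triggers = partially_present(["b"], columns_by_section) and bool(empty_sections)
```

Here the mask removes every row of the second section and the only requested column is missing from that section, so this layout satisfies all of the listed conditions. A correct load would be expected to return `expected_rows` rows for the included column.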

I've been able to reproduce the issues in isolation by creating an extended version of the unit test I created to reproduce #138 -- I'll mark that as xfail then commit it to make the issue easier to dig into when someone has the time.

jack-pappas, Jun 03 '21 19:06