daru
Allow 'mask' dataframes and vectors into where clause
This issue is complementary to #209.
The where= clause currently accepts only a BoolArray as its first argument. However, it should also accept dataframes whose vector names match those of the calling dataframe, and make the assignment only where there is a true against an index. Something like a 'mask' dataframe.
For example, consider this dataframe:
df = Daru::DataFrame.new({
a: [1,2,3,4,5],
b: [400,500,200,1,5]
})
df_mask = Daru::DataFrame.new({
a: [true, false, false, true, false],
b: [false, false, true, true, false]
})
The following syntax:
df.where(df_mask) = -1000
...should yield the following dataframe:
Daru::DataFrame.new({
a: [-1000,2,3,-1000,5],
b: [400,500,-1000,-1000,5]
})
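To make the intended semantics concrete, here is a minimal sketch in plain Ruby. It does not use Daru's API: columns are modelled as a Hash of Symbol => Array, and `masked_assign` is a hypothetical helper name introduced only for illustration. It also assumes that a column missing from the mask should be left untouched, which the proposal above does not specify.

```ruby
# Plain-Ruby sketch of the proposed mask semantics (NOT Daru's API).
# `masked_assign` is a hypothetical helper; columns are Symbol => Array.
def masked_assign(data, mask, value)
  data.each_with_object({}) do |(name, column), result|
    # Assumption: a column absent from the mask is treated as all-false.
    flags = mask.fetch(name) { Array.new(column.size, false) }
    # Replace a cell only where the mask holds true at the same position.
    result[name] = column.each_with_index.map { |v, i| flags[i] ? value : v }
  end
end

data = { a: [1, 2, 3, 4, 5], b: [400, 500, 200, 1, 5] }
mask = { a: [true, false, false, true, false],
         b: [false, false, true, true, false] }

result = masked_assign(data, mask, -1000)
p result
```

The helper returns a new hash rather than mutating `data`; whether the real method should assign in place is a separate design question.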
I didn't understand this syntax: df.where(df_mask) = -1000
I think df.where(df_mask) must return:
Daru::DataFrame.new({
a: [1,NaN,NaN,4,NaN],
b: [NaN,NaN,200,1,NaN]
})
And then we would need some method to replace NaN with -1000, wouldn't we?
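The two-step idea (select, then fill) can be sketched in plain Ruby as follows. This is not Daru's API: `select_where` and `fill_missing` are hypothetical helper names, columns are modelled as a Hash of Symbol => Array, and nil stands in for NaN.

```ruby
# Plain-Ruby sketch of the two-step alternative (NOT Daru's API).
# `select_where` and `fill_missing` are hypothetical helpers; nil stands in for NaN.
def select_where(data, mask)
  data.to_h do |name, column|
    # Keep a value where the mask is true, otherwise mark it as missing.
    [name, column.each_with_index.map { |v, i| mask[name][i] ? v : nil }]
  end
end

def fill_missing(data, value)
  data.transform_values { |column| column.map { |v| v.nil? ? value : v } }
end

data = { a: [1, 2, 3, 4, 5], b: [400, 500, 200, 1, 5] }
mask = { a: [true, false, false, true, false],
         b: [false, false, true, true, false] }

selected = select_where(data, mask)        # values kept where the mask is true
filled   = fill_missing(selected, -1000)   # -1000 lands where the mask is false

# Inverting the mask first puts -1000 where the original mask is true.
inverted = mask.transform_values { |flags| flags.map { |f| !f } }
assigned = fill_missing(select_where(data, inverted), -1000)
```

Note that filling the missing cells of `selected` writes -1000 where the mask is false, which is the complement of the result described in the issue. Under this sketch, the mask has to be inverted before selecting in order to reproduce that result.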