
Allow 'mask' dataframes and vectors into where clause

Open v0dro opened this issue 8 years ago • 1 comments

This issue is complementary to #209.

The where= clause currently accepts only a BoolArray as its first argument. It should also accept a dataframe whose vector names match those of the calling dataframe, and make the assignment only at the positions where that dataframe holds true for a given index. Something like a 'mask' dataframe.

For example, consider this dataframe:

df  = Daru::DataFrame.new({
  a: [1,2,3,4,5],
  b: [400,500,200,1,5]
})

df_mask = Daru::DataFrame.new({
  a: [true, false, false, true, false],
  b: [false, false, true, true, false]
})

The following syntax:

df.where(df_mask) = -1000

...should yield the following dataframe:

Daru::DataFrame.new({
  a: [-1000,2,3,-1000,5],
  b: [400,500,-1000,-1000,5]
})
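
The intended semantics can be sketched in plain Ruby, without Daru. This is only an illustration under assumptions: hashes of arrays stand in for dataframes, and `mask_assign` is a hypothetical helper, not an existing Daru method.

```ruby
# Hypothetical helper (not part of Daru): returns a copy of `data` in which
# every position flagged true in `mask` is replaced by `value`.
# `data` and `mask` are hashes of equal-length arrays keyed by vector name.
def mask_assign(data, mask, value)
  data.each_with_object({}) do |(name, column), result|
    flags = mask.fetch(name) # the mask must have a vector with the same name
    result[name] = column.zip(flags).map { |v, flag| flag ? value : v }
  end
end

data = { a: [1, 2, 3, 4, 5], b: [400, 500, 200, 1, 5] }
mask = {
  a: [true, false, false, true, false],
  b: [false, false, true, true, false]
}

masked = mask_assign(data, mask, -1000)
```

Here `masked` holds the same values as the expected dataframe above, and `data` itself is left unchanged.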

v0dro avatar Aug 05 '16 07:08 v0dro

I don't understand this syntax: df.where(df_mask) = -1000

I think df.where(df_mask) should return:

Daru::DataFrame.new({
  a: [1,NaN,NaN,4,NaN],
  b: [NaN,NaN,200,1,NaN]
})

And then we would need some method to replace NaN with -1000, wouldn't we?
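
A rough plain-Ruby sketch of this two-step reading, again with hashes of arrays standing in for dataframes. Both `keep_where` and `fill_missing` are hypothetical names, not Daru methods, and `nil` is assumed as the missing-value marker in place of NaN.

```ruby
# Hypothetical helper: keep values where the mask is true, nil elsewhere.
def keep_where(data, mask)
  data.each_with_object({}) do |(name, column), result|
    flags = mask.fetch(name)
    result[name] = column.zip(flags).map { |v, flag| flag ? v : nil }
  end
end

# Hypothetical helper: replace every missing (nil) value with `value`.
def fill_missing(data, value)
  data.transform_values { |column| column.map { |v| v.nil? ? value : v } }
end

data = { a: [1, 2, 3, 4, 5], b: [400, 500, 200, 1, 5] }
mask = {
  a: [true, false, false, true, false],
  b: [false, false, true, true, false]
}

selected = keep_where(data, mask)
filled   = fill_missing(selected, -1000)
```

Under this reading, the fill step replaces the positions where the mask is false, which is the complement of the result in the original post; the mask would have to be inverted first to reproduce that result.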

Shekharrajak avatar Mar 19 '17 16:03 Shekharrajak