DataArrays.jl

What should be the output of logical operators for PooledDataArray objects?

Open heliosdrm opened this issue 9 years ago • 2 comments

Currently, the result is another PooledDataArray of Bool elements (possible levels: true or false).

I wonder whether that really makes sense, or whether it should just be a DataArray of Bool. Some operations, such as element-wise logical operators, do not work with PooledDataArray, so the current behaviour is problematic. A short example follows (Julia 0.3.4 for Windows 64-bit, DataArrays 0.2.14).

julia> x = @pdata(["A","A","B","B"])
4-element PooledDataArray{ASCIIString,Uint32,1}:
 "A"
 "A"
 "B"
 "B"

julia> y = @data([1,2,1,3])
4-element DataArray{Int64,1}:
 1
 2
 1
 3

julia> x .== "A"
4-element PooledDataArray{Bool,Uint32,1}:
  true
  true
 false
 false

julia> y .< 2
4-element DataArray{Bool,1}:
  true
 false
  true
 false

julia> (x .== "A") & (y .< 2)
ERROR: `&` has no method matching &(::Array{Bool,1}, ::PooledDataArray{Bool,Uint32,1})
 in & at D:\.julia\v0.3\DataArrays\src\operators.jl:543

heliosdrm, May 21 '15 08:05

I would say we should just special-case a ban on PooledDataArray{Bool}. If you wanted to use it as a factor, it already defines its own dummy representation. And it wastes storage, since a Bool costs less than any index into a pool of 2 values would.
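
A rough sketch of the storage point, using only Base Julia and no DataArrays calls. It assumes a modern Julia (1.x), where the index type shown in the output above is spelled `UInt32` rather than `Uint32`; the variable names `refs` and `flags` are made up for illustration and are not DataArrays internals.

```julia
# Assumption: Julia 1.x, Base only. `refs` and `flags` are illustrative names.

# Pooled layout: one UInt32 index per element into a 2-value pool [true, false]
refs = UInt32[1, 1, 2, 2]

# Plain layout: one Bool per element
flags = Bool[true, true, false, false]

# sizeof on an Array reports the bytes used by its element data
pooled_bytes = sizeof(refs)    # 4 elements * 4 bytes per UInt32
plain_bytes  = sizeof(flags)   # 4 elements * 1 byte per Bool

println("pooled indices: ", pooled_bytes, " bytes; plain Bool: ", plain_bytes, " bytes")
```

The pool itself adds a little more on top of the indices, so for a 2-level Bool pool the pooled layout is at least four times larger than a plain `Array{Bool}` with `UInt32` indices.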

johnmyleswhite, May 21 '15 15:05

Actually, the error here is rather that logical operators shouldn't return a PDA, but a standard Array{Bool} or a BitArray. We could ban PooledDataArray{Bool}, but that's a different issue.
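
For comparison, a minimal sketch of how the same expression behaves on plain arrays. It assumes a modern Julia (1.x) and Base only, so element-wise "and" is written `.&` instead of the `&` used on Julia 0.3; `xs`, `ys`, and `mask` are illustrative names and nothing here touches DataArrays.

```julia
# Assumption: Julia 1.x, Base only; plain Arrays stand in for the data above.
xs = ["A", "A", "B", "B"]
ys = [1, 2, 1, 3]

# Each element-wise comparison yields a BitArray of Bool,
# and the two results combine directly with element-wise `.&`
mask = (xs .== "A") .& (ys .< 2)

println(typeof(mask))
println(mask)
```

Here the comparison results are ordinary Boolean arrays, so the logical operator applies without any pooled type getting in the way.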

nalimilan, Sep 12 '15 13:09