
Arrays with Missing Hunks

Open r-barnes opened this issue 7 years ago • 11 comments

Is it feasible that DArray could support distributed arrays in which pieces are missing?

For instance, say I have the following:

X X    X
X X X X
X X

where each X represents a dense portion of the array and each blank represents a portion that can be modeled as a single constant value throughout.

In my mind, DArray has some sort of address table and, when it does a lookup, it could note the gap and return an appropriately formatted response using the constant value.
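The address-table idea can be sketched without any distribution at all. Everything below (the type GappyBlockMatrix and its fields) is hypothetical, purely to illustrate "look up the block; if it's a gap, return the constant":

```julia
# Hypothetical sketch: a block matrix whose missing blocks read as one constant.
struct GappyBlockMatrix{T} <: AbstractMatrix{T}
    blocks::Matrix{Union{Matrix{T}, Nothing}}  # the "address table"; nothing marks a gap
    blocksize::Int                             # each block is blocksize × blocksize
    fillvalue::T                               # the value modeled throughout the gaps
end

Base.size(A::GappyBlockMatrix) = size(A.blocks) .* A.blocksize

function Base.getindex(A::GappyBlockMatrix, i::Int, j::Int)
    blk = A.blocks[cld(i, A.blocksize), cld(j, A.blocksize)]
    blk === nothing && return A.fillvalue      # gap: no memory behind it
    return blk[mod1(i, A.blocksize), mod1(j, A.blocksize)]
end
```

With two dense blocks on the diagonal and gaps elsewhere, indexing into a gap returns the fill value while costing no storage for those regions.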

r-barnes avatar Oct 01 '18 18:10 r-barnes

Hm yes, that seems feasible. Sounds like you would want to use something like https://github.com/JuliaArrays/FillArrays.jl; the only problem is that we don't support heterogeneous chunk types. #170, as an example, is actually about ensuring that the chunk types are consistent by construction.

It might be feasible to promote to a joint supertype, but that might break other things, like similar. So yes: feasible in theory, but not yet in practice.

vchuravy avatar Oct 01 '18 18:10 vchuravy

Just for reference: for the data I'm currently working with it's the difference between 700 GB and 2 TB of RAM.
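For scale: a FillArrays block costs O(1) memory regardless of its nominal size, which is what makes savings like that possible. A minimal illustration (the sizes here are made up, not the dataset above):

```julia
using FillArrays

Z = Zeros(100_000, 100_000)   # 10^10 nominal elements; no buffer is allocated
Base.summarysize(Z) < 100      # true: only the shape is stored
Z[12_345, 678] == 0.0          # true: entries are produced lazily on lookup
```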

FillArrays looks pretty perfect. I wonder if you have suggestions for places in the codebase I could look if I wanted to try hacking on this?

r-barnes avatar Oct 01 '18 19:10 r-barnes

#170 modifies exactly the core constructor that you would want to modify. Once you have a DArray constructed the way you want, the challenge becomes making sure that all the operations you are interested in still work.

vchuravy avatar Oct 01 '18 19:10 vchuravy

I'll dig in.

There might be a cheap-hack approach, though: having multiple sections of the DArray reference the same underlying memory:

using Distributed
addprocs(3)                        # workers 2, 3, and 4
@everywhere using DistributedArrays
r1  = @spawnat 2 zeros(4, 4)
r2  = @spawnat 3 zeros(4, 4)
r3  = @spawnat 4 rand(4, 4)
ras = [r1 r2; r3 r3]               # r3 appears twice: two sections share one chunk
D   = DArray(ras)

Unfortunately, my experiments in #183 make me nervous about this.

r-barnes avatar Oct 01 '18 19:10 r-barnes

Hm, yes. I suspect there are quite a few functions written with the assumption that each chunk lives on a different processor, even though that clearly doesn't need to be assumed.

vchuravy avatar Oct 01 '18 20:10 vchuravy

The following seems to work for me:

using Distributed
addprocs(4)                        # workers 2 through 5
@everywhere using DistributedArrays
@everywhere using FillArrays
r1  = @spawnat 2 FillArrays.Zeros(4, 4)   # lazy all-zeros chunk, O(1) storage
r2  = @spawnat 3 FillArrays.Zeros(4, 4)
r3  = @spawnat 4 rand(4, 4)               # dense chunk
r5  = @spawnat 5 rand(4, 4)
ras = [r1 r3; r5 r2]
D   = DArray(ras)

# Container types differ per worker, but the eltype is uniformly Float64:
[@fetchfrom p typeof(D[:L]) for p in workers()]
[@fetchfrom p eltype(D[:L]) for p in workers()]

And this can be done even with #170 in place.

r-barnes avatar Oct 01 '18 21:10 r-barnes

Hm that fails for me on current master:

D   = DArray(ras)
ERROR: MethodError: no method matching Zeros{Float64,2,Tuple{Int64,Int64}}(::Array{Float64,2})
Stacktrace:
 [1] empty_localpart(::Type, ::Int64, ::Type) at /home/vchuravy/.julia/dev/DistributedArrays/src/darray.jl:66
 [2] macro expansion at ./task.jl:264 [inlined]
 [3] macro expansion at /home/vchuravy/.julia/dev/DistributedArrays/src/darray.jl:84 [inlined]
 [4] macro expansion at ./task.jl:244 [inlined]
 [5] DArray(::Tuple{Int64,Int64}, ::Array{Future,2}, ::Tuple{Int64,Int64}, ::Array{Int64,2}, ::Array{Tuple{UnitRange{Int64},UnitRange{Int64}},2}, ::Array{Array{Int64,1},1}) at /home/vchuravy/.julia/dev/DistributedArrays/src/darray.jl:82
 [6] macro expansion at ./task.jl:266 [inlined]
 [7] macro expansion at /home/vchuravy/.julia/dev/DistributedArrays/src/darray.jl:194 [inlined]
 [8] macro expansion at ./task.jl:244 [inlined]
 [9] DArray(::Array{Future,2}) at /home/vchuravy/.julia/dev/DistributedArrays/src/darray.jl:192
 [10] top-level scope at none:0

vchuravy avatar Oct 01 '18 21:10 vchuravy

Ah, it works on #175..., but we need a method to select the "right" array type. Right now it could happen that a different T is chosen as the primary eltype, and I think the process-local versions of this have diverged...

vchuravy avatar Oct 01 '18 21:10 vchuravy

I'm running [aaf54ef3] DistributedArrays v0.5.1 right now.

(Is there a good development cycle for working from the master branch of a local repo?)

Isn't eltype the type of the data held by an array? It seems as though that could be enforced to be constant across the entire DArray, even if the containers holding the elements differ.

r-barnes avatar Oct 01 '18 22:10 r-barnes

Is it possible to make the type more general to encompass the possibility of submatrices of different types?

r-barnes avatar Oct 01 '18 22:10 r-barnes

(Is there a good development cycle for working from the master branch of a local repo?)

I generally use ]dev for that, which can also take a local path.

Isn't eltype the type of the data held by an array? It seems as though that could be enforced constant across the entire DArray, even if the containers holding the elements differ.

Yes, the eltype being consistent is even more important; that is an invariant we need to uphold.
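In concrete terms, that invariant is exactly what the mixed-chunk construction above happens to preserve (FillArrays defaults to Float64 elements):

```julia
using FillArrays

a = Zeros(4, 4)   # container type Zeros{Float64, 2, …}
b = rand(4, 4)    # container type Matrix{Float64}

typeof(a) == typeof(b)   # false: the containers differ
eltype(a) == eltype(b)   # true: both Float64 — the invariant to uphold
```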

Is it possible to make the type more general to encompass the possibility of submatrices of different types?

There are two alternatives:

using FillArrays

a = Fill(1.0, 2, 2)   # lazy constant-valued array
b = zeros(2, 2)       # ordinary dense array

AT = Union{typeof(a), typeof(b)}

and then teach DArray to reason about this union (the immediate question being: what is similar(AT)?).

Or use promote_type(typeof(a), typeof(b)), which is AbstractArray{Float64, 2}, so that could work quite well.
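Side by side, the two alternatives look like this (a sketch; what similar should do in the Union case is precisely the open question above):

```julia
using FillArrays

a = Fill(1.0, 2, 2)
b = zeros(2, 2)

# Alternative 1: a Union chunk type; DArray would have to reason about it.
AT = Union{typeof(a), typeof(b)}
a isa AT && b isa AT                  # true, but similar(AT, ...) has no obvious meaning

# Alternative 2: promotion collapses to a common abstract supertype.
promote_type(typeof(a), typeof(b))    # AbstractMatrix{Float64}
```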

vchuravy avatar Oct 01 '18 22:10 vchuravy