foundation icon indicating copy to clipboard operation
foundation copied to clipboard

Add ArrayUArray data collection

Open vincenthz opened this issue 8 years ago • 9 comments

Possible definition of ArrayUArray is:

    import Foundation.Array
    data ArrayUArray ty = ArrayUArray (Array (UArray ty))

It should be more efficient than a simple array for many things like streaming, appending in the begin/middle/end

vincenthz avatar Jul 24 '16 12:07 vincenthz

some discussion notes:

Need to have an unboxed and boxed version, point similarities with deque, and document the characteristic differences with array / list (e.g. : o(chunks) for length, faster append ..)

vincenthz avatar Sep 03 '16 12:09 vincenthz

A handy picture which can help fixing this in memory (no pun intended):

deque

Apart from the begin and end pointers, this resemble what we are trying to achieve.

adinapoli avatar Sep 03 '16 13:09 adinapoli

@vincenthz Maybe this is a totally dumb idea (it came to me under the shower 😂 ) but I was thinking that we could enhance the information we store in the data structure to also provide the "iteration direction", this way:

-- True: iterate forward
-- False: iterate backward
type IterDirection = Bool
data ArrayUArray ty = ArrayUArray {-# UNPACK #-} !IterDirection (Array (UArray ty))

this way, reversing an ArrayUArray would be an O(1) operation, as easy as yielding a copy of the data structure with the bit flipped. Then all the internal functions would take into account this flag to decide how to scan the data structure for operations; iterating backward would be to go the last cell of the Array and then access each UArray from the end to the front.

Playing this into my head seems like it could work and it's also thread safe. Do you think this approach is gonna fly? Are you aware of any other data structure (in literature) using this approach?

adinapoli avatar Sep 04 '16 16:09 adinapoli

@adinapoli I've seen this done before, and even generalised to having an arbitrary permutation. I would declare it a failure - there is a reasonable cost to checking/branching on it, and it's not useful that often. In contrast slicing is very useful.

ndmitchell avatar Sep 04 '16 17:09 ndmitchell

That said, this was directly on an array - perhaps on an array of array it makes some sense - but I can't immediately see the win.

ndmitchell avatar Sep 04 '16 17:09 ndmitchell

Ah, perfect, thanks @ndmitchell ! I guess we should stick to the KISS principle with this one, seems the best course of action for now.

but I can't immediately see the win.

The win I was seeing here is that you would avoid reallocating the chunks. In case of a reverse, we would need to allocate new chunks and copy the content over (reversed), whereas with the flag we won't. But maybe that's a marginal gain shadowed by the cost of branching, as you pointed out!

The second benefit would be data sharing (as in most persistent data structures): Writing this, theoretically, should result in sharing all the underlying UArrays, and allocating only a new "cell" holding the reference to the very same arrays plus the flipped iterator bit:

let myA = A.fromList [1,2,3] :: ArrayUArray Int8
let myB = A.reverse myA

adinapoli avatar Sep 04 '16 17:09 adinapoli

Reverse tends to be a pretty rare operation, especially if you have a builder.

ndmitchell avatar Sep 04 '16 17:09 ndmitchell

(Vincent's here, no internet on my phone waiting in the airport) hmm quickly it doesn't seems very useful to have the direction of the array. I never seen this kind of thing be used, and open to be convinced, but I m a bit dubious what you'll get out of it.

One approach I just thought about reading this (and maybe equally crazy) would be a Reversed property that could be a Collection if the underlying type is a collection:

newtype Reversed a = Reversed a
instance Sequential a => Sequential (Reversed a) where ...

NicolasDP avatar Sep 04 '16 17:09 NicolasDP

No sweat guys, it seems we have reached a consensus then 😀 -- have a safe travel back!

adinapoli avatar Sep 04 '16 18:09 adinapoli