primitive icon indicating copy to clipboard operation
primitive copied to clipboard

indexArrayM docs are misleading

Open chessai opened this issue 7 years ago • 8 comments

the indexArrayM docs read:

-- | Monadically read a value from the immutable array at the given index.
-- This allows us to be strict in the array while remaining lazy in the read
-- element which is very useful for collective operations. Suppose we want to
-- copy an array. We could do something like this:
--
-- > copy marr arr ... = do ...
-- >                        writeArray marr i (indexArray arr i) ...
-- >                        ...
--
-- But since primitive arrays are lazy, the calls to 'indexArray' will not be
-- evaluated. Rather, @marr@ will be filled with thunks each of which would
-- retain a reference to @arr@. This is definitely not what we want!
--
-- With 'indexArrayM', we can instead write
--
-- > copy marr arr ... = do ...
-- >                        x <- indexArrayM arr i
-- >                        writeArray marr i x
-- >                        ...
--
-- Now, indexing is executed immediately although the returned element is
-- still not evaluated.
--

Something about this function, which I feel should be mentioned in the docs, is that one should usually (as i imagine, at least) wish to be strict in indexing into an array. Consider a situation in which one indexes into an array strictly, and no references to the array exist. Then, when GC occurs, the array will be GC'd. However, if the indexing is lazy, the indexing could potentially not have occured yet, and even if no references to the array exist, the array will not be GC'd. In most cases i think users would prefer indexArray## for this reason. The only Monads for which this isn't really a problem are those strict in the left argument of bind (e.g. IO, ST), since binding the result of indexArrayM to a computation therein will actually force the indexing to happen.

Another reason lazy indexing into an array might be undesirable is that indexing is so cheap for everything in this library, i can't see laziness being beneficial in most situations involving primitive. If people agree that this should be mentioned in the docs, I would like to put up a PR.

chessai avatar Oct 17 '18 15:10 chessai

Yes, perhaps we should mention that it's only really useful for monads with sufficiently strict >>= (which are the only ones that are strictly law-abiding). Indeed, it works just fine for a huge number of Monad instances. Not just IO and ST s, but also Maybe, Either e, [], Tree, etc. Indeed, it works well for strict transformations of those too: ContT, ReaderT, strict StateT, etc. It's only weird stuff like lazy StateT, lazy Writer, and lazy ST where it's the wrong thing.

treeowl avatar Oct 17 '18 16:10 treeowl

the Identity monad plus IndexM works for that example, unless I'm (quite likely!) confused by what you mean to communicate

i'm slighlty confused by what @chessai means to say: namely, the indexing IS strict, for indexM.

Is the concern here that a user might write code which has pointless thunks for reading the old array as the entries of the new array?

oh: if you want to do a lazy read of the thunk IN the older array and move it into the new one, but not force it if its not be forced yet, GOTCHA.

IndexM is defined as follows

indexArrayM :: Monad m => Array a -> Int -> m a
{-# INLINE indexArrayM #-}
indexArrayM arr (I# i#)
  = case indexArray# (array# arr) i# of (# x #) -> return x

so it does exactly whats needed here, so a user could write code with IndexM in the identity monad to accomplish this today. Or what am i missing?

cartazio avatar Oct 18 '18 23:10 cartazio

@cartazio, I believe he's referring to things like lazy ST, lazy StateT, lazy WriterT, and TardisT.

treeowl avatar Oct 19 '18 01:10 treeowl

@cartazio, I believe he's referring to things like lazy ST, lazy StateT, lazy WriterT, and TardisT.

Yes.

Indeed, it works just fine for a huge number of Monad instances. Not just IO and ST s, but also Maybe, Either e, [], Tree, etc. Indeed, it works well for strict transformations of those too: ContT, ReaderT, strict StateT, etc. It's only weird stuff like lazy StateT, lazy Writer, and lazy ST where it's the wrong thing.

Oh, yeah. Makes sense.

chessai avatar Oct 19 '18 17:10 chessai

i'm kinda thinking that maybe having consolidated "tips/tricks/gotchas" document we can reference from the read me / or something might be a good way to deal with these and other things

eg

  1. certain exotic/strange monads where theres reentrant control flow may violate assumptions needed for primmonad code to behave well, care's needed in those cases

  2. certain exotic/strange monads have a "weakly violating" subset of monad laws, aka "lazy bind", so there may be increased space usage in certain programs, "heres how to work around those"

and a few other things?

cartazio avatar Oct 19 '18 22:10 cartazio

@cartazio

so it does exactly whats needed here, so a user could write code with IndexM in the identity monad to accomplish this today. Or what am i missing?

At least, for the identity monad, indexArrayM is NOT strict, as the following example shows.

$ ghci
GHCi, version 8.4.4: http://www.haskell.org/ghc/  :? for help
Prelude> import Data.Primitive.Array
Prelude Data.Primitive.Array> import Data.Functor.Identity
Prelude Data.Primitive.Array Data.Functor.Identity>  (indexArrayM undefined undefined >>= (\x -> pure ())) :: Identity ()
Identity () -- Does not raise an error.

This is because the bind operator for the identity monad does not evaluate the left operand.

-- | @since 4.8.0.0
instance Monad Identity where
    m >>= k  = k (runIdentity m)

source

I think it is worth remarking in the document to avoid using the identity monad for indexArrayM.

autotaker avatar Jan 23 '19 06:01 autotaker

Well yeah, the identity monad in base / transformers is lazy. I have a strict identity package on hackage which shouldn’t have that issue.

It’s be more accurate / helpful to note this is a problem with any lazy Monad in general.

I’m more interested in improving documentation for this in vector rather than in primitive. But open to both. But either way, this quickly becomes a duplicated documentation problem, in which case I’d rather have it in transformers or base.

On Wed, Jan 23, 2019 at 1:00 AM autotaker [email protected] wrote:

@cartazio https://github.com/cartazio

so it does exactly whats needed here, so a user could write code with IndexM in the identity monad to accomplish this today. Or what am i missing?

At least, for the identity monad, indexArrayM is NOT strict, as the following example shows.

$ ghciGHCi, version 8.4.4: http://www.haskell.org/ghc/ :? for helpPrelude> import Data.Primitive.ArrayPrelude Data.Primitive.Array> import Data.Functor.IdentityPrelude Data.Primitive.Array Data.Functor.Identity> (indexArrayM undefined undefined >>= (\x -> pure ())) :: Identity ()Identity () -- Does not raise an error.

This is because the bind operator for the identity monad does not evaluate the left operand.

-- | @since 4.8.0.0instance Monad Identity where m >>= k = k (runIdentity m)

source http://hackage.haskell.org/package/base-4.11.0.0/docs/src/Data.Functor.Identity.html#line-113

I think it is worth remarking in the document to avoid using the identity monad for indexArrayM.

— You are receiving this because you were mentioned.

Reply to this email directly, view it on GitHub https://github.com/haskell/primitive/issues/213#issuecomment-456680934, or mute the thread https://github.com/notifications/unsubscribe-auth/AAAQwmXxqkuJMzGBiVhDzhfLKmmoOzNmks5vF_p5gaJpZM4XkNN2 .

cartazio avatar Jan 23 '19 18:01 cartazio

The crux of the matter and my concern is that this sort of documentation can create its own maintenance debt. And lots of dulication.

On Wed, Jan 23, 2019 at 1:05 PM Carter Schonwald [email protected] wrote:

Well yeah, the identity monad in base / transformers is lazy. I have a strict identity package on hackage which shouldn’t have that issue.

It’s be more accurate / helpful to note this is a problem with any lazy Monad in general.

I’m more interested in improving documentation for this in vector rather than in primitive. But open to both. But either way, this quickly becomes a duplicated documentation problem, in which case I’d rather have it in transformers or base.

On Wed, Jan 23, 2019 at 1:00 AM autotaker [email protected] wrote:

@cartazio https://github.com/cartazio

so it does exactly whats needed here, so a user could write code with IndexM in the identity monad to accomplish this today. Or what am i missing?

At least, for the identity monad, indexArrayM is NOT strict, as the following example shows.

$ ghciGHCi, version 8.4.4: http://www.haskell.org/ghc/ :? for helpPrelude> import Data.Primitive.ArrayPrelude Data.Primitive.Array> import Data.Functor.IdentityPrelude Data.Primitive.Array Data.Functor.Identity> (indexArrayM undefined undefined >>= (\x -> pure ())) :: Identity ()Identity () -- Does not raise an error.

This is because the bind operator for the identity monad does not evaluate the left operand.

-- | @since 4.8.0.0instance Monad Identity where m >>= k = k (runIdentity m)

source http://hackage.haskell.org/package/base-4.11.0.0/docs/src/Data.Functor.Identity.html#line-113

I think it is worth remarking in the document to avoid using the identity monad for indexArrayM.

— You are receiving this because you were mentioned.

Reply to this email directly, view it on GitHub https://github.com/haskell/primitive/issues/213#issuecomment-456680934, or mute the thread https://github.com/notifications/unsubscribe-auth/AAAQwmXxqkuJMzGBiVhDzhfLKmmoOzNmks5vF_p5gaJpZM4XkNN2 .

cartazio avatar Jan 23 '19 18:01 cartazio