vector
vector copied to clipboard
DV.Generic.Mutable.fill does not check for out of bounds access
So it should be either renamed as unsafeFill and/or bound checks added. It's possible this is not only one function with unsafe access.
module Fill where
import qualified Data.Vector.Unboxed.Mutable as MU
import qualified Data.Vector.Generic.Mutable as MG
import qualified Data.Vector.Fusion.Stream.Monadic as SS
main :: IO ()
main = do
mv <- MU.replicate 10 (1 :: Int)
MG.fill mv (SS.replicate 1000000 42)
return ()
[1 of 1] Compiling Fill ( /home/alexey/qqq/fill-sigsegv.hs, interpreted )
Ok, modules loaded: Fill.
*Fill> :main
Segmentation fault
So, I looked at this.
A possible issue is that I see no way to implement a safe fill
that doesn't do a lot of bounds checking. It just takes a Stream
, which has no size information, so we can't just check that we're big enough and call unsafeFill
. We have to check for each element.
I'll probably split it into two functions. It's listed as an "internal" function right now, so those uses will probably transition to unsafeFill
. Then there will be a user-visible fill
that does a bunch of arithmetic.
That's my plan, anyhow.
I think any function which could perform unchecked out of bounds access should be marked as unsafe to make clear that they can violate memory safety.
It probably make sense to expose unsafeFill as well.
considering the type of fill is
fill :: (PrimMonad m, MVector v a) => v (PrimState m) a -> Stream m a -> m (v (PrimState m) a)
,
and its under the header Internal operations
, i dont think this can be deemed a valid bug :)
If it is internal function it should be given sufficiently scary name or moved into some *.Internal module. Also no one reads all that small print about which function is internal and which is not. At least I don't.
Read the docs. It's under internal
We can't help you with literacy
On Monday, July 25, 2016, Aleksey Khudyakov [email protected] wrote:
If it is internal function it should be given sufficiently scary name or moved into some *.Internal module. Also no one reads all that small print about which function is internal and which is not. At least I don't.
— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/haskell/vector/issues/118#issuecomment-235040340, or mute the thread https://github.com/notifications/unsubscribe-auth/AAAQwjjVOrKTNayo8tEIWuEUF8ua62Xtks5qZP-xgaJpZM4Iy2UW .
Sorry. That was out of turn and inappropriate and mean . I let some unrelated irritation this afternoon make me a little jumpy and mean without due cause.
I hope you're having a lovely afternoon and I'm excited to collaborate.
On Monday, July 25, 2016, Carter Schonwald [email protected] wrote:
Read the docs. It's under internal
We can't help you with literacy
On Monday, July 25, 2016, Aleksey Khudyakov <[email protected] javascript:_e(%7B%7D,'cvml','[email protected]');> wrote:
If it is internal function it should be given sufficiently scary name or moved into some *.Internal module. Also no one reads all that small print about which function is internal and which is not. At least I don't.
— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/haskell/vector/issues/118#issuecomment-235040340, or mute the thread https://github.com/notifications/unsubscribe-auth/AAAQwjjVOrKTNayo8tEIWuEUF8ua62Xtks5qZP-xgaJpZM4Iy2UW .
No worries.
Still I think that exporting internal and public API from same module is wrong design. It's too easy to confuse then. Moreover same internal APIs contains stream/unstream & friends which are definitely part of public API since it's not possible to write fusible functions which operate on vector.
So in my opinion they should be either raised to status of public API or safely hidden in some Internal module.
Another point about public/internal API distinction: #70. Someone asked for API which would be marked internal so it makes sense to make it public.
I guess this does nicely line up with the discussion around exposing internals in internal modules of each respective representation. Eg unboxed vectors etc
On Tuesday, July 26, 2016, Aleksey Khudyakov [email protected] wrote:
Another point about public/internal API distinction: #70 https://github.com/haskell/vector/issues/70. Someone asked for API which would be marked internal so it makes sense to make it public.
— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/haskell/vector/issues/118#issuecomment-235402402, or mute the thread https://github.com/notifications/unsubscribe-auth/AAAQwhDfwcacvekFfovrdbMstq6j0jqIks5qZnUPgaJpZM4Iy2UW .
Their only common point is that they about exposing/not exposing internal API. Otherwise they're largely independent. So it's probably better to discuss them separately. So we have two questions.
- Should we expose and document streaming API? I think since
Data.Vector.Fusion.*
is exposed that API should be public too. - Should we export constructors for unboxed vectors and how if so? On one hand yes, since some operations couldn't be implemented efficiently without access to the. Counterpoint is their unsafety. For example
(V_2 100 [] []) ! 10
will lead to out of bounds access
But this is consequence of larger problem. Representation of unboxed vectors as structure of arrays pushes borders of language expressivity. (Look at all that CPP!). Adding another instance require a lot of boilerplate and it's not possible/difficult to abstract it away.