Idris2 icon indicating copy to clipboard operation
Idris2 copied to clipboard

contrib Data.Linear.Array is unsafe

Open KyleDavidE opened this issue 4 years ago • 15 comments

You can easily get a linear vector outside of it's linear context like so:

unsafeMakeLinArray : (size : Int) -> LinArray t
unsafeMakeLinArray size = newArray size (\x => x)

The only way around this that I can think of is requiring the callback function to return the vector in a tuple with the result

KyleDavidE avatar Aug 28 '20 19:08 KyleDavidE

A possible fix is to add an extra parameter à la Control.Monad.ST and to give newArray a rank 2 type so that the mutable array cannot be returned because that would make a variable escape its scope.

Something like:

  newArray : (size : Int) -> (1 _ : forall s. (1 _ : arr s t) -> a) -> a

AFAICT it is then impossible to implement copyArray in a safe manner but we are already using unsafePerformIO in that file so we may as well also do some unsafe coercions to implement the primitives of interest... :D

gallais avatar Aug 28 '20 19:08 gallais

The problem with that is that you could just return a dependent tuple with arr and s, The reason that this trick works in haskell is because the parameter is on both the monad and the variable references.

Re: new primatives. If we take the tuple approach I talked about it inspires a fairly clean way to do this with minimal occurences of unsafePerformIO. Make LinArray carry a %World with it, in which case the only call to unsafePerformIO needs to happen in newArray (this in turn could be made a lot safer by allowing %World to be parameterized)

KyleDavidE avatar Aug 29 '20 03:08 KyleDavidE

I'm not sure that %World is the right solution. Would something like ensure that it cannot leak?

newArray : (size : Int) -> (1 _ : {0 s : Type} -> (1 _ : arr s t) -> Res a (const (arr s t)) -> a

michaelmesser avatar Nov 04 '21 02:11 michaelmesser

I believe withString has the same flaw that newArray has: https://github.com/idris-lang/Idris2/blob/523c0a6d7823d2b9a614d4a30efb52da015f9367/src/Libraries/Data/String/Iterator.idr#L33

michaelmesser avatar Nov 07 '21 09:11 michaelmesser

There a two possible solutions that I can think of:

  • One array per "world":
    • newArray : (size : Int) -> (1 _ : {0 s : Type} -> (1 _ : arr s t) -> Res a (const (arr s t)) -> a
  • Many arrays per "world":
    • withWorld : ({0 s : Type} -> (1 _ : s) -> Res r (const s)) -> r
    • newArray : (1 _ : s) -> (size : Int) -> Res (arr s t) (const s)

There might be even more possible ways. I'm not sure which way is best. Also LPair might be needed to return stuff while preserving linearity in some places to nest worlds within worlds.

michaelmesser avatar Jan 18 '22 02:01 michaelmesser

I'm not sure that %World is the right solution. Would something like ensure that it cannot leak?

newArray : (size : Int) -> (1 _ : {0 s : Type} -> (1 _ : arr s t) -> Res a (const (arr s t)) -> a

FWIW, I used this idea here: https://git.sr.ht/~falsifian/s3d/tree/main/item/src/Data/Linear/SizedArray.idr

Except I left out the s because I couldn't see a way to break it even without it. Just having to return the array at the end the computation foiled my attempts to make it escape.

But my understanding of how all this works is pretty vague, so take it with a grain of salt.

falsifian avatar Jan 18 '22 04:01 falsifian

Why not follow what Linear Haskell does here and add an additional Ur argument?

-- This is the same thing as Control.Monad.Identity, but makes it clearer what
-- the purpose of this is: a unrestricted data type wrapper that allows the value
-- it wraps to be used with unrestricted multiplicity even if the `Ur` wrapper
-- itself has linear multiplicity.
--
-- In Linear Haskell this has to be a separate type because Linear Haskell
-- gives all arguments to a constructor linear multiplicities by default.
data Ur : Type -> Type where
  MkUr : a -> Ur a

newArray : (size : Int) -> (1 _ : (1 _ : arr t) -> Ur a) -> Ur a

The first Ur prevents the linear argument from escaping by requiring the result that is returned to be unrestricted (since anything that is passed into Ur must be unrestricted).

The second Ur makes newArray's linear multiplicity on the function actually usable. Otherwise there's no way to use the result of newArray in an unrestricted way if an argument with an outer linear notation is used.

newArray : (size : Int) -> (1 _ : (1 _ : arr t) -> Ur a) -> Ur a
                            ^ This is the multiplicity I'm talking about
                              It's effectively useless if newArray doesn't return an Ur

See this for example.

newArrayWithUselessOuterLinear : (size : Int) -> (1 _ : (1 _ : arr t) -> Ur a) -> a
newArrayWithUselessOuterLinear = ...

-- If we don't use the outer linear annotation we can reuse newArray's result just fine
nonLinearArgument : ((1 _ : arr t) -> Ur Int)  -> Int
nonLinearArgument f =
    let
        result = newArray0 0 f
    in
        result + result
        
-- But if we actually try to specify f as linear then this won't compile
-- That is we can't actually use the result of newArray in a non-linear way
linearArgument : (1 _ : (1 _ : arr t) -> Ur Int)  -> Int
linearArgument f =
    let
        result = newArray0 0 f
    in
        -- Fails to compile
        result + result

As a general rule of thumb if you want some function with linear arguments to be "terminal," i.e. the last function in a series of linear functions called that then gives us back a result to use elsewhere it should return an Ur.

This is why I'm also confused by size. Without an Ur output size seems overly restrictive (I can't see any reason why the result of size must also be used linearly unless you want to restrict copying of a machine integer which seems way too restrictive). On the other hand, I could see why you want read to be linear (to prevent any copy of an array element), but you'd want a non-linear version as well (i.e. one that returns Ur a and that is guaranteed to perform copying) otherwise you can't use read in newArray because if no copying of an array element is performed, then when the array is freed you have a dangling reference in the result returned by newArray that would be a seg fault (and note that Ur prevents this from happening, but also then prohibits usage of read).

msize and mread side-step this problem because Res has the equivalent of an Ur in its second argument.

changlinli avatar Feb 14 '22 02:02 changlinli

It turns out we recently added a Ur constructor recently: https://github.com/idris-lang/Idris2/blob/main/libs/linear/Data/Linear/Notation.idr#L24

Maybe it should be moved to prelude or base so that we can use it for linear arrays

andrevidela avatar Feb 15 '22 00:02 andrevidela

Or move Data.Linear.Array to the linear lib :angel:

gallais avatar Feb 15 '22 22:02 gallais

I finally figured out how to make it so you can nest calls to newArray and modify the outer array while modifying the inner array. I don't think that is possible with Ur. I don't think what I have is optimal yet. I'm still trying to understand linear types.


data Arr : Type -> Type -> Type where
1 newArray : (size : Int) -> (1 _ : {0 s : Type} -> (1 _ : Arr s t) -> LPair a (Arr s t)) -> a
write : (1 _ : Arr s t) -> Int -> t -> Res Bool (const (Arr s t))

test : ()
test = newArray {t=String} 3 $ \a =>
    let _ # a = write a 0 "foo"
        a = newArray {t=String} 1 $ \b =>
            let _ # a = write a 1 "bar"
                _ # b = write b 0 "foo"
            in  a # b
        _ # a = write a 2 "baz"
    in () # a

michaelmesser avatar Feb 16 '22 05:02 michaelmesser

@michaelmesser Top-level functions with 1 are cool, but inconsistent, see #818

buzden avatar Feb 16 '22 09:02 buzden

@michaelmesser as @buzden points out top-level functions marked with linear multiplicity have their own problems. Are there issues with just keeping both arrays open until the very end? An alternative is to create something like allocBeside from Linear Haskell (https://hackage.haskell.org/package/linear-base-0.1.0/docs/Data-Array-Mutable-Linear.html#v:allocBeside). Both of these do suffer from the drawback that you must deallocate all the arrays at once at the end instead of in the middle, but there are also expressivity problems with requiring that the function passed into newArray return an array. Forcing returning the array at the very end makes a lot of functions that would be otherwise agnostic as to whether the input is linear or not impossible to use in a newArray call.

And in general I think it's nice to have a consistent rule for people to follow when coming across new linear APIs that terminal operations return Ur and intermediate operations return non-Ur.

changlinli avatar Feb 19 '22 05:02 changlinli

My method works without the top level 1, but then you need the to put a 1 on the let:

import Data.Vect
import Data.DPair

data Arr : Type -> Type -> Type where
newArray : (size : Int) -> (1 _ : {0 s : Type} -> (1 _ : Arr s t) -> LPair a (Arr s t)) -> a
write : (1 _ : Arr s t) -> Int -> t -> Res Bool (const (Arr s t))

test : ()
test = newArray {t=String} 3 $ \a =>
    let _ # a = write a 0 "foo"
        1 a = newArray {t=String} 1 $ \b =>
            let _ # a = write a 1 "bar"
                _ # b = write b 0 "foo"
            in  a # b
        _ # a = write a 2 "baz"
    in () # a

I believe my method allows interweaving newArray with other linear things. I think Ur prevents interweaving.

michaelmesser avatar Feb 19 '22 07:02 michaelmesser

I'm a newbie, so I'm sorry if this is a silly question. I don't understand why

f : (_ : 1 a) -> a
f x = x

is well-typed. Rust, for example, would not allow the analogous code, because you're returning a unique object in a context where it is not guaranteed to be unique. That seems to be what's going on in the OP's example.

ellbur avatar May 20 '22 00:05 ellbur

OK, I think I may have answered my own question. Again, I'm sorry for not understanding things.

\x => x is allowed to have type (_ : 1 a) => a) because it is guaranteed to use the argument only once on the condition that the result is only used once. Thus, there is a secret 1 on the result whenever the argument is 1. So when you feed a 1 value to a (_ : 1 a) => a, the result is a 1 value. Ok. Got that now.

Then, the problem with e.g.,

makeArray : Int => t => Array t

is that you want to make the result 1, but you can only do that if there's a 1 argument. There's no way to make a 1 come out of nowhere.

That's why people are talking about a World type. The World type doesn't do anything, but it tricks the compiler into making the result 1. I see that now! Sorry for being slow.

But then I wonder... is there any reason the language couldn't allow you to produce a 1 value out of nowhere? I note you can do this in Rust, which is the only language I have experience with. But I don't know if that would cause some problems in Idris due to some difference between Idris and Rust.

ellbur avatar May 20 '22 00:05 ellbur