scala3 icon indicating copy to clipboard operation
scala3 copied to clipboard

Go back towards a pure capability system

Open odersky opened this issue 1 year ago • 23 comments

  • Drop @unbox annotation, don't replace with @use.
  • Charge deep capture set of arguments to cv.
  • Multiple fixes to make sure deep capture set computations and reach capability mappings are in sync.
  • Some fixes for handling paths.

The aim of this PR is to go to a stable system for handling reach capabilities in a classical way before embarking on a more effect-system like treatment where we distinguish use later from use now.

odersky avatar Oct 14 '24 13:10 odersky

Soundness issue found:

  import language.experimental.captureChecking

  // ok
  def runOps(ops: List[() => Unit]): Unit =
    ops.foreach(op => op())

  // ok
  def delayedRunOps(ops: List[() => Unit]): () ->{ops*} Unit =
    () => runOps(ops)

  // unsound: impure operation pretended pure
  def delayedRunOps1(ops: List[() => Unit]): () ->{} Unit =
    () =>
      val ops1 = ops
      runOps(ops1)

Linyxus avatar Oct 14 '24 14:10 Linyxus

Another soundness issue found:

import language.experimental.captureChecking

def runOps(ops: List[() => Unit]): Unit =
  ops.foreach(op => op())

def app[T, U](x: T, op: T => U): () ->{op} U =
  () => op(x)

def unsafeRunOps(ops: List[() => Unit]): () ->{} Unit =
  app[List[() ->{ops*} Unit], Unit](ops, runOps)

Linyxus avatar Oct 14 '24 14:10 Linyxus

Good points. The first soundness issue is in the deep capture set computation. We think the deep capture set of ops1 is ops1* since ops1 is a capability. But that's true only if all capabilities in the underlying set are shorter-lived than ops1. In this case, that's not the case, sothe deep capture set of ops1 should be ops*, not ops1*.

odersky avatar Oct 14 '24 14:10 odersky

The second unsoundness example has the same root issue.

odersky avatar Oct 14 '24 14:10 odersky

How is the second one sharing the root cause with the first? My idea behind that example was that given (x:T) where T is a type parameter with no upper bound we can show {x*} <: {}. Then this can be broken by instantiating T.

Linyxus avatar Oct 14 '24 15:10 Linyxus

@Linyxus Both examples are fixed by the same change to dcs computation. For the second example we get now:

-- [E007] Type Mismatch Error: delayedRunops2.scala:10:35 ----------------------
10 |  app[List[() ->{ops*} Unit], Unit](ops, runOps) // error
   |  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   |  Found:    () ->{ops*} Unit
   |  Required: () -> Unit
   |
   | longer explanation available when compiling with `-explain

odersky avatar Oct 14 '24 15:10 odersky

I see that in reachCanSubsumeDcs a special case is made for cap, but cap might well represent something outlives the reach. For instance, the following compiles again:

import language.experimental.captureChecking

def runOps(ops1: List[() => Unit]): Unit =
  ops1.foreach(op => op())

def app[T, U](x: T, op: T => U): () ->{op} U =
  () => op(x)

def unsafeRunOps(ops: List[() => Unit]): () ->{} Unit =
  app[List[() => Unit], Unit](ops, runOps)

Linyxus avatar Oct 14 '24 16:10 Linyxus

Also, the error in the second example depends on the fact that runOps is eta-expanded when passed to app on the last line. See the tree:

    def unsafeRunOps(ops: List[box () => Unit]): () -> Unit =
      app[List[box () ->{ops*} Unit], Unit](ops,
        {
          def $anonfun(ops1: List[box () ->{ops*} Unit]^?): Unit = runOps(ops1)
          closure($anonfun)
        }
      )

When marking-free ops1 inside $anonfun, its dcs contains ops* now.

But this does not really fix the problem. For instance, this can be by passed by wrapping runOps inside a class, so that no eta-expansion happens:

import language.experimental.captureChecking

def runOps(ops: List[() => Unit]): Unit =
  ops.foreach(op => op())

trait Consumer[-T]:
  def take(x: T): Unit

class OpsRunner extends Consumer[List[() => Unit]]:
  def take(x: List[() => Unit]) = runOps(x)

def app[T](x: T, op: Consumer[T]^): () ->{op} Unit =
  () => op.take(x)

def unsafeRunOps(ops: List[() => Unit]): () ->{} Unit =
  val x: Consumer[List[() ->{ops*} Unit]]^{} = new OpsRunner
  app[List[() ->{ops*} Unit]](ops, x)

The semantics stays the same, yet we bypass the capture checker.

Linyxus avatar Oct 14 '24 16:10 Linyxus

IMO, the root cause of the second issue is in:

def app[T, U](x: T, op: T => U): () ->{op} U =
  () => op(x)

The closure returned from this function now only captures op, yet it should capture both op and x*. When typing the body of the app function, we assumed that {x*} <: {}, since x's type is T whose upper bound is Any (and therefore elide x* from the capture set). In other words, we have {x*} <: dcs(T) and at this point we believed dcs(T) is empty. But, this contract is broken after instantiating T to, say, List[() ->{ops*} Unit].

Linyxus avatar Oct 14 '24 16:10 Linyxus

I see that in reachCanSubsumeDcs a special case is made for cap, but cap might well represent something outlives the reach. For instance, the following compiles again:

import language.experimental.captureChecking

def runOps(ops1: List[() => Unit]): Unit =
  ops1.foreach(op => op())

def app[T, U](x: T, op: T => U): () ->{op} U =
  () => op(x)

def unsafeRunOps(ops: List[() => Unit]): () ->{} Unit =
  app[List[() => Unit], Unit](ops, runOps)

In a similar manner, the first example bypasses the capture checker if we do:

import language.experimental.captureChecking

// ok
def runOps(ops: List[() => Unit]): Unit =
  ops.foreach(op => op())

// ok
def delayedRunOps(ops: List[() => Unit]): () ->{ops*} Unit =
  () => runOps(ops)

// unsound: impure operation pretended pure
def delayedRunOps1(ops: List[() => Unit]): () ->{} Unit =
  () =>
    val ops1: List[() => Unit] = ops
    runOps(ops1)

The change is on the second but last line: we added a type ascription to the val binding.

Linyxus avatar Oct 14 '24 16:10 Linyxus

he closure returned from this function now only captures op, yet it should capture both op and x*. When typing the body of the app function, we assumed that {x*} <: {}, since x's type is T whose upper bound is Any (and therefore elide x* from the capture set). In other words, we have {x*} <: dcs(T) and at this point we believed dcs(T) is empty. But, this contract is broken after instantiating T to, say, List[() ->{ops*} Unit].

Yes, but this will make a total mess of parametric reasoning. I fear this would be far too restrictive if we have to assume cap for every time variable reference. We need something better. But what?

odersky avatar Oct 14 '24 18:10 odersky

I think the kernel of the first issue:

def delayedRunOps1(ops: List[() => Unit]): () ->{} Unit =
  () =>
    val ops1: List[() ->{ops*} Unit] = ops
    runOps(ops1)

is that the reach capabilities term parameters and local definitions should be treated differently. What happened when we type runOps(ops1) is that {ops1} is charged to the environment, then gets dropped since it is local. If it was a reach capability of a parameter reference, e.g. ops*, dropping it when it goes out of the defining scope is fine: we get compensated when applying the function, where the dcs of the argument will be charged.

But it is different for local definitions. There is no compensation for them. When a reach capability of them, like ops1*, goes out of scope. We have to do something to approximate. Throwing it away is unsound.

One possibility is to widen it, similar to what is already implemented: we see that the type of ops1 is List[() ->{ops*} Unit], so we widen {ops1*} to {ops*}.

Linyxus avatar Oct 14 '24 21:10 Linyxus

And when the dcs of the type of ops1 contains cap, we should error, like in the revised snippet:

def delayedRunOps1(ops: List[() => Unit]): () ->{} Unit =
  () =>
    val ops1: List[() => Unit] = ops
    runOps(ops1)

We can understand it from a Capless perspective. List[() => Unit] is Exists C. List[() ->{C^} Unit]:

def delayedRunOps1(ops: List[() => Unit]): () ->{} Unit =
  () =>
    val (ops1*, ops1) = (ops : Exists C. List[() ->{C^} Unit])
      // this is an existential unpack; ops1* is the capture variable, and ops1 the term variable.
    runOps(ops1)

Then there is no way to approximate ops1*: we've already lose track. It represents "some capabilities" that exists. But we get no idea what they are.

Linyxus avatar Oct 14 '24 21:10 Linyxus

If we instead adopt the above mechanism in place of reachCanSubsumeDcs, it could also save us from the nightmare of parametric reasoning, since very often given x: X where X is a type parameter, x is a term parameter. This means that in most cases there is no need to widen x* (we either drop it or just keep it). Although we consider dcs(X) = {cap} which is imprecise and even invalid to widen into, we will just refrain from widening x* to dcs(X).

Linyxus avatar Oct 14 '24 21:10 Linyxus

I see how this would fix the delayedRunops1 example. But what about the example with app? That looks like the harder problem to me.

odersky avatar Oct 14 '24 23:10 odersky

For app, since now we assumed dcs(T) can be cap, it will not typecheck. It only typechecks if we do:

def app[T, U](x: T, op: T => U): () ->{x*,op} U =
  () => op(x)

and then unsafeRunOps will not typecheck either.

Linyxus avatar Oct 14 '24 23:10 Linyxus

Not sure. dcs[T] being cap looks very problematic. I was hoping we could avoid that, but it seems not.

odersky avatar Oct 14 '24 23:10 odersky

It could be possible to avoid that, if we either:

  • charge dcs(T) for any x[T], so that it is compensated;
  • or restrict the instances of type parameters to have an empty dcs, so that the contrat is not broken

Linyxus avatar Oct 14 '24 23:10 Linyxus

Here's a modified version of app which combines both of our difficulties: dcs of type variables and local definitions:

def app2[T, U](x: T, op: T => U): () ->{op} U =
  () => 
     def y: T = x
     op(y)

We could solve this by allowing regular type variables in capture sets. E.g. a sound version of app2 could be:

def app2[T, U](x: T, op: T => U): () ->{T^, op} U =
  () => 
     def y: T = x
     op(y)

Looks a bit like like a version of cc that we had early on.

odersky avatar Oct 15 '24 10:10 odersky

The idea is that dcs(T) = T^ instead of widening to underlying dcs of T's bound.

odersky avatar Oct 15 '24 10:10 odersky

When substituting U for T, we replace every occurrence of T^ in a capture set by dcs(U). This includes capture set variables as a special case since dcs(CapSet^{a, b, c}) is {a, b, c}.

odersky avatar Oct 15 '24 10:10 odersky

Another simplification could be to admit x* only if x is a parameter. That would avoid the awkwardness of having to avoid capture sets of locally defined vals.

odersky avatar Oct 15 '24 10:10 odersky

Given that in most cases, we write the following form:

def f[T](x: T): ...

where type parameters are followed by term ones and become the type of the term parameters, would it be possible to simply use x* in the places that T^ are used? So there is no need to introduce T^, which is the exact problem boxes are trying to avoid.

Linyxus avatar Oct 15 '24 13:10 Linyxus

Superseded by #21863

odersky avatar Nov 06 '24 13:11 odersky

Superseded by #21863

odersky avatar Nov 06 '24 16:11 odersky