RFCs icon indicating copy to clipboard operation
RFCs copied to clipboard

Nominal (abstract) types

Open garrigue opened this issue 6 years ago • 16 comments

A proposal for a uniform handling of abstract types w.r.t. GADTs and injectivity.

Cf. #9042, #5985.

garrigue avatar Nov 04 '19 15:11 garrigue

I have no idea about typing part but using annotations to influence type-checker seems wrong? I would expect this to be part of language syntax proper.

ygrek avatar Nov 21 '19 21:11 ygrek

I'm not sure what the best syntax would be for this one. In practice, the impact is more on exhaustiveness warnings than on typing itself. I.e., this is tracked by the type system, but [@@unboxed] is tracked that way too. (However, [@@nominal] can also generate more equations, so this is not only about warnings.)

An alternative syntax could be type t : new for anonymous nominals, and type t : new M.t for named nominals. It would be nicer to use paths rather than arbitrary strings. In the future, if we can get some notion of namespace in the language, one could require than when defining a fresh nominal type in an implementation, the prefix should be the name of the library.

garrigue avatar Feb 10 '20 16:02 garrigue

Copying my response from the original thread, as it still represents my thoughts on this proposal:

All in all, I'm still not convinced by this proposal as compared to your original external-style proposal. Here you are adding a new component to OCaml's notion of type, whereas before you where just adding a new form of representation. The former seems quite invasive semantically for quite a small benefit.

In particular, I'm think your proposal creates a new question that must be answered for each new type the user creates: "What should its name be?". I don't see a simple systematic way for users to answer that question and that worries me.

In my experience, GADTs are either proper GADTs -- in which case the indices are ordinary types and their compatibility doesn't really matter -- or they are poor-mans indexed types -- in which case some fake types are created to encode some form of data and the external version would be sufficient to control their compatibility. I have not yet seen real examples that need the more invasive version of your proposal.

Additionally I would like to add that I am strongly against using paths instead of arbitrary strings. Using path syntax for things that are not semantically paths and do not behave as paths (e.g. in substitution) is a recipe for confusion.

lpw25 avatar Feb 12 '20 09:02 lpw25

Thanks for putting back the feedback here. A short answer, because I'm on holidays.

  • I have got feedback from Nomadic labs, confirming that the inability to distinguish (non-external) abstract types in GADT indices is a problem for them, and they end up adding some assert false around. So this PR answers a real need, and the external proposal is not sufficient.
  • Feedback at Inria (@Drup among others) is very opposed to using strings. I personally have no strong opinion, but using paths with an intended meaning seems an intermediate option.

This problem has been left as is for more than 5 years, so I would like to eventually get to a solution. Of course, we still have time to refine things :)

garrigue avatar Feb 14 '20 05:02 garrigue

Concerning the syntax, my feeling is that the attribute syntax is closer to what we need here, as this information strengthens the type checker, but does not change the semantics of types. And we have now a number of interpreted attributes on type definitions in the compiler. The drawback is however that attributes are weak to typos. But this is not specific to this case.

garrigue avatar Feb 14 '20 06:02 garrigue

Could we see the examples from Nomadic labs? I think the details of such examples are important to the discussion.

It would also help if someone who thinks that these labels are somehow paths instead of strings could give their reasoning.

Fundamentally, I think this proposal needs to answer this question: how do users answer the question "What should its name be?" every time they create a new time?

This problem has been left as is for more than 5 years, so I would like to eventually get to a solution.

Maybe we should consider more thorough solutions like tracking whether types are freshly created or not. It probably "just" requires using a bunched context for modules to track sharing.

lpw25 avatar Feb 14 '20 10:02 lpw25

Sorry, going to Japan and back and all that delayed me a lot.

I thought a bit about one of the main applications, which is allowing to define witnesses for parametric abstract types, but I could actually find a workaround in that case. Whether it should be described as practical is another question.

Here is the code. The idea is to define a new abstract type Vec.t, which cannot be guaranteed to be injective, and then find a way to make type witnesses that allow to unwrap dynamic values of type int Vec.t Vec.t. Note that if we attempt to define witnesses in the usual way, i.e. type _ ty = Vec : 'a ty -> 'a Vec.t ty, the type definition would not be accepted.

module Vec : sig
  type +'a t
  val make : int -> (int -> 'a) -> 'a t
  val get : 'a t -> int -> 'a
end = struct
  type 'a t = Obj.t array
  let make = Obj.magic Array.init
  let get = Obj.magic Array.get
end

type ('a,'b) vec = Vec of 'a * 'b

type (_,_) ty =
  | Int : (int,int) ty
  | Fun : ('a,'c) ty * ('b,'d) ty -> ('a -> 'b,'c -> 'd) ty
  | Vec : ('a,'b) ty -> ('a Vec.t, ('a,'b) vec) ty

type dyn = Dyn : ('a,_) ty * 'a -> dyn

type (_,_) eq = Refl : ('a,'a) eq

let rec eq_ty : type a b c d.
    (a,c) ty -> (b,d) ty -> ((a,b) eq * (c,d) eq) option =
  fun t1 t2 -> match t1, t2 with
  | Int, Int -> Some (Refl, Refl)
  | Fun (t11, t12), Fun (t21, t22) ->
      begin match eq_ty t11 t21, eq_ty t12 t22 with
      | Some (Refl, Refl), Some (Refl, Refl) -> Some (Refl, Refl)
      | _ -> None
      end
  | Vec t1, Vec t2 ->
      begin match eq_ty t1 t2 with
      | Some (Refl, Refl) -> Some (Refl, Refl)
      | None -> None
      end
  | _ -> None

let undyn : type a b. (a,b) ty -> dyn -> a option =
  fun t1 (Dyn (t2, v)) ->
    match eq_ty t1 t2 with
    | Some (Refl, _) -> Some v
    | None -> None

let v = Vec.make 3 (fun n -> Vec.make n (fun m -> (m*n)))

let d = Dyn (Vec (Vec Int), v)

let Some v' = undyn (Vec (Vec Int)) d

This seems to work for this example, but I'm curious whether this solution is generic enough in practice.

garrigue avatar Apr 06 '20 15:04 garrigue

Here is another version using Oleg's workaround of reifying the equation. I think this is weaker, but I must look for another example.

type (_,_) eq = Refl : ('a,'a) eq

type _ ty =
  | Int : int ty
  | Fun : 'a ty * 'b ty -> ('a -> 'b) ty
  | Vec : 'a ty * ('a Vec.t, 'b) eq -> 'b ty

type dyn = Dyn : 'a ty * 'a -> dyn

let rec eq_ty : type a b. a ty -> b ty -> (a,b) eq option =
  fun t1 t2 -> match t1, t2 with
  | Int, Int -> Some Refl
  | Fun (t11, t12), Fun (t21, t22) ->
      begin match eq_ty t11 t21, eq_ty t12 t22 with
      | Some Refl, Some Refl -> Some Refl
      | _ -> None
      end
  | Vec (t1, Refl), Vec (t2, Refl) ->
      begin match eq_ty t1 t2 with
      | Some Refl -> Some Refl
      | None -> None
      end
  | _ -> None

let undyn : type a. a ty -> dyn -> a option =
  fun t1 (Dyn (t2, v)) ->
    match eq_ty t1 t2 with
    | Some Refl -> Some v
    | None -> None

let v = Vec.make 3 (fun n -> Vec.make n (fun m -> (m*n)))

let int_vec_vec = Vec (Vec (Int,Refl), Refl)

let d = Dyn (int_vec_vec, v)

let Some v' = undyn int_vec_vec d

garrigue avatar Apr 08 '20 17:04 garrigue

Following @gasche 's suggestion, I'm going to use ! for injective type parameters, in an independent prototype. Except if @Drup has a problem with this...

garrigue avatar Apr 24 '20 17:04 garrigue

PR#9500 proposes an alternative solution to this RFC, through injectivity annotations on type parameters. Its implementation is simpler, and it is finer grain, but this does not allow to distinguish abstract types which cannot have the same implementation. Even for injectivity, they are in some way complementary:

  • injectivity annotations can handle the injectivity of non-nominal types
  • nominal types do not only say that a type constructor is injective in all its parameters, but also that one can unify parameters of two nominal types independently of their type constructor (i.e. they are matchable, in Haskell speak IIRC).

garrigue avatar Apr 29 '20 13:04 garrigue

I've read through the ml workshop abstract and the relevant parts of the related Haskell work. My current thoughts on the proposal are as follows.

Firstly, I think that contractiveness is useful to express for non-generative types, especially polymorphic variants and object types. It is also closer to injectivity than generativity, being a property of a type parameter rather than a whole type constructor. I think it would be worth having e.g.:

type ^'a t

for a contractive type definition.

Matchability in OCaml amounts to being a datatype and it is a property of the whole type constructor -- essentially a different arrow in the kind language as in the Haskell work. Personally, I'd propose making that very explicit and having a syntax like:

datatype 'a t

to represent a matchable type. Since all datatype are injective and contractive that would imply ! and ^. Matchability is sufficient for inferring higher-kinded types and does not really have to relate to separability, so I'd avoid making the connection to the "nominal" part of your proposal used for separability.

I remain somewhat unconvinced of the need for expressing the separability of datatypes that have a definition. I can see the need for creating fresh separate types, and I'd support adding syntax like:

type 'a t = external "file handle"

where external "file handle" is a datatype "kind" that is known to be separate from any other kind of data type.

I'm not strongly against having the ability to attach a name to a type in order to force it to be separate, but I still haven't seen convincing examples of the need for it, and remain a bit worried about how a library author is to decide whether to give their type a name or not. Perhaps using attribute syntax to add the feature as you propose and leaving it marked "experimental" in the "extensions" section of the manual for a while would allow people to try it out and find some good examples for it.

lpw25 avatar Mar 08 '21 12:03 lpw25

Thank you for your feedback. This is indeed the alternative approach I had in mind. I am a bit concerned that providing each feature independently would be both painful to track and confusing to the user. On the other hand, it is correct that separability being optional in the proposal, it may make sense to view it as an independent attribute (it might even give rise to a stronger handling, such as saying that a type is not something, for instance not a float). Contractiveness is a bit more dubious, as it is a direct consequence of both matchability and separability, and handling it independently for each argument seems an overkill.

garrigue avatar Mar 09 '21 07:03 garrigue

separability being optional in the proposal, it may make sense to view it as an independent attribute (it might even give rise to a stronger handling, such as saying that a type is not something, for instance not a float)

Yeah, there seems to be quite a large potential design space for ways of controlling and exposing separability between types.

Contractiveness is a bit more dubious, as it is a direct consequence of both matchability and separability, and handling it independently for each argument seems an overkill.

It's true that matchability implies that a type is a datatype which implies that it is contractive. However, not all contractive types are matchable. In particular, polymorphic variants and objects are contractive but not matchable, and I think people using these types are much more likely to run into problems with contractiveness anyway. Certainly I have tried in the past to use functors to construct recursive polymorphic variant types, and found the contractiveness restrictions to be a real pain for that.

I think that as long as declaring a type matchable implicitly declares it contractive (and injective) in all its parameters then the contractiveness annotations (and indeed the injectivity annotations) will only be needed very rarely and will not confuse people too much.

lpw25 avatar Mar 09 '21 10:03 lpw25

Minor point: instead of having a zoo of weird non-variance signs that we always never see in code and are not self-discoverable (good luck googling for ^'a), I think that this is a case where some attribute/extension grammar with human words could make it much easier for the readers.

gasche avatar Mar 09 '21 10:03 gasche

I think that is a reasonable argument for using a (contextual?) keyword rather than a symbol, but I dislike using attributes and extension nodes for genuine language features. Half the point of attributes and extension nodes was to make it clear when you were using an extension or extra-linguistic feature.

lpw25 avatar Mar 09 '21 10:03 lpw25

I agree with @lpw25 , we should not abuse attributes for actual linguistic features. I think it's ok for backend-related stuff, but the use of attributes even for constructor unboxing is already in the gray-area (removing one annotation can break type checking).

We have used several times (cryptic) operator symbols to avoid introducing actual keywords (e.g. for method override, shadowing open). Adding reserved keywords is always tricky, but one could rely on identifiers with some custom syntactic delimiters (e.g. method(override) foo = ... instead of method! foo = ..). Such modifiers would be lexed as LIDENTs, and analysed in the parser (i.e. a clean representation in the Parsetree), yielding syntax errors for unknown modifiers. We could do something like that for type definitions as well.

alainfrisch avatar Mar 09 '21 16:03 alainfrisch