RFCs icon indicating copy to clipboard operation
RFCs copied to clipboard

`include functor`

Open ccasin opened this issue 1 year ago • 45 comments

This is a proposal for a new structure and signature item form, include functor.

Rendered version

(Thanks to @OlivierNicole and @goldfirere for help preparing this RFC)

ccasin avatar Jul 10 '24 15:07 ccasin

In a similar way could we also add a module functor E = F

This might be useful in some other cases and does not seem much work in addition to this feature. The use case that I have in mind would be linked to modular implicits :

module type Eq = sig
    type t
    val eq : t -> t -> bool
end

module type Ord = sig
    type t
    val cmp : t -> t -> int
    module E : Eq with type t = t
end

module F (X : sig type t val ord : t -> t -> int) : Ord with type t = X.t = body

module OInt = struct
    type t = int
    let cmp = Int.compare
    module functor E = F
end

This allows for the OInt module (an instance of Ord) to automatically implement an equality that respects ordering.

samsa1 avatar Jul 11 '24 07:07 samsa1

In a similar way could we also add a module functor E = F

Indeed. I agree this is a reasonable feature and not much more work. It has occasionally been requested by users of include functor at Jane Street. We've held off on implementing it, but not for any particularly principled reason (mainly: I think it will get a little less use, and I think the meaning of the syntax is slightly less intuitive) but I'm very happy to add it if there is consensus it is desirable.

ccasin avatar Jul 11 '24 12:07 ccasin

This is a pattern that is actually quite common in the flambda2 code base, in particular

module T = struct
  module M = struct
    type t = ...
    let compare = ...
  end
  include M
  module Set = Set.Make(M)
end

Which would allow to get rid of that spurious M module

module T = struct
  type t = ...
  let compare = ...
  module functor Set = Set.Make
end

One such example in the upstream compiler code base: https://github.com/ocaml/ocaml/blob/0d18e1287e49e92cf37824559cda5c09a2438b32/typing/shape.ml#L103-L135

chambart avatar Jul 11 '24 12:07 chambart

Have you considered the alternative of giving a name (say "_") to the current module prefix (i.e. "the module up to this point"), so that instead of

module M = struct
  type t = ...
  [@@deriving compare, sexp]
  include functor Comparable.Make
end

you'd write

module M = struct
  type t = ...
  [@@deriving compare, sexp]
  include Comparable.Make(_)
end

?

With that alternative design it'd be possible to refer to the module prefix in arbitrary module expressions rather than always passing it as the argument of a single-parameter functor, so you could also write things like:

include F(_)(X)

and

module type of _

and

open F(_)

and

module E = F(_)

and

include S with module type T = _

and perhaps even

type t = F(_).t

etc.

yallop avatar Jul 11 '24 13:07 yallop

whose parameter can be "filled in" with the previous contents of the module

Just to be sure : do the components used to "fill in" the parameter need to be defined from the current structure, or do they only need to be visible at this point (coming from a surrounding structure or from some open)?

alainfrisch avatar Jul 11 '24 13:07 alainfrisch

We're not very fond of the underscore, so @Ekdohibs suggests module as of then and @chambart proposes virtual module downto begin

lthls avatar Jul 11 '24 13:07 lthls

A better argument against using underscore to talk about the beginning of the module is that current work on modular implicits. We are currently thinking of defining _ as an arbitrary module expression that should be inferred but this would be incompatible with the proposal of @yallop. However I think that his idea is more expressive and should be discussed but with another name in mind.

samsa1 avatar Jul 11 '24 13:07 samsa1

Just to be sure : do the components used to "fill in" the parameter need to be defined from the current structure, or do they only need to be visible at this point (coming from a surrounding structure or from some open)?

They need to be defined from the current structure. One could imagine doing either thing, but this has a nice clear rule, makes it less likely refactorings will cause errors due to what is in scope for include functor changing, and simplifies the implementation.

ccasin avatar Jul 11 '24 14:07 ccasin

For syntax, we could use use just plain old module. Examples:

include F(module)
module M = F(module)

Or we could be even bolder and use a symbol:

include F(^^)
module M = F(^^)

I think any syntax should not be available in paths.

goldfirere avatar Jul 11 '24 15:07 goldfirere

Personally, I dislike both the:

module functor E = F

form and mechanisms based on a name for the contents of the current module, and would prefer to push people towards include functor instead. That is because I think it is better to have a name for this interface:

sig
  type t
  module Set : Set.S with type elt = t
end

and use that, rather than having each user choose the name for their set module.

include functor supports that style very naturally. In the Set module you can define:

module type MixS = (X : OrderedType) -> sig module Set : S with type elt = X.t end
module Mix : MixS

and then you can write:

module Foo : sig
  type t
  include functor Set.MixS
end = struct
  type t = [...]
  let compare = [...]
  include functor Set.Mix
end

lpw25 avatar Jul 15 '24 09:07 lpw25

If I understand correctly, this use of include functor in signatures amounts to treating functor types as parameterized signatures. It certainly makes the example look elegant, but it doesn't really seem harmonious with the way that module types work in the rest of the language.

yallop avatar Jul 23 '24 07:07 yallop

That is one way to look at it and it does look different from other uses of module types in that perspective. An alternative though is to consider include S to mean "extend the module type as it would be if it had include M done to it for some unknown M : S, and then treat include functor S in the same way: extend the module type as it would be if it had include functor M done to it for some unknown M : S. I think that is a quite natural way for users to think about it, and there isn't any other obvious way to interpret include functor S in a signature.

lpw25 avatar Jul 23 '24 12:07 lpw25

If I understand correctly, include functor for modules (not for signatures) is needed when the functor does not re-export its parameter. I.e, the pattern

module F = functor (Y:S) -> struct (* ... *) end  
module Foo = struct
   (* code *)
   include functor F
end

could be replaced by changing F to re-export its argument and putting the application at top-level :

module F = functor (Y:S) -> struct include Y (* ... *) end
module Foo = F(struct 
   (* code *)
end)

Overall, could the role of include functor be taken by having a special mechanism to apply and include argument in the result ? A downside I can see is that it puts the functor application at the beginning of the struct, which has not the same flow as putting include functor at the relevant point inside the structure.

clementblaudeau avatar Aug 20 '24 15:08 clementblaudeau

If I understand correctly, include functor for modules (not for signatures) is needed when the functor does not re-export its parameter. I.e, the pattern

module F = functor (Y:S) -> struct (* ... *) end  
module Foo = struct
   (* code *)
   include functor F
end

could be replaced by changing F to re-export its argument and putting the application at top-level :

module F = functor (Y:S) -> struct include Y (* ... *) end
module Foo = F(struct 
   (* code *)
end)

Overall, could the role of include functor be taken by having a special mechanism to apply and include argument in the result ? A downside I can see is that it puts the functor application at the beginning of the struct, which has not the same flow as putting include functor at the relevant point inside the structure.

I think this is a reasonable idea, but doesn't quite offer the full convenience of include functor. In this example from the RFC:

module M = struct
  module T = struct
    type t = ...
    [@@deriving compare, sexp]
  end

  include T
  include Comparable.Make(T)
end

I think your proposal saves the include T, but not the need to define T in the first place when its only purpose is to be a parameter.

ccasin avatar Aug 22 '24 12:08 ccasin

It can save T by doing a functor call directly on the unnamed structure. To be more precise: What I had in mind was some new construct to mark functor applications where the functor parameter should be included in the result of the application, something like F [reexport] (M) which is syntactic sugar for

struct
  open (struct module X = M end)
  include X
  include F(X)
end

Then the example of the RFC would become:

module M = Comparable.Make [reexport] (struct
  type t = ...
  [@@ deriving compare, sexp]
end)

I think it provides more or less the same functionality. An upside is that it does not depend on a specific position in the code like include functor does, which I think might be a bit brittle. A downside is that it puts the functor application at the top, not in the flow of the definition of the module like include functor does. I'm not sure how it would support patterns where there are several include functors separated by other bindings, like :

module M = struct
  type t = ...
  include functor F
  let x = 42
  include functor G 
end

clementblaudeau avatar Aug 22 '24 15:08 clementblaudeau

Actually a key issue with the re-export pattern I was suggesting is that the functor can only re-export the field indicated in its parameter signature, which seems much more restricted than include functor, for which all fields of the current structure are kept.

clementblaudeau avatar Sep 09 '24 09:09 clementblaudeau

This has sat for a while and there is some unresolved debate about the best design. @Octachron, could I request that the language committee take this RFC up? Thanks!

ccasin avatar Feb 21 '25 14:02 ccasin

Having the committee relaunch the debate sounds sensible to me, I will keep you updated once we have a shepherd.

Octachron avatar Feb 21 '25 14:02 Octachron

The proposal suggests to allow include functor FT in signatures, where FT is the type/signature of a functor, but it also suggests that naming functor types is uncommon (this is also my experience), and that the form include functor (module type of F) may be used for a functor F.

Question: have you considered having include functor F in signatures, and maybe something like include functor type FT in addition for the more complex, less common form?

gasche avatar Feb 21 '25 15:02 gasche

Hello,

Here is my two cents.

Regarding the proposed mechanism, I believe that it is clearly useful. I have commonly felt the need to say "please apply the functor Foo to the types and values that I have defined above".

Regarding the concrete syntax, I rather dislike include functor Foo, because it does not make intuitive sense; include is normally applied to a structure, not to a functor. As Jeremy pointed out on July 11, what this construct does is really include Foo(this), where this denotes the content of the (as yet incomplete) current module, and include is the usual include construct.

This suggests that perhaps the new feature that is really needed is not include functor, but is actually this (or whatever concrete name one chooses for this concept). This construct seems in fact more powerful than the proposed construct, as it allows writing (for example) include Foo(A)(this)(C), or include Foo(struct include this let x = 0 end), whereas include functor cannot easily express these forms, I believe.

fpottier avatar Feb 21 '25 15:02 fpottier

I find that @yallop's suggestion of naming the "current module so far" has merit, at least when equipped with the decent syntax module proposed by @goldfirere. In modules, one would write include F(module) for what the RFC proposes as include functor F. But how would that work in signature? Would include functor FT be replaced by include module type of FT(module), where module would be understood as "some module whose signature is the current signature so far"? (The current signature is, of course, module type of module.)

@lpw25 if I understand correctly, your argument is that you prefer to extend modules by inclusion, rather than by naming new submodules, because this tends to encourage a coherent style where the same module names are reused consistently. So you like include F(module) better than module Sub = F(module), and you appreciate that the less expressive include functor F syntax can express only the former and not the latter.

From a distance, I'm not convinced:

  1. Extension-by-inclusion can be recommended in style guidelines, it is not clear to me that it is such an important idea that it needs to be enshrined in the language constructs -- as long as it is easy to express, which is the need here.
  2. There are other mechanisms to encourage coherent naming styles for submodules, for example transparent ascription. Your example becomes even more natural to me if you name the HasSet signature instead of the MixS signature that produces this interface, and you ascribe it to your module.

On the other hand, one could argue that the following ought to work, which has comparable expressivity to @yallop's proposal, and could be presented as easier to understand than a magical module keyword:

include functor (functor Self => F(Self)(X))

(This doesn't quite work today because there is no syntax nor bidirectional-propagation mechanism to have the signature of Self inferred from the context. But it sounds doable from a distance.)

gasche avatar Feb 21 '25 15:02 gasche

@fpottier: include Foo(struct include this let x = 0 end): wait a minute, clearly this is the empty module in this context, right? You probably meant this/2 ;-)

gasche avatar Feb 21 '25 15:02 gasche

@fpottier: include Foo(struct include this let x = 0 end): wait a minute, clearly this is the empty module in this context, right? You probably meant this/2 ;-)

In fact, rather than a keyword for the name, we might allow naming the current module as we do for objects... struct (foobar) type t = ... let x = ... include functor F(foobar) end, which would be more consistent.

didierremy avatar Feb 21 '25 16:02 didierremy

But how would that work in signature?

What problem do you see with signatures? Can't we just use this (or module or whatever concrete syntax you prefer) to designate "the current (incomplete) signature"?

fpottier avatar Feb 21 '25 16:02 fpottier

@fpottier: include Foo(struct include this let x = 0 end): wait a minute, clearly this is the empty module in this context, right? You probably meant this/2 ;-)

You got me. Indeed, in the presence of nested modules, there might conceivably be a need for multiple levels of this.

It is a bit unsettling that this is not synonymous with struct include this end. I believe that currently if M is a structure then M is synonymous with struct include M end (I mean, when used inside a module expression, not a module type).

fpottier avatar Feb 21 '25 16:02 fpottier

In fact, rather than a keyword for the name, we might allow naming the current module

This is an intriguing idea, but I don't think that it is acceptable... the name of the current module would look like a variable, as its name is chosen by the user, but it is not a variable in the usual sense, since its meaning changes every time a new definition is made in the current module.

fpottier avatar Feb 21 '25 16:02 fpottier

This is an intriguing idea, but I don't think that it is acceptable... the name of the current module would look like a variable, as its name is chosen by the user, but it is not a variable in the usual sense, since its meaning changes every time a new definition is made in the current module.

Sure the meaning changes over time, but this is exactly the same problem with this as a keyword: it suggests the current module .... at the end of the struct as one is used to with objects, but it only means up to the current point. If you wish to avoid the ambiguity, the keyword should not be this but the-current-value-of-this.

didierremy avatar Feb 21 '25 16:02 didierremy

I also thought of struct (Self) ... end as for objects, and I don't see a big issue with the fact that the name refers to the fraction of the module that has already been defined above. But I'm not sure this construction works in signature context: in class type foo = object ('s) ... end, 's is a class type, the natural transposition is to have sig (Self) ... end have a module type Self, but this is not what we want for include functor Foo in signatures, which must translate to include module type of Foo(Self) for a module Self.

gasche avatar Feb 21 '25 17:02 gasche

Question: have you considered having include functor F in signatures, and maybe something like include functor type FT in addition for the more complex, less common form?

Do I understand correctly that your idea is for include functor F to implicitly take a module type of, so that include functor F means the same as include functor type (module type of F)?

I think that's a reasonable idea. I tend to prefer the current form because I think it encourages people to give a real name to these functor types rather than relying on module type of. I expect this to be better behaved in general, considering the various issues with that construct (e.g., ocaml/ocaml#13765 to pick a random recent unrelated example). But I could be convinced!

ccasin avatar Feb 21 '25 17:02 ccasin

(Slightly cheeky suggestion.) If folks are worried that module changes meaning over time (I'm not), we could change it to be as above:

include F(as above)
module M = F(as above)

Note that as is already a keyword, unused in module syntax. The above would be a required next token, though we now have unbounded syntactic space, so we could as outer above or as outer outer above or some such. We could also require a word module at the beginning:

include F(module as above)

goldfirere avatar Feb 21 '25 18:02 goldfirere