ceps icon indicating copy to clipboard operation
ceps copied to clipboard

Threading the state functionally

Open herbelin opened this issue 7 years ago • 14 comments

See rendered.

herbelin avatar Oct 05 '18 17:10 herbelin

Good question. This code is supposed to be called at initialization time of the executable but initialization can only work by side effect. So the effect of this command is to add a field of type 'a to a ref containing the initial value of the collection of summaries. Then, the main loop will start with the initial state located in this reference.

This lets me think that, in order for plugins to add new summaries, we shall have to save the summaries, load the plugin, and read again the summaries.

[Edited to complete unfinished sentence.]

herbelin avatar Oct 06 '18 17:10 herbelin

@herbelin your comment got cut

This lets me think that, in order for plugins to add new summaries, m

SkySkimmer avatar Oct 06 '18 17:10 SkySkimmer

Sorry, this was mistakenly sent before being complete. I edited the comment to complete the sentence.

herbelin avatar Oct 06 '18 17:10 herbelin

Using individual states would be more informative about what a given function is exactly requiring, but there would similarly be a cost.

[...]

Ideally I would like to reach a point between [passing a blob to every function which gets projected into a structured type at the leaves] and [making every function perfectly precise in its input such that the API is atomized and calling basic functions needs to involve some projection path from the surrounding state]. This would be done by making some structure in how we use the state apparent, such that instead of having the state in a function be a adhoc tuple it would be something somehow sensible, even if parts don't get used. eg even we we separate Environ.env in subtypes I think we won't end up passing just the env_inductives part to functions like check_case_info even though that's all it uses. Rather we would end up with some type of "global environment" and some "local environment" (just rels? whatever).

Does that make sense?

But why would we need to get static guarantees?

For one, the more precise the input type is, the easier it becomes to write ml tests (unit tests). Currently we have basically none so it's unclear how far we are from making it possible to have useful ones though. Also as I mention above I hope to have some structure of how we use the input appear which would make it easier to understand the code.

A second approach is to accept that using the state while in the rebuild_method is legitimate.

Using the environment in rebuild is the point, otherwise it could be done in discharge.

Yes, meeting at some time, say in November, would be nice. @SkySkimmer, will you come to Scalp?

I don't know what that is. I can probably make it to a meeting even if I have to make the trip specifically to work on Coq state though ;)

SkySkimmer avatar Oct 08 '18 12:10 SkySkimmer

One thing we could do is to first move from global references to a functionally passed blob, then in a second step refine types (which may be easier to do incrementally than the state passing itself). This approach involves a lot of API churn though.

SkySkimmer avatar Oct 08 '18 12:10 SkySkimmer

I was wondering if we could use ocaml objects and subtyping for this stuff but I have no experience with them so I played around a bit

(* this code compiles (although all it does is assert false) *)

module Main = struct
type env

let is_rel_internal : env -> int -> bool = assert false (* represents accessing the list *)

let is_rel (*: < env:env; .. > -> int -> bool*) = fun e x -> is_rel_internal e#env x
(* doesn't need annotation in the .ml, the type in the comment can be used in the .mli *)

type coercions
let is_coe : < coercions:coercions; .. > -> string -> bool = assert false

(* some pretend adhoc function *)
let foo env x =
  let s = string_of_int x in
  if is_coe env s then Printf.printf "coe"
  else if is_rel env x then Printf.printf "rel"
  else Printf.printf "unknown"


type nametab
type pretype_env = <env:env; coercions:coercions>

type full_env = <pretype_env ; nametab:nametab>

let full_interp : full_env -> string -> unit = assert false

type summaries
let summary_env : summaries -> env = assert false
let summary_coe : summaries -> coercions = assert false
let summary_names : summaries -> nametab = assert false

(* how to define and use the objects *)
let summary_pretype s = object
  method env = summary_env s
  method coercions= summary_coe s
end

let summary_fullenv s = object
  method env = summary_env s
  method coercions= summary_coe s
  method nametab = summary_names s
end
(* duplication seems hard to avoid here. We could try to use classes
   (with [inherit]) but not sure how those work *)

let summary_full_interp s = full_interp (summary_fullenv s) "foo"

let foo = foo

let f s = foo (summary_fullenv s (* no need to explicitly project *) ) 4

let bar (e:pretype_env) x = foo e x

let g s = bar (summary_fullenv s :> pretype_env) 4
(* here we need to use explicit subtyping as bar is not polymorphic
   nonpolymorphic declarations may be easier to write in mlis *)

(* alternative to the duplication above *)
module Alt = struct
let summary_fullenv s = object
  method env = summary_env s
  method coercions= summary_coe s
  method nametab = summary_names s
end

let summary_pretype s = (summary_fullenv s :> pretype_env)
(* not sure if there are any perf issues with doing this *)
end

module ByClass = struct
  class summary_pretype s = object
    method env = summary_env s
    method coercions = summary_coe s
  end

  class summary_fullenv s = object
    inherit summary_pretype s
    method nametab = summary_names s
  end

  let test s = foo (new summary_fullenv s)

end

(* only way to update a field? *)
module ByClass2 = struct
  class summary_pretype s = object
    val env = summary_env s
    val coercions = summary_coe s
    method env = env
    method coercions = coercions
  end

  class summary_fullenv s = object
    inherit summary_pretype s
    val nametab = summary_names s
    method nametab = nametab

    method update_names names = {< nametab = names >}
  end

  let test s = foo (new summary_fullenv s)

  let foo = foo
  let bar = bar
end

end

(* what do mlis look like? *)
module type T = sig
  type env
  type coercions
  type nametab

  val is_rel : <env:env; ..> -> int -> bool
  (* This is a bit irregular compared to the below, maybe [env] should
     be some kind of [env_internal] then [type env = <env:env_internal>]
     and [is_rel : <env; ..> -> int -> bool] *)

  type pretype_env = <env:env; coercions:coercions>

  val foo : <pretype_env; ..> -> int -> unit
  val bar : pretype_env -> int -> unit
  (* Note the [..] difference between polymorphic and nonpolymorphic versions *)

  type full_env = <pretype_env ; nametab:nametab>

  type summaries
  val summary_fullenv : summaries -> full_env

  module ByClass2 : sig
    class summary_pretype :
      summaries -> object method coercions : coercions method env : env end
    class summary_fullenv :
      summaries ->
      object ('a)
        inherit summary_pretype
        method nametab : nametab
        method update_names : nametab -> 'a
      end

    val test : summaries -> int -> unit

    val foo : summary_fullenv -> int -> unit
    val bar : summary_pretype -> int -> unit
    (* the class can also be used as a type, not sure what we would end up doing *)
  end


end

module Main : T = Main (* check that the above module type is correct *)

It doesn't look too bad IMO except updating a field looks tricky, I have no clue about perf implications and I have no experience with these things to tell if other issues are waiting. After taking those into account it doesn't look so attractive though :

SkySkimmer avatar Oct 08 '18 13:10 SkySkimmer

Using object is probably a bad idea, but we might use them just for subtyping and phantom types, as done e.g. in js_of_ocaml.

ppedrot avatar Oct 08 '18 13:10 ppedrot

I like the idea for IMHO feels a bit overkill. Likely, using a PPX for lenses-style programming could be a better option.

ejgallego avatar Oct 08 '18 13:10 ejgallego

As of today we marshal the state to workers using the Marshal module. Objects are nice in OCaml but cannot be marshaled, unless you write your own pickle/unpickle. Doable... Likely to be less efficient.

FTR In Matita we used objects all the way: we needed our own unmarshaler anyway because we do hash-cons names, called NUri.*. This is the root state (empty, but with a survival kit): https://github.com/LPCIC/matita/blob/master/matita/components/ng_kernel/nCic.ml#L128 while this one is (IIRC) the larger one (see the inherit): https://github.com/LPCIC/matita/blob/master/matita/components/grafite_engine/grafiteTypes.ml#L34 this is serialization part, that I don't understand anymore ;-) https://github.com/LPCIC/matita/blob/master/matita/components/ng_library/nCicLibrary.ml#L180 but for the requirements, eg a refresh (read hash-cons) function https://github.com/LPCIC/matita/blob/master/matita/components/ng_library/nCicLibrary.mli#L66

My memory is that it was a really nice exercise to organize the state, and that the API becomes clear, eg the type of the elaborator tells you which state it should be able to read (the one containing coercions) https://github.com/LPCIC/matita/blob/master/matita/components/ng_refiner/nCicRefiner.mli#L23 while, for example, the ones that typechecks and adds an object to the environment takes any state larger than the equivalent of a save_env and returns a state as rich: https://github.com/LPCIC/matita/blob/master/matita/components/ng_library/nCicLibrary.mli#L28

My2C

gares avatar Oct 09 '18 12:10 gares

@SkySkimmer, will you come to Scalp?

I don't know what that is.

A follow-up of GeoCal and LAC (here).

herbelin avatar Oct 09 '18 12:10 herbelin

A parenthesis on removing the calls to Global.env

In this comment on PR coq/coq#8590, I reported on attempts to remove some of the numerous call to Global.env, and, if we agree that removing the ~375 occurrences of Global.env() is an objective (so that we can manage in parallel different processes or different goals living in different states), how to proceed?

As a typical example, I would like to remove this use of Global.env (here in Lemmas.save). What should I do?

herbelin avatar Oct 09 '18 14:10 herbelin

I guess declare_variable should change its type, and return an env

gares avatar Oct 09 '18 15:10 gares

TBH I would look at stuff which comes before vernac/ first when reducing statefulness.

SkySkimmer avatar Oct 09 '18 15:10 SkySkimmer

TBH I would look at stuff which comes before vernac/ first when reducing statefulness.

Indeed, engine and pretyping seem like good candidates.

is an objective (so that we can manage in parallel different processes or different goals living in different states), how to proceed?

I think we can keep removing uses incrementally.

ejgallego avatar Oct 09 '18 15:10 ejgallego