analyzer icon indicating copy to clipboard operation
analyzer copied to clipboard

Migrate from CIL's `Pretty` to OCaml's `Format`

Open sim642 opened this issue 3 years ago • 3 comments

In #210 it came up multiple times that OCaml's standard Format module could replace Pretty from CIL with additional benefits:

  1. It's maintained in OCaml's standard library while Pretty isn't properly maintained (maybe we should do this migration even in our CIL fork eventually).
  2. It has much more features for producing pretty output.
  3. It's supposed to be faster by not constructing Pretty.doc structures but writing to a formatter directly (a la how Printable.printXml currently works).
  4. It's supposed to be faster by using GADTs for internal implementation instead of Obj (https://github.com/goblint/analyzer/issues/210#issuecomment-831395278).
  5. ppx_deriving.std has deriving show that also derives a pretty-printer in addition to a simple show (https://github.com/goblint/analyzer/issues/210#issuecomment-831395278).
  6. It allows custom semantic tags, e.g. for colors, and their behavior can depend on where the output is going (to the terminal or to a string).

TODO

  • [ ] Replace Printable.pretty with val pp: Format.formatter -> t -> unit. This is the de facto standard naming for these pretty printer and deriving show also derives this signature.
  • [ ] Rewrite every implementation of pretty as pp.
  • [ ] Rewrite every Pretty.printf with Format.printf by using pp instead of pretty.
  • [ ] Rewrite tracing system to use Format instead of Pretty.
  • [ ] Rewrite every trace by using pp instead of pretty.
  • [ ] Implement pp for CIL types, e.g. pp_exp to replace d_exp. CIL has a lot of Pretty-based code to pretty-print everything (including entire programs). We don't need all of it though unless we also decide to migrate CIL itself.

sim642 avatar May 04 '21 08:05 sim642

maybe we should do this migration even in our CIL fork eventually

I think we should probably do this at the same time, because otherwise we will be relying on Pretty.sprint for the output of CIL types to strings which makes for strange code:

Pretty.sprint ~width:80 (Cil.d_exp () exp)

michael-schwarz avatar May 04 '21 08:05 michael-schwarz

For an initial migration in Goblint, we could still do that, but maybe with ~width:max_int so Pretty doesn't add line-wrapping newlines. I think the CIL pretty printers for expressions and other smaller things don't actually make use of Pretty.break and other fancy features anyway. it's just that CIL has lots of Pretty code for statements, functions and entire programs, which we don't really need, but would need to be rewritten for a complete CIL migration. It of course would be good for keeping our CIL fork somewhat modern, but maybe not so high priority for us.

But of course for general use of Format it's a very bad style to construct intermediates as strings, because it doesn't allow Format to decide about nice line breaks and the inner forced string doesn't know how indented it is or what the outmost width is.

The general principle should be that the pretty-printing width is only determined at the outermost level (e.g. where printf actually happens) and everything else automagically gets adapted to that. It would allow for nice things like having the pretty printing be based on actual terminal width instead of enforcing an annoying 80 character width with tons of newlines even when you look at it in a wider terminal. For non-terminal output it could be unlimited (max_int) or configurable.

sim642 avatar May 04 '21 08:05 sim642

Here's an interesting idea, which might make migration easy, but I don't know if it's possible:

Define our own Pretty module with type doc = Format.formatter -> unit and a matching dprintf to be a complete drop-in implementation, except for the CIL stuff.

Probably not due to some %a typing reason. Pretty has them of type unit -> t -> doc, but Format has them of type formatter -> t -> unit. The above suggestion would make our definitions of type unit -> t -> formatter -> unit, which wouldn't work for Format. I guess a drop-in implementation would only work if a custom fprintf-like thing is defined to do some argument flipping before using Format.fprintf.


I tried really hard to achieve this, but I'm fairly sure now that such drop-in dprintf cannot be done. Even though OCaml's format stuff is super involved (e.g. this GADT with 12 type parameters) and some of the internals are exposed via CamlInternal* modules, the key parts that actually call the function for %a are in the implementation of Format etc but not exposed by their signatures.

sim642 avatar May 04 '21 12:05 sim642