
pretty-printing most types is really slow (at least in watch expressions)

Open · ceedubs opened this issue 2 years ago · 5 comments

If I put a Text or a Nat value in a watch expression, then it renders to the screen reasonably quickly when I save the file. But for even a trivial non-builtin type like Optional Text, it takes multiple seconds. Here are some examples with ucm's timing output:

  ~/code/unison/tmp.u changed.
  
  Now evaluating any watch expressions (lines starting with `>`)... Ctrl+C cancels.


    1 | > "hi"
          ⧩
          "hi"

UnisonFileChanged: 831 ms (cpu), 284 ms (system)
cloud/testing> 

  ✅
  
  ~/code/unison/tmp.u changed.
  
  Now evaluating any watch expressions (lines starting with `>`)... Ctrl+C cancels.


    1 | > Some "hi"
          ⧩
          Just "hi"

UnisonFileChanged: 3.55 s (cpu), 3.55 s (system)

  ~/code/unison/tmp.u changed.
  
  Now evaluating any watch expressions (lines starting with `>`)... Ctrl+C cancels.


    1 | > 3
          ⧩
          3

UnisonFileChanged: 64.6 ms (cpu), 64.5 ms (system)
cloud/testing> 

  ✅
  
  ~/code/unison/tmp.u changed.
  
  Now evaluating any watch expressions (lines starting with `>`)... Ctrl+C cancels.


    1 | > Some 3
          ⧩
          Just 3

UnisonFileChanged: 3.61 s (cpu), 3.60 s (system)

ceedubs avatar Aug 11 '23 13:08 ceedubs

I just used @runarorama's timers library to run a little experiment related to this. Here is my not-very-scientific benchmark code:

main = do
  n = 1
  Scope.run do
    i = Scope.ref 0
    Duration.toText <| timeN n do
      ignore (Debug.toText (Some ("hello", Ref.read i)))
      Ref.modify i Nat.increment

I added the mutability because I wasn't sure whether Debug.toText results were cached by the runtime.

When n is set to 1 (so this code is only executed a single time), it takes between 3.5 and 4 seconds on my machine. But if I crank n up to 10000, the average time per operation drops to ~12 microseconds. I could see a runtime optimizing repeated paths, but a difference this drastic suggests to me that each run invocation incurs some sort of one-time cost for Debug.toText calls -- perhaps initializing the pretty-printing environment. If that's true, I wonder if ucm could avoid that initialization when the codebase hasn't changed since the last run.

ceedubs avatar Aug 14 '23 13:08 ceedubs

I wonder if it's spending all this time building up the naming environment for the pretty-printer

runarorama avatar Aug 14 '23 14:08 runarorama

One thing is that the pretty-print environment can look for names in two places - the local names of the current namespace, and the root namespace. If it has to consult the root namespace, that can be extremely slow the first time, since the root names aren't usually forced yet.

The other possibility is that this particular code path is building up the combined names ahead of time, instead of using <> on the two PrettyPrintEnvs (one for local names, one for root).
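For illustration, the two strategies can be sketched in Python. The names and types here are toy stand-ins (the real PrettyPrintEnv lives in ucm's Haskell implementation); the point is only the difference between eagerly merging the two name maps and a semigroup-style lazy fallback:

```python
# Hypothetical sketch: two ways to combine local names with root names.

def eager_combine(local_names, force_root_names):
    # Builds the merged map up front: forces all root names even if every
    # lookup would have been satisfied by the local names alone.
    combined = dict(force_root_names())
    combined.update(local_names)   # local names win on conflicts
    return combined.get

def lazy_combine(local_names, force_root_names):
    # Fallback combination (like <> on two PrettyPrintEnvs): consult the
    # local names first, and force the root names only on a miss.
    def lookup(ref):
        hit = local_names.get(ref)
        if hit is not None:
            return hit
        return force_root_names().get(ref)
    return lookup

def root_names():
    # Stand-in for forcing the root namespace, which is expensive.
    raise RuntimeError("root names were forced!")

lookup = lazy_combine({"#abc": "Optional.Some"}, root_names)
lookup("#abc")  # resolves locally; the root namespace is never touched
```

With the eager version, that last lookup would have paid the full cost of forcing the root names first, which would explain a large constant overhead per evaluation.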

pchiusano avatar Aug 14 '23 18:08 pchiusano

I was running this example in a project that had all of the relevant names available, so it sounds more likely that either this code path is building up the combined names, or that just loading the pretty-print environment for this project takes a few seconds.

ceedubs avatar Aug 15 '23 09:08 ceedubs

The situation has improved on this one, but not to the point that I think this ticket should be closed.

The difference seems to be that now only the project PPE is loaded, as opposed to a global PPE. If I perform the same experiments in a small project, then I see results in well under a second. But if I am in a project that has a decent amount of code and lots of dependencies, then > Some "hi" still takes over 2 seconds to render.

ceedubs avatar Jun 18 '24 13:06 ceedubs