replib icon indicating copy to clipboard operation
replib copied to clipboard

An experimental implementation of Nominal representation based backend for unbound

Open hengchu opened this issue 8 years ago • 0 comments

For this semester's independent studies, I have worked on profiling and implementing a new backend of the unbound library.

My work consists of 3 parts:

  • profiling the current performance of LocallyNameless unbound
  • implementing the Nominal backend
  • comparing the performance of LocallyNameless backend and Nominal backend

The nominal implementation preserves most of the existing interface of unbound, although all of the typeclasses used are redefined so that nominal unbound is not constrained by the existing interface when it does not apply to nominal unbound.

This implementation uses the same generic programming techniques to provide support for user defined datatypes of language constructs. For me, understanding how generic programming with overridable behavior was a non-trivial part of the project. I found the RepLib paper and the extended version of SYB to be extremely helpful.

As part of the journey on understanding generic programming, I also briefly compared the generic programming interface exposed by RepLib and GHC.Generics, and I found the combinators exposed in Generics.RepLib.Aux to be invaluable when traversing data structures generically. I also became more aware of the benefits of existential types while trying to understand how RepLib represents heterogeneous lists, and this helped me understand the materials presented in PFPL better.

For the semantics of the nominal interface, I referred to the Binders Unbound paper and recreated the Rec, Rebind and Embed pattern type constructors and the Bind binder type constructor. A significant effort in the implementation was devoted to the Alpha typeclass and its instantiation for each of the type constructors mentioned above. I have tested the behavior of these implementations with a simply typed lambda calculus extended with mutually recursive let expressions, but I have not yet produced any formal proofs of their correctness. The Subst typeclass and its generic implementation provides capture-avoiding substitution.

The idioms for working with locally nameless unbound carry over to nominal unbound, since minimizing the effort between switching these two implementations was also part of the goal.

In terms of exploring the design space of DSL for binding specifications, I have also studied FreshML, Cαml and FreshLib. These papers were really helpful for observing what difficulties each design choice face and where to balance the power/weight ratio.

In terms of the performance of nominal unbound, currently it is up to 4x slower than the locally nameless implementation on large terms (terms with over 100 nested patterns and terms). However, there are a few optimization opportunities I can take as future work:

  • Caching binders in Bind to speed up free variable calculuation, which is helpful when we need to perform freshening
  • Caching the permutations in Bind and apply them lazily to the variables
  • Use local freshness instead of global freshness
  • Implement a faster match for alpha equivalence of binders

In particular, the first optimization could potentially help with the performance of locally nameless unbound as well since free variable calculation will be possible in constant time.

The last optimization opportunity was recognized when I read Nominal Matching and Alpha-Equivalence. In the case of linear patterns, this paper presents an algorithm that runs in log-linear time.

hengchu avatar May 09 '17 18:05 hengchu