hazel Evaluation using the environment model rather than substitution

Evaluation with environments rather than environments. Evaluation with environments is useful because:

Usually (?) better performance (i.e., don't substitute in unevaluated branches)
Can memoize hole instances with the same environment/instantiation (to handle https://github.com/hazelgrove/hazel/issues/536)

Rough summary of changes:

New data structure: EvalEnv.t (perhaps should be renamed? E.g., to MemoEnvironment?) which represents an environment tagged with an ID (EvalEnvId.t). Note that unique ID's are generated for an environment each time a new one is created by using EvalEnvIdGen.t (very similar to MetaVarGen.t). Note also that EvalEnv.t maps Var.t to EvaluatorResult.t rather than to DHExp.t, but the choice is somewhat arbitrary
Change interface of evaluation functions (Evaluator.evaluate, Builtin.evaluate). evaluate: DHExp.t => EvaluatorResult.t becomes evaluate: (EvalEnvIdGen.t, EvalEnv.t, DHExp.t): (EvalEnvIdGen.t, EvaluatorResult.t)
"Generalized closures" to represent any expression that can have a delayed evaluation (e.g., used to encapsulate let/case expr with failed match, lambda closure, hole closure), also neatens up some of the hole handling
Implement a "postprocessing step" (EvalPostprocess.postprocess) that performs two actions:
- Perform substitution inside closures (lambda bodies, bodies of let/case exprs with failed matches) so that the result matches that of substitution (i.e., context inspector should show same result as before)
- Number hole instances. Notably, in order to solve https://github.com/hazelgrove/hazel/issues/536, holes that share the same environment are given the same number, i.e., each instantiation of a hole is a single "hole closure", potentially with multiple parents/paths, rather than being split into multiple "hole instances" each time it is encountered in the result. (This causes HoleInstance.t, HoleInstancePath.t, and HoleClosureInfo.t to become HoleClosure.t, HoleClosureParents.t, and HoleClosureInfo.t. Also requires slight changes to the context inspector to display all parents to a hole closure, rather than the (singular) path to the current hole instance.). The old hole instance numbering step, Program.renumber, is somewhat simplified and grouped into this postprocessing step.
For performance reasons, change environments to use a BST implementation rather than a linked list. (VarBstMap.t -- currently replaces uses of VarMap.t only in EvalEnv.t but doesn't affect other usages of VarMap.t)
Implement DHExp.fast_equals for fast equality checking memoized by environments, which doesn't recurse through environments, and Result.fast_equals, which performs DHExp.fast_equals on the result expression and on each expression in each environment
No more need of Evaluator.subst* functions during evaluation. (Substitution is currently being used in one place only: during elaboration when introducing the FixF form for let expressions with an type-ascribed lambda binding)
Evaluation of recursive functions still uses FixF, although it is possible to use recursive environments to achieve this (achieved in the rec-closures branch as a proof-of-concept).

This branch is a work in progress, probably needs more testing and maybe some changing of module interfaces.

Planning to branch off this branch to work on fill-and-resume. The implementation shouldn't be too different from evaluation with substitution, but we can benefit from the optimization in https://github.com/hazelgrove/hazel/issues/536.

I'm also working on a report to try to formalize and clarify some of the above changes, as well as future work with fill-and-resume, but it's not very complete yet.

Worked with Yuchen Jiang (@LighghtEeloo) on this.

Mar 09 '22 05:03 jlam55555

@jlam55555 do you have a timeline for getting this PR ready to review/merge?

Jun 15 '22 20:06 cyrus-

@cyrus-

@jlam55555 do you have a timeline for getting this PR ready to review/merge?

I took an impromptu month-long break after school ended because I couldn't bring myself to do anything academic (got caught up in graduation, moving, housefinding, vacation, and COVID). Now that I've recovered from COVID and things are sorted out I think I can kick myself into high gear again for a sprint over the next two weeks. I'll get started right away!

Proposed timeline: I will try to get this branch up for review by next Friday. Tasks:

Redo the hole numbering step/memoization (put this in a separate postprocessing step so that postprocessing can be more simply memoized)
Clean up the data structures related to environment numbers (improved version in the fill-and-resume-backend branch, which was forked from this branch -- I'll recreate it on this branch)
Make this branch up-to-date with dev
Add a PHI for the thesis work and create a TODO list of related tasks

Jun 15 '22 22:06 jlam55555

Sounds good, thanks @jlam55555!

Jun 16 '22 01:06 cyrus-

I made the changes mentioned here.

See the new PHI.

Jul 01 '22 23:07 jlam55555

What if we monadified the state in the evaluator, like in this branch I wrote two weeks ago?

Jul 05 '22 15:07 mirryi

Some organizational thoughts:

Feels like there's a bit of redundancy between some modules and data types, particularly DHExp.evalenv also being EvalEnv.t and DHExp.result = EvaluatorResult.t and EvalEnv.result_map = right component of DHExp.evalenv. I reckon there could be a way to reorganize these to streamline the module dependencies and make it easier to follow.
Maybe this is just me overthinking these modules, but I feel it would be better if the EvalEnv functions that take EvalState.t didn't, instead taking some environment id generator (e.g. EvalIdGen). It seems like EvalState should only be used in Evaluator.
Perhaps the Eval* modules should be Evaluator* to maintain naming consistency.

Jul 11 '22 15:07 mirryi

Some organizational thoughts:

Feels like there's a bit of redundancy between some modules and data types, particularly DHExp.evalenv also being EvalEnv.t and DHExp.result = EvaluatorResult.t and EvalEnv.result_map = right component of DHExp.evalenv. I reckon there could be a way to reorganize these to streamline the module dependencies and make it easier to follow.

Maybe this is just me overthinking these modules, but I feel it would be better if the EvalEnv functions that take EvalState.t didn't, instead taking some environment id generator (e.g. EvalIdGen). It seems like EvalState should only be used in Evaluator.

Perhaps the Eval* modules should be Evaluator* to maintain naming consistency.

@mirryi Thanks for your comments Eric! These are some very valid points.

I think this was the very first major decision that I ran into, and one that persisted throughout until I got used to it. The redundancy you notice is due to the fact that we have a dependency loop (DHExp.Environment -> DHExp.result -> DHExp.t -> DHExp.Closure -> DHExp.Environment) and want DHExp.result to be in its own module, hence EvaluatorResult.t === DHExp.result. (Note that this is also quite annoying because it causes constant boxing/unboxing of the Indet and BoxedValue judgments for the environment definiens, even when they're usually irrelevant.) The alternative (which gets rid of this dependency loop) is to have environments map variables to DHExp.t rather than EvaluatorResult.t (essentially, the expression without the "final" judgment) -- but this forces us to recompute the final judgment each time we need it. In hindsight, the latter may not be so bad -- a call to Evaluator.evaluate on a final DHExp.t just to get the final judgment for the expression shouldn't take too many evaluation steps. I'm not too sure if that will truly be beneficial though.
Your suggestion makes sense. Originally, EvalState was EvalEnvIdGen, since this was the only state carried through evaluation. Only in a recent commit was it renamed to EvalState and additional fields get added to EvalState (e.g., gas counter). So yes, it would make sense to make the id generator a member of EvalState and only have that passed to the EvalEnv methods.
I have a bit of naming indecision. Originally, I was afraid of long names like EvaluatorEnvironment.t (since we have the Evaluator module and the Environment.t) or even longer like EvaluatorEnvironmentIdGenerator.t, etc. I think EvaluatorState would make sense, but perhaps the EvalEnv* names should be overhauled completely. EvalEnv's are just numbered environments, and their name should better reflect that.

I totally agree that 1. and 3. are some of the uglier parts of the implementation and can be improved.

Another part that may be of interest is the postprocessing step in EvalPostprocess.re: I already cleaned this up from the original implementation from my thesis, but it still feels overcomplicated. For example, there's the memoization of postprocessing by environments, tracking of holes and hole instantiation numbers using HoleClosureInfo_.t and HoleClosureInfo.t, and a separate stage for hole parent tracking. If anything, I think it's easier to understand (at least from the comments and naming) than the old postprocessing in Program.renumber, but it's still fairly abstruse.

Jul 11 '22 19:07 jlam55555

I think this was the very first major decision that I ran into, and one that persisted throughout until I got used to it. The redundancy you notice is due to the fact that we have a dependency loop (DHExp.Environment -> DHExp.result -> DHExp.t -> DHExp.Closure -> DHExp.Environment) and want DHExp.result to be in its own module, hence EvaluatorResult.t === DHExp.result. (Note that this is also quite annoying because it causes constant boxing/unboxing of the Indet and BoxedValue judgments for the environment definiens, even when they're usually irrelevant.) The alternative (which gets rid of this dependency loop) is to have environments map variables to DHExp.t rather than EvaluatorResult.t (essentially, the expression without the "final" judgment) -- but this forces us to recompute the final judgment each time we need it. In hindsight, the latter may not be so bad -- a call to Evaluator.evaluate on a final DHExp.t just to get the final judgment for the expression shouldn't take too many evaluation steps. I'm not too sure if that will truly be beneficial though.

Hmm yeah this is tricky. Along the lines of the above, we could maybe introduce something like EvaluatorResult.wrap : DHExp.t -> EvaluatorResult.t which computes the final judgment in constant time. As you said, I'm not sure if that would be worth it just for the sake of cleaning up the modules/types. Alternatively, and probably not worth it, but apparently mutually recursive modules are a thing? If we went that route, we probably wouldn't be able to define them in separate files.

I also noticed that DHExp.environment isn't used at all (unless I missed something). DHExp.evalenv can probably renamed to that?

Jul 11 '22 20:07 mirryi

Also, it would be good to have a brief doc to help people understand how the new environment-based evaluator differs from the previous one, maybe specifically discussing differences from the Hazelnut Live paper, without having to read the PHI or the thesis.

Jul 14 '22 16:07 mirryi

Styling in the context inspector seems to have broken: Screenshot from 2022-07-15 15-14-11

compare to dev:

Screenshot from 2022-07-15 15-14-55

Jul 15 '22 19:07 cyrus-

@mirryi

I think this was the very first major decision that I ran into, and one that persisted throughout until I got used to it. The redundancy you notice is due to the fact that we have a dependency loop (DHExp.Environment -> DHExp.result -> DHExp.t -> DHExp.Closure -> DHExp.Environment) and want DHExp.result to be in its own module, hence EvaluatorResult.t === DHExp.result. (Note that this is also quite annoying because it causes constant boxing/unboxing of the Indet and BoxedValue judgments for the environment definiens, even when they're usually irrelevant.) The alternative (which gets rid of this dependency loop) is to have environments map variables to DHExp.t rather than EvaluatorResult.t (essentially, the expression without the "final" judgment) -- but this forces us to recompute the final judgment each time we need it. In hindsight, the latter may not be so bad -- a call to Evaluator.evaluate on a final DHExp.t just to get the final judgment for the expression shouldn't take too many evaluation steps. I'm not too sure if that will truly be beneficial though.

Hmm yeah this is tricky. Along the lines of the above, we could maybe introduce something like EvaluatorResult.wrap : DHExp.t -> EvaluatorResult.t which computes the final judgment in constant time. As you said, I'm not sure if that would be worth it just for the sake of cleaning up the modules/types. Alternatively, and probably not worth it, but apparently mutually recursive modules are a thing? If we went that route, we probably wouldn't be able to define them in separate files.

I also noticed that DHExp.environment isn't used at all (unless I missed something). DHExp.evalenv can probably renamed to that?

I didn't know about recursive modules, I wonder if that has any adverse side effects (i.e., performance issues or workarounds needed). Something to consider in the future if it significantly improves readability.

I'll leave the naming of DHExp.evalenv to your discretion. You're right, DHExp.environment isn't used and probably comes from a very early iteration of environments for evaluation, so it can be removed. But it may be a little confusing to rename DHExp.evalenv to DHExp.environment since there's also an Environment.t which is different. Again, I don't really like the names but I haven't thought of a better one. I already wrote about how EvalEnv* is probably a bad name. Another bad name is HoleClosure*, since I think HoleInstantiation* is a better name for the concept but haven't gotten around to renaming all the related symbols.

Also, it would be good to have a brief doc to help people understand how the new environment-based evaluator differs from the previous one, maybe specifically discussing differences from the Hazelnut Live paper, without having to read the PHI or the thesis.

While the thesis report is long (probably too long for the changes made), I don't think that the PHI is too long or obscure, and its purpose is to document this change. I agree that it may benefit from a section explicitly listing the differences from the dev evaluator. Let me know if there's anything else that should be clarified/reworded/elaborated on in the PHI.

@cyrus-

Styling in the context inspector seems to have broken:

compare to dev:

Thanks for pointing out, fixed in https://github.com/hazelgrove/hazel/pull/586/commits/ce73c18aa56030bf6437e8d28f0120d01544c9a7

This brings up the related point that there's no special styling for the "Hole Closure Parents" pane in the context inspector right under the misstyled one, so it's pretty ugly for now especially when the hole instantiation is directly in the result or has multiple parents. I'll leave it to someone else since my design sense is off the charts (ugly).

Jul 15 '22 21:07 jlam55555

Yeah I was also finding all the things called "environment"s confusing. There seem to be evaluation environments and hole closure environments. Can we unify the two? They both seem to be mappings from variables to DHExps at heart, though eval environments map to a result (is that useful for something, or just easier not to pull out the DHExp?)

Jul 17 '22 20:07 cyrus-

Bug: getting an error (Async_kernel: Unhandled exception (Invalid_argument List.for_all2)) when pressing backspace at this position:

Either from Statics_Exp.syn_operand or ListUtil.for_all2_opt or either DHExp.fast_equals or Result.fast_equals, but probably the latter since no such bug exists in dev.

Jul 18 '22 16:07 mirryi

getting a stack overflow error when I press : to add a type annotation here Screenshot from 2022-08-05 00-01-00

Aug 05 '22 04:08 cyrus-

@cyrus- Merged new web worker stuff in and played around with it a little. One big problem is that after evaluation and postprocessing, for programs with lots of holes, sending the result back to the main thread from the worker can be incredibly slow, as we serialize the result in an sexp on the worker side and deserialize again on the main thread (because copying raw data didn't work for some reason).

This is mostly because, I think, we have closure environments in their entirety in (1) each hole closure, (2) the hole instance info, making the serialized result large. To alleviate this, we could store all closure environments in a separate map and keep only the ids in the expression tree and hole info. However, this would make the DHExp.t datatype not self-containing. Alternatively, we could figure out how to make data transfer between threads faster some other way, but I'm not sure how to do this.

Aug 19 '22 15:08 mirryi

qsort example doesn't seem to be working -- local variables (smaller, bigger, etc.) are not being shown in the sidebar

Screenshot from 2022-08-22 15-34-07

Aug 22 '22 19:08 cyrus-

merged into haz3l-tests, thanks!

Sep 08 '22 02:09 cyrus-

hazel hazel copied to clipboard

Evaluation using the environment model rather than substitution

hazel
hazel copied to clipboard