
Make an RFC for extensible local definitions

Open sorawee opened this issue 5 years ago • 8 comments

To quote @jackfirth:

adding some sort of #%local-definitions hook that gets inserted into internal definition contexts and which modules can override the same way they can override #%app

Is it possible to use this mechanism to deal with define in internal-definition contexts in a more principled way? Can we use it to implement define* in #46?

One application @jackfirth has in mind is to count how many times variables are used in a body by using this #%local-definitions.

Here are some constraints:

To deal with define, we need to partially expand the forms in the body with define-values and define-syntaxes in the stop list. We then need to collect the resulting definitions and compile them down to letrec-syntaxes+values. Supporting define* requires the ability to add define* to the stop list (since it’s not expressible with define-values), and also a new intermediate core form that can handle the shadowing nature of define* (since that can’t be done with letrec-syntaxes+values). These seem to demand something more than the #%local-definitions mentioned above.
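
A minimal sketch of the partial-expansion step, assuming the standard `local-expand` and `syntax-local-make-definition-context` APIs. The macro name `partial-head` is hypothetical and only for illustration: it expands one form in a fresh internal-definition context with define-values and define-syntaxes in the stop list, and reports which head the expansion stopped at. A real #%local-definitions would go on to collect these forms rather than just inspect them.

```racket
#lang racket/base
(require (for-syntax racket/base))

;; `partial-head` is a hypothetical helper, not part of any proposal here.
;; It partially expands `form` in a fresh internal-definition context,
;; stopping at define-values / define-syntaxes, and returns the head of
;; the partially expanded form as a symbol.
(define-syntax (partial-head stx)
  (syntax-case stx ()
    [(_ form)
     (let* ([intdef (syntax-local-make-definition-context)]
            [expanded (local-expand #'form
                                    (list (gensym 'intdef)) ; internal-definition context
                                    (list #'define-values #'define-syntaxes)
                                    intdef)])
       (syntax-case expanded ()
         [(head . _)
          (identifier? #'head)
          #'(quote head)]
         [_ #'(quote not-a-compound-form)]))]))

;; `define` is a macro; expansion stops once it reaches define-values.
(define head-of-define (partial-head (define x 1)))

;; `define-syntax-rule` stops at define-syntaxes.
(define head-of-define-syntax (partial-head (define-syntax-rule (m) 1)))
```

A define* form would need its own entry in that stop list, which is the extra hook the paragraph above asks for.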

Another concern is that it must be composable enough that the overrider can reuse the previous #%local-definitions and doesn’t need to reinvent the wheel.

@lexi-lambda, would this make anything in https://lexi-lambda.github.io/blog/2018/09/13/custom-core-forms-in-racket-part-ii-generalizing-to-arbitrary-expressions-and-internal-definitions/ easier?

sorawee avatar Jul 25 '19 03:07 sorawee

Some examples of #%local-definitions usage:

  • define*. E.g.,
    (let ()
      (define* x 1)            ;; Expands to (let ([x 1])
      (define* x (add1 x))     ;;              (let ([x (add1 x)])
      x))                      ;;                x))
    
  • Decorators. E.g.,
    (let ()
      (@ trace)
      (@ memoize)
      (define (fib n) ...)
      (fib 10))
    
  • @jackfirth's guard forms. E.g.,
    ;; return either an Error or a Result object
    (define (handle x) ;; expand to an if expression, no continuation object involved
      (guard-when (<= x 0) (Error "must be positive"))
      (Result (* x x)))
    
  • Simplified Typed Racket's colon form. E.g.,
    (let ()
      (: Number -> Number)
      (define (fact n) ...)
      (: String -> String)
      (define (double s) ...)
      (fact 10))
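
To make the define* bullet concrete, here is a rough sketch that fakes the behavior with an ordinary macro instead of a #%local-definitions hook. The names `body*` and `define*` are hypothetical. It only recognizes define* when it is written literally in the body (no partial expansion), and it nests every form, so definitions in the body are not mutually recursive. Those limits are exactly why a real hook would need the stop-list machinery.

```racket
#lang racket/base
(require (for-syntax racket/base))

;; `define*` and `body*` are hypothetical names used only for this sketch.
(define-syntax (define* stx)
  (raise-syntax-error #f "only allowed directly inside body*" stx))

(define-syntax body*
  (syntax-rules (define*)
    ;; (define* x e) scopes x over the rest of the body, shadowing any earlier x.
    [(_ (define* x e) next rest ...)
     (let ([x e]) (body* next rest ...))]
    ;; The last form produces the result.
    [(_ form) form]
    ;; Any other form is kept, and the remaining forms are nested after it.
    [(_ form next rest ...)
     (let () form (body* next rest ...))]))

(define example
  (body*
    (define* x 1)
    (define* x (add1 x))
    x))
```

Here `example` is 2, matching the nested `let` expansion shown in the first bullet.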
    

For example, in the decorator example, the language would provide a custom #%local-definitions that recognizes adjacent (@ ...) and (define-values ...) forms (after partial expansion) and fuses them together. let-values will automatically wrap the whole body with #%local-definitions. After the fusion, two subforms would be left: (define-values (fib) ... fused-expr ...) and (fib 10). #%local-definitions would then put the subforms in (racket:#%local-definitions ...) so that there's no further custom #%local-definitions expansion.
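
The fusion step can be approximated today with a plain macro wrapped around the body. This is only a sketch under assumptions: `decorated-body`, `@`, `counted`, and `memoize` are hypothetical names, and the macro matches a literal `define` before expansion rather than define-values after partial expansion, so it would miss definitions produced by other macros.

```racket
#lang racket/base
(require (for-syntax racket/base))

;; All names here (`@`, `decorated-body`, `counted`, `memoize`) are hypothetical.
(define-syntax (@ stx)
  (raise-syntax-error #f "only allowed directly inside decorated-body" stx))

(define-syntax decorated-body
  (syntax-rules (@ define)
    ;; Two adjacent decorators: merge them into one composed decorator.
    [(_ (@ d1) (@ d2) rest ...)
     (decorated-body (@ (lambda (f) (d1 (d2 f)))) rest ...)]
    ;; A decorator directly before a function definition: fuse them.
    [(_ (@ d) (define (name . args) body ...) next rest ...)
     (begin
       (define name (d (lambda args body ...)))
       (decorated-body next rest ...))]
    ;; The last form is left as is.
    [(_ form) form]
    ;; Anything else passes through unchanged.
    [(_ form next rest ...)
     (begin form (decorated-body next rest ...))]))

;; Two toy decorators for the example.
(define calls 0)
(define (counted f)
  (lambda args
    (set! calls (add1 calls))
    (apply f args)))

(define (memoize f)
  (define cache (make-hash))
  (lambda args
    (hash-ref! cache args (lambda () (apply f args)))))

(define result
  (let ()
    (decorated-body
      (@ counted)
      (@ memoize)
      (define (fib n)
        (if (< n 2) n (+ (fib (- n 1)) (fib (- n 2)))))
      (fib 10))))
```

Because the macro expands to `begin`, the fused definition is spliced into the enclosing `let` body, so recursive calls to `fib` go through the decorated function.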

The module-level forms can be controlled by #%module-begin already, though it would be nice to have a way to adjust both #%module-begin and #%local-definitions at once.

sorawee avatar Jul 15 '21 00:07 sorawee

Another use case is a parameterize! form that wraps the rest of the definition context in a parameterization:

(define (f)
  (parameterize! current-foo 5)
  (do-stuff))

;; equivalent to this:
(define (f)
  (parameterize ([current-foo 5])
    (do-stuff)))
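
A quick sketch of that equivalence as an ordinary macro, assuming a hypothetical wrapper named `body/parameterize!` in place of a real #%local-definitions hook (`do-stuff` is defined here just so the example runs):

```racket
#lang racket/base
(require (for-syntax racket/base))

;; `parameterize!` and `body/parameterize!` are hypothetical names.
(define-syntax (parameterize! stx)
  (raise-syntax-error #f "only allowed directly inside body/parameterize!" stx))

(define-syntax body/parameterize!
  (syntax-rules (parameterize!)
    ;; Wrap everything after (parameterize! p v) in a parameterization.
    [(_ (parameterize! param val) next rest ...)
     (parameterize ([param val]) (body/parameterize! next rest ...))]
    [(_ form) form]
    [(_ form next rest ...)
     (let () form (body/parameterize! next rest ...))]))

(define current-foo (make-parameter 0))
(define (do-stuff) (current-foo))

(define (f)
  (body/parameterize!
    (parameterize! current-foo 5)
    (do-stuff)))
```

Calling `(f)` returns 5, and `(current-foo)` is back to 0 afterwards.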

jackfirth avatar Jul 15 '21 02:07 jackfirth

I actually have another idea: what about a reader extension that reads sexps as far as possible and then groups them together?

(define (f)
  #>(parameterize ([current-foo 5]))
  (do-stuff-1)
  (do-stuff-2))

is equivalent to:

(define (f)
  (parameterize ([current-foo 5])
    (do-stuff-1)
    (do-stuff-2)))

and

(define (f)
  #>(define* x 1)
  #>(define* x (add1 x))
  x)

is equivalent to:

(define (f)
  (define* x 1
    (define* x (add1 x)
      x)))
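
Roughly how this could work, as a sketch rather than a finished design: a `#>` dispatch macro tags the datum that follows it, and a post-pass moves every later sibling into the tagged form. The names `marked`, `read-marked`, `regroup`, and `read-grouped` are made up for illustration, and the sketch assumes `#>` is always followed by a parenthesized form inside an enclosing list.

```racket
#lang racket/base

;; Hypothetical sketch: `marked`, `read-marked`, `regroup`, and
;; `read-grouped` are illustrative names, not an existing library.
(struct marked (form))

;; Reader macro for `#>`: read the next datum and tag it.
(define (read-marked ch in src line col pos)
  (marked (read/recursive in)))

(define nest-readtable
  (make-readtable (current-readtable) #\> 'dispatch-macro read-marked))

;; Walk the datum; inside each list, a tagged form absorbs all the
;; forms that follow it in that list.
(define (regroup datum)
  (if (list? datum)
      (let loop ([forms datum])
        (cond
          [(null? forms) '()]
          [(marked? (car forms))
           (list (append (regroup (marked-form (car forms)))
                         (loop (cdr forms))))]
          [else (cons (regroup (car forms)) (loop (cdr forms)))]))
      datum))

(define (read-grouped str)
  (parameterize ([current-readtable nest-readtable])
    (regroup (read (open-input-string str)))))

(define example-1
  (read-grouped
   "(define (f) #>(parameterize ([current-foo 5])) (do-stuff-1) (do-stuff-2))"))

(define example-2
  (read-grouped
   "(define (f) #>(define* x 1) #>(define* x (add1 x)) x)"))
```

Both results have the same shape as the "is equivalent to" forms above.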

sorawee avatar Jul 15 '21 03:07 sorawee

@sorawee This sounds pretty close to parendown

Metaxal avatar Jul 15 '21 07:07 Metaxal

It's very close indeed, and I think the implementation is going to be almost the same.

The goal is different though. Parendown wants to reduce the number of parentheses, but personally I don't care about that. I like parentheses. They are really useful for code navigation and editing.

The main goal of #> is to reduce the rightward drift, while still making the code lisp-y.

(define (f)
  #>(let ([x 1]))
  #>(let ([x (add1 x)]))
  x)

is no different from

(define (f)
  (define* x 1)
  (define* x (add1 x))
  x)

where define is redefined to cooperate with define*.

With Parendown, the code would be written as:

(define (f)
  (let ([x 1])
  #/let ([x (add1 x)])
  x))

which doesn't quite match the structure that I have in mind, and is kinda asymmetric.

sorawee avatar Jul 15 '21 08:07 sorawee

The goal is different though. Parendown wants to reduce the number of parentheses, but personally I don't care about that. I like parentheses. They are really useful for code navigation and editing.

The main goal of #> is to reduce the rightward drift, while still making the code lisp-y.

The metric I've considered Parendown to excel at is reducing the number of lines of code. It does that by reducing rightward drift, since too much indentation eventually crams things against the margin where the code wraps across a lot more lines. A pyramid of indentation has a size like O(n ^ 2), which Parendown's flattened style reduces to O(n).

I also consider Parendown to be handy for reducing the footprint of refactoring actions in small ways. A debug logging wrapper can be added around some code without affecting its indentation level or even requiring a ) to be written somewhere off in the distance.

I consider Parendown's effect on the number of parentheses to be negligible. I refer to #/ as a weak opening parenthesis, and it replaces one instance of a ( ) pair, so if a program had 2n parentheses in it before, the use of #/ brings it to somewhere between n and 2n. That's O(n) parentheses either way.

They are really useful for code navigation and editing.

For Parendown's weak opening parens to fit into a paren-aware editing experience seamlessly, the editor would need to be not only paren-aware but weak-opening-paren-aware. I think extending editors this way is conceivable, but I also sympathize with different approaches that can take advantage of existing editors just as they are.

I like your approach for that. It addresses rightward drift just as well but has more obvious indentation rules that existing tools will probably respect.

If it's feasible enough, I'd like to suggest (#>foo ...) rather than #>(foo ...) so that people who want to can use non-monospace fonts and/or indent with tabs instead of spaces.

rocketnia avatar Jul 24 '21 15:07 rocketnia

Can you elaborate on the following a bit more? I don't think I understand the point about tabs vs spaces and non-monospace fonts.

I'd like to suggest (#>foo ...) rather than #>(foo ...) so that people who want to can use non-monospace fonts and/or indent with tabs instead of spaces.

There are three reasons why I want to put it outside of parentheses:

  1. It can be implemented far more easily.
  2. It looks similar to existing reader macros. E.g., #'(...), #,(...)
  3. It signals to readers up front that the following parentheses do not contain all information, and that they should continue reading after that.

sorawee avatar Jul 24 '21 22:07 sorawee

  2. It looks similar to existing reader macros. E.g., #'(...), #,(...)

I don't really like this aspect of #'(...) and #,(...) either. I'd rather every indentation-affecting ( appear all the way on the far left side of the line it's on.

Honestly, this opinion of mine isn't all that strongly held, but I'll explain my motivations since you're asking.

In the typical Lisp indentation style, the indentation of each line depends on the column of the most recent unmatched (.

#>(blah
    #'(blah
        #hash((a . 1)
              (b . 2))))

This way, when it comes time to write )))) on the last line, we can zig-zag up the left edge and count the number of ( to be sure we get the number of ) right.

We only have to pay attention to the left edge; the rest is ignorable:

#>(b~~~
    #'(b~~~
        #hash(~~~~~~~
              (b . 2))))

But the "column" of a ( character isn't particularly well-defined unless we use a monospaced font and use only spaces for indentation. These conditions are pretty typical constraints of writing Lisp code, but I don't think they should be taken for granted when we have the opportunity to design a new language.

If we suppose someone uses a non-monospaced font and always uses a single tab as wide as "..." for each indentation level, the above code would likely be written like this:

#>(blah
...#'(blah
......#hash(
.........(a . 1)
.........(b . 2))))

(I'm using "..." because outside of a code block, GitHub allows runs of spaces to be collapsed into a single space.)

The zig-zagging experience there is a bit rough. The user has to scan a little bit into each line looking for unmatched (, because -- at least with the font I'm seeing here on GitHub -- just following the left edge in the typical way doesn't work:

#~~~~
...#'~~~
......#~~~
.........~~~
.........(b . 2))))

I'm stubborn enough that I might write the code like this just to recover the zig-zagging technique:

#>
(blah
...#'
...(
......blah
......#hash
......(
.........(a . 1)
.........(b . 2))))

But I feel it would be quite a bit more tolerable for every syntax to put its opening parens on the far left:

(##> blah
...(syntax (##>) blah
......(##hash
.........(a . 1)
.........(b . 2))))

Incidentally, this makes things like #hash look even more like s-expressions than they did before. This Lispier-than-Lisp style potentially makes it easy to refactor between a reader macro call like (##hash ...), a regular macro call, and a function call. In particular, this refactoring wouldn't change the indentation level.

  1. It can be implemented far more easily.

One way to approach the implementation of my examples above would be to modify the readtable entry for ( so that it peeks ahead to see if ## appears after it. If it does, then it reads that ##, reads the following symbol (in this case > or hash), and dispatches to another readtable entry based on that symbol. (Or, for a less generalized design, it can just peek for #>.)

But I should admit there are other complications I've encountered with this opening-paren-on-the-left-edge policy.

  • One drawback is that certain operations may not delimit their bodies as clearly as before without extra punctuation. For instance, I would like to apply this policy to string-like syntaxes, but the extent of the body in (##rx :hello) seems a little less clear to me than it does in #rx"hello".
  • The act of commenting out an s-expression, simply #;<expr> in Racket, ends up being done in two separate ways depending on whether it's parenthesized. That's probably a little more cognitive overhead than a quick commenting-out syntax should have.
  • Probably the quirkiest convolution I've arrived at is deciding that string escape sequences of the form \( \) violate this policy and that I'd therefore prefer to write some escape sequences backwards and others forwards, like (\ \).

I'm not sure how I feel about these edge cases. This uncertainty will probably lead me into unnecessarily inconsistent designs, which could indeed lead to some unnecessary complexity in the implementation.

rocketnia avatar Jul 25 '21 07:07 rocketnia