Began Work on Fixing Nested Syntax

Open peblair opened this issue 8 years ago • 12 comments

NOTE: This PR is not ready to merge!

I have begun working on fixing #519, but I have reached a bit of a stumbling block. I have gotten shift-reader.js to accept files with nested syntax, but something is getting garbled in between (details below), leading me to believe that I'm missing some step in the expander pipeline.

To explain where I'm at, here's what I know:

  • The reading of syntax templates (and, well, everything else) is done in src/shift-reader.js.
  • Syntax objects begin when src/shift-reader.js's advance function (master) returns a token of type LSYNTAX.
  • The contents of syntax objects are read like normal syntax; the only thing that makes them special is that the handler for LSYNTAX bundles all of the read pieces together into an immutable list.
  • The error mentioned in the bug report occurs because advance in src/shift-reader.js does not handle the ` character (or its escaped variant), so it delegates to the superclass (which also does not handle it).

Thus, I figured that fixing the error would just be a matter of adding the proper escapes to advance, allowing nested syntax objects to be read correctly. As far as reading the syntax goes, this all seems to work correctly. Note: This change is what the first commit of this PR reflects.

So, what happened? I tried to run this modified version on the following code:

syntax makeSayer = function(ctx) {
  let x = ctx.next().value;
  return #`syntax ${x} = function(ctx) { return \`console.log('hello world')\`; }`;
}

makeSayer hi

The result was the following error message:

/home/belph/git-belph/sweet.js/dist/syntax.js:174
        return s_701.addScope(scope_696, bindings_697, options_698);
                    ^

TypeError: Cannot read property 'addScope' of null
    at /home/belph/git-belph/sweet.js/dist/syntax.js:174:21
    at /home/belph/git-belph/sweet.js/node_modules/immutable/dist/immutable.js:3018:46
    at List.__iterate (/home/belph/git-belph/sweet.js/node_modules/immutable/dist/immutable.js:2208:13)
    at IndexedIterable.mappedSequence.__iterateUncached (/home/belph/git-belph/sweet.js/node_modules/immutable/dist/immutable.js:3017:23)
    at seqIterate (/home/belph/git-belph/sweet.js/node_modules/immutable/dist/immutable.js:606:16)
    at IndexedIterable.IndexedSeq.__iterate (/home/belph/git-belph/sweet.js/node_modules/immutable/dist/immutable.js:322:14)
    at IndexedIterable.mixin.toArray (/home/belph/git-belph/sweet.js/node_modules/immutable/dist/immutable.js:4260:23)
    at new List (/home/belph/git-belph/sweet.js/node_modules/immutable/dist/immutable.js:2067:62)
    at reify (/home/belph/git-belph/sweet.js/node_modules/immutable/dist/immutable.js:3572:37)
    at List.mixin.map (/home/belph/git-belph/sweet.js/node_modules/immutable/dist/immutable.js:4403:14)

Let's first dissect the syntax object in question as follows:

<ret> = #`syntax ${x} = function(ctx) { \`console.log('hello, world')\`; }`
                                        |------- <ret_nested> --------| 

Now, here's what I can tell happens:

  • read (my fork's copy) will tokenize <ret> and <ret_nested> just fine; <ret> is an immutable list with an entry at index-whatever that is another immutable list corresponding to <ret_nested>
  • Sometime between then and the return value of the call to deserializer.read(str) in loadForCompileTime (my fork's copy), the location of <ret_nested> in <ret> is set to null. Why? Not a clue.
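
One way to narrow down where the null appears is to dump the paths of all null entries at several points in the pipeline. The following is only a sketch: findNullPaths is a hypothetical helper, not part of sweet.js, and it assumes the reader output has first been converted to plain nested arrays (for example with Immutable's toJS()).

```javascript
// Hypothetical debugging helper (not part of sweet.js).
// Returns the index path of every null or undefined entry in a nested array.
function findNullPaths(tree, path = []) {
  const found = [];
  tree.forEach((item, index) => {
    const here = path.concat(index);
    if (item === null || item === undefined) {
      // Record where the missing entry sits, e.g. [3, 1].
      found.push(here);
    } else if (Array.isArray(item)) {
      // Descend into nested lists such as <ret_nested>.
      found.push(...findNullPaths(item, here));
    }
  });
  return found;
}
```

Calling it on the output of read and again on the output of deserializer.read(str) would show whether the entry is already null before deserialization or only afterwards.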

And that is where I'm at. Am I missing some step here? I feel like this is, if nothing else, a step in the right direction; nested syntax is a powerful feature to have, so I'd love to get this fixed!

peblair avatar Jun 02 '16 15:06 peblair

So I just had a crazy idea. Syntax templates are already syntactically special, so what if we just don't escape them? The beginning (#`) and ending (`) delimiters are always distinguished, so there is no syntactic reason to escape them. Just start a recursive read when you get to a #` inside a read of an outer syntax template.

Normal templates still need to be escaped of course.

syntax m = ctx => {
  return #`
    syntax n = ctx => {
      let s = \`foo\`;
      return #`
        syntax o = ctx => {
          return #`1`
        }
      `
    }
  `
}

This avoids the need to count backslashes and all that.
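
A minimal sketch of that recursive read, assuming a character-level scan over the source string. readSyntaxTemplate is a hypothetical name, this is not the actual shift-reader.js code, and ${...} interpolations are deliberately ignored here:

```javascript
// Hypothetical sketch, not sweet.js's real reader.
// Reads the body of a syntax template, starting just past its opening #`.
// Returns the collected parts and the index just past the closing backtick.
function readSyntaxTemplate(src, start) {
  const parts = [];
  let text = "";
  let i = start;
  while (i < src.length) {
    if (src[i] === "\\" && i + 1 < src.length) {
      // Escaped character (e.g. \` of a normal template): keep it verbatim.
      text += src[i] + src[i + 1];
      i += 2;
    } else if (src.startsWith("#`", i)) {
      // Opening delimiter of a nested syntax template: recurse.
      if (text) parts.push(text);
      text = "";
      const nested = readSyntaxTemplate(src, i + 2);
      parts.push(nested.parts);
      i = nested.end;
    } else if (src[i] === "`") {
      // An unescaped backtick closes the current template.
      if (text) parts.push(text);
      return { parts, end: i + 1 };
    } else {
      text += src[i];
      i += 1;
    }
  }
  throw new SyntaxError("unterminated syntax template");
}
```

Because an opening #` and a closing ` never look alike, the nesting depth falls out of the call stack and nothing has to be counted.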

This approach confuses existing syntax highlighters a little but since all the backticks must be balanced the confusion is localized to the syntax templates themselves.

Does this seem reasonable or am I missing something?

disnet avatar Jun 05 '16 17:06 disnet

@disnet That seems perfectly reasonable (it's the same approach Racket takes!)

peblair avatar Jun 06 '16 15:06 peblair

Oh! The trick is nested interpolations:

let s = #`
  let ss = #`foo \${42}`
`;

You'll need to count escapes for those I think.

disnet avatar Jun 06 '16 17:06 disnet

Okay, so a double escape looks like \\${42}, but what would a triple escape look like? \\\${42} or \\\\${42}? If it's the latter, what about using something like Racket's unquote-syntax shorthand (#,) to avoid an exponential blowup of escapes?

peblair avatar Jun 06 '16 17:06 peblair

How would #, help?

disnet avatar Jun 07 '16 15:06 disnet

Using the standard escaping for \, one can see that #, quickly becomes considerably less typing (and is much more descriptive of the unescaping depth):

Depth  Escape sequence with \   Escape sequence with #,
0      ${42}                    ${42}
1      \${42}                   #,${42}
2      \\\${42}                 #,#,${42}
3      \\\\\\\${42}             #,#,#,${42}
4      \\\\\\\\\\\\\\\${42}     #,#,#,#,${42}
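
The counts in the table follow from each extra level doubling the existing backslashes and adding one more, which gives 2^depth - 1 backslashes, while the proposed #, form grows by one prefix per level. A small sketch that reproduces the rows (the function names are made up for illustration, and #, is only the proposed shorthand, not existing sweet.js syntax):

```javascript
// Number of backslashes needed at a given nesting depth: 0, 1, 3, 7, 15, ...
function backslashCount(depth) {
  return 2 ** depth - 1;
}

// Escape sequence using standard backslash escaping.
function escapeWithBackslashes(depth, body = "${42}") {
  return "\\".repeat(backslashCount(depth)) + body;
}

// Escape sequence using the proposed #, shorthand: one prefix per level.
function escapeWithUnquote(depth, body = "${42}") {
  return "#,".repeat(depth) + body;
}

// Print the table rows for depths 0 through 4.
for (let depth = 0; depth <= 4; depth++) {
  console.log(depth, escapeWithBackslashes(depth), escapeWithUnquote(depth));
}
```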

peblair avatar Jun 07 '16 16:06 peblair

Ah! I see what you meant. Yeah do that 😄

disnet avatar Jun 07 '16 17:06 disnet

@disnet if we took care of syntax templates with a combination of reader macros and syntactic extensions, we could do this:

syntax m = ctx => {
  return #`
    syntax n = ctx => {
      let s = \`foo\`;
      return ${ #`
        syntax o = ctx => {
          return ${ #`1` }
        }
      ` }
    }
  `
}

This way, a syntaxTemplate macro could take care of converting nested term objects back into escaped syntax template literals. The code above would be transformed to:

syntax m = ctx => {
  return syntaxTemplate`
    syntax n = ctx => {
      let s = \`foo\`;
      return ${ syntaxTemplate`
        syntax o = ctx => {
          return ${ syntaxTemplate`1` }
        }
      ` }
    }
  `
}

Thus m gets transformed to:

syntax n = ctx => {
  let s = \`foo\`;
  return ${ #`
    syntax o = ctx => {
      return ${ #`1` }
    }
  ` }
}

This way we don't have to invent a special escape syntax for nested interpolations, and IMO it looks a lot cleaner.

gabejohnson avatar Aug 17 '16 13:08 gabejohnson

@gabejohnson humm...I'm not following. Could you expand (ha!) on what you meant?

disnet avatar Aug 18 '16 05:08 disnet

😝 My point is that if you require interpolation whenever you nest syntax templates, you can avoid any special syntax and just parallel the syntax already used by template literals:

syntax m = ctx => {
  return #`
    syntax n = ctx => {
      let s = \`foo\`;
      return ${ #`
        syntax o = ctx => {
          return ${ #`${ ctx.next().value }` }
        }
      ` }
    }
  `
}

vs.

syntax m = ctx => {
  return #`
    syntax n = ctx => {
      let s = \`foo\`;
      return #\`
        syntax o = ctx => {
          return #\`\\${ ctx.next().value }\`
        }
      \`
    }
  `
}

or

syntax m = ctx => {
  return #`
    syntax n = ctx => {
      let s = \`foo\`;
      return #`
        syntax o = ctx => {
          return #`\\${ ctx.next().value }`
        }
      `
    }
  `
}

While there are more characters at shallow nesting depths, you don't have to count slashes as you nest.

gabejohnson avatar Aug 18 '16 13:08 gabejohnson

@gabejohnson I like that, I think it feels the most natural.

jlongster avatar Aug 18 '16 14:08 jlongster

Something I just thought of. Which context does the innermost ctx refer to in my scheme?

gabejohnson avatar Aug 18 '16 15:08 gabejohnson