sweet-core
sweet-core copied to clipboard
Began Work on Fixing Nested Syntax
NOTE: This PR is not ready to merge!
I have begun working on fixing #519, but I have reached a bit of a stumbling block. I have gotten shift-reader.js
to accept files with nested syntax, but something is getting garbled in between (details below), leading me to believe that I'm missing some step in the expander pipeline.
To explain where I'm at, here's what I know:
- The reading of syntax templates (and, well, everything else) is done in
src/shift-reader.js
. - Syntax objects begin when
src/shift-reader.js
'sadvance
function (master) returns a token of typeLSYNTAX
. - The contents of syntax objects are read like normal syntax; the only thing that makes them special
is that the handler for
LSYNTAX
bundles all of the read pieces together into an immutable list. - The reason for the error that is mentioned in the bug report is because
advance
insrc/shift-reader.js
does not handle the"
"` character (or it's escaped variant), so it delegates to the superclass (which also does not handle it).
Thus, I figured that fixing the error would just be a matter of adding the proper escapes to advance
, allowing nested syntax objects to be read correctly. As far as reading the syntax goes, this all seems to work correctly. Note: This change is what the first commit of this PR reflects.
So, what happened? I tried to run this modified version on the following code:
syntax makeSayer = function(ctx) {
let x = ctx.next().value;
return #`syntax ${x} = function(ctx) { return \`console.log('hello world')\`; }`;
}
makeSayer hi
The result was the following error messsage:
/home/belph/git-belph/sweet.js/dist/syntax.js:174
return s_701.addScope(scope_696, bindings_697, options_698);
^
TypeError: Cannot read property 'addScope' of null
at /home/belph/git-belph/sweet.js/dist/syntax.js:174:21
at /home/belph/git-belph/sweet.js/node_modules/immutable/dist/immutable.js:3018:46
at List.__iterate (/home/belph/git-belph/sweet.js/node_modules/immutable/dist/immutable.js:2208:13)
at IndexedIterable.mappedSequence.__iterateUncached (/home/belph/git-belph/sweet.js/node_modules/immutable/dist/immutable.js:3017:23)
at seqIterate (/home/belph/git-belph/sweet.js/node_modules/immutable/dist/immutable.js:606:16)
at IndexedIterable.IndexedSeq.__iterate (/home/belph/git-belph/sweet.js/node_modules/immutable/dist/immutable.js:322:14)
at IndexedIterable.mixin.toArray (/home/belph/git-belph/sweet.js/node_modules/immutable/dist/immutable.js:4260:23)
at new List (/home/belph/git-belph/sweet.js/node_modules/immutable/dist/immutable.js:2067:62)
at reify (/home/belph/git-belph/sweet.js/node_modules/immutable/dist/immutable.js:3572:37)
at List.mixin.map (/home/belph/git-belph/sweet.js/node_modules/immutable/dist/immutable.js:4403:14)
Let's first dissect the syntax object in question as follows:
<ret> = #`syntax ${x} = function(ctx) { \`console.log('hello, world')\`; }`
|------- <ret_nested> --------|
Now, here's what I can tell happens:
-
read
(my fork's copy) will tokenize<ret>
and<ret_nested>
just fine;<ret>
is an immutable list with an entry at index-whatever that is another immutable list corresponding to <ret_nested> - Sometime between then and the return value of the call to
deserializer.read(str)
inloadForCompileTime
(my fork's copy), the location of<ret_nested>
in<ret>
is set tonull
. Why? Not a clue.
And that is where I'm at. Am I missing some step here? I feel like this is, if nothing else a step in the right direction; nested syntax is a powerful feature to have, so I'd love to get this fixed!
So I just had a crazy idea. Syntax template are already syntactically special so what if we just don't escape them? Beginning (#``) and ending (```) delimiters are always distinguished so there is no syntactic reason to escape them. Just start a recursive read when you get to a
#`` inside a read of an outer syntax template.
Normal templates still need to be escaped of course.
syntax m = ctx => {
return #`
syntax n = ctx => {
let s = \`foo\`;
return #`
syntax o = ctx => {
return #`1`
}
`
}
`
}
This avoids the need to count backslashes and all that.
This approach confuses existing syntax highlighters a little but since all the backticks must be balanced the confusion is localized to the syntax templates themselves.
Does this seem reasonable or am I missing something?
@disnet That seems perfectly reasonable (it's the same approach Racket takes!)
Oh! The trick is nested interpolations:
let s = #`
let ss = #`foo \${42}`
`;
You'll need to count escapes for those I think.
Okay, so a double escape look like \\${42}
, but what would a triple escape look like? \\\${42}
or \\\\${42}
? If it's the latter, what about using something like Racket's unquote-syntax
shorthand (#,
) to avoid an exponential blowup of escapes?
How would #,
help?
Using the standard escaping for \
, one can see that #,
quickly becomes considerably less typing (and is much more descriptive of the unescaping depth)
Depth | Escape sequence with \ |
Escape Sequence with #, |
---|---|---|
0 | ${42} |
${42} |
1 | \${42} |
#,${42} |
2 | \\\${42} |
#,#,${42} |
3 | \\\\\\\${42} |
#,#,#,${42} |
4 | \\\\\\\\\\\\\\\${42} |
#,#,#,#,${42} |
Ah! I see what you meant. Yeah do that 😄
@disnet if we took care of syntax templates with a combination of reader macros and syntactic extensions, we could do this:
syntax m = ctx => {
return #`
syntax n = ctx => {
let s = \`foo\`;
return ${ #`
syntax o = ctx => {
return ${ #`1` }
}
` }
}
`
}
This way a syntaxTemplate
macro could take care of converting nested term objects back into escaped syntax template literals and would be transformed to:
syntax m = ctx => {
return syntaxTemplate`
syntax n = ctx => {
let s = \`foo\`;
return ${ syntaxTemplate`
syntax o = ctx => {
return ${ syntaxTemplate`1` }
}
` }
}
`
}
Thus m
gets transformed to:
syntax n = ctx => {
let s = \`foo\`;
return ${ #`
syntax o = ctx => {
return ${ #`1` }
}
` }
}
This way we don't have to invent a special escape syntax for nested interpolations and IMO looks a lot cleaner.
@gabejohnson humm...I'm not following. Could you expand (ha!) on what you meant?
😝 My point is that if you just require interpolating if you're nesting syntax templates you can avoid any special syntax and just parallel the syntax used by template literals already:
syntax m = ctx => {
return #`
syntax n = ctx => {
let s = \`foo\`;
return ${ #`
syntax o = ctx => {
return ${ #`${ ctx.next().value }` }
}
` }
}
`
}
vs.
syntax m = ctx => {
return #`
syntax n = ctx => {
let s = \`foo\`;
return #\`
syntax o = ctx => {
return #\`\\${ ctx.next().value }\`
}
\`
}
`
}
or
syntax m = ctx => {
return #`
syntax n = ctx => {
let s = \`foo\`;
return #`
syntax o = ctx => {
return #`\\${ ctx.next().value }`
}
`
}
`
}
While there are more characters at shallow nesting depths, you don't have to count slashes as you nest.
@gabejohnson I like that, I think it feels the most natural.
Something I just thought of. Which context does the innermost ctx
refer to in my scheme?