`rescript format` doesn't escaped bytes in strings correctly
Given the following file, test.ml:
let square x = x * x
let a = "\000\000\246\255\247\255"
running ./darwin/bsc.exe test.ml results in:
// Generated by ReScript, PLEASE EDIT WITH CARE
'use strict';
function square(x) {
return Math.imul(x, x);
}
var a = "\0\0\xf6\xff\xf7\xff";
exports.square = square;
exports.a = a;
/* No side effect */
while ./rescript format test.ml results in:
let square = x => x * x
let a = "\000\000\246\255\247\255"
"\000\000\246\255\247\255" is not a valid string in ReScript and should be "\0\0\xf6\xff\xf7\xff" instead.
I tried calling Js_dump_string.escape_to_string from res_printer.ml, but napkin doesn't have access to anything from the core module since it appears first in bsc_libs in ninja.js. This is the error message I get when trying to build:
FAILED: ../darwin/bsc.exe
File "_none_", line 1:
Error: No implementations provided for the following modules:
Js_dump_string referenced from napkin/napkin.cmxa(Res_printer)
rescript: [2/979] ../darwin/rescript.exe
I tried moving napkin to the bottom of bsc_libs, but got a different error:
FAILED: ../darwin/bsc.exe
File "_none_", line 1:
Error: No implementations provided for the following modules:
Reactjs_jsx_ppx_v3 referenced from frontend/frontend.cmxa(Ppx_entry)
FAILED: cannot make progress due to previous errors.
Unfortunately, frontend uses Reactjs_jsx_ppx_v3 which lives in napkin so I'd have to move frontend to be below napkin which leads to more errors.
Is there a better way to give napkin/res_printer.ml access to Js_dump_string.escape_to_string?
I ended up copying escape_to_string over to res_printer.ml. Unfortunately, "\000\000\246\255\247\255" hasn't been converted to bytes yet so calling escape_to_string does nothing. Looks like parseStringLiteral from res_core.ml might do the trick.
I was able to get things working for ./rescript format, but copying a bunch of code into res_printer.ml isn't a viable solution. I still haven't figured out does the work of what code path leads ./darwin/bsc.exe get this right.
This seems to be a purely a syntax issue, it should be fixed on the syntax repo
Looking at parseStringLiteral in res_core.ml, it parses \123 as decimal and supports \o but in JS \123 is octal and \o isn't supported. I can change the string parser to match JS strings. Given that octal isn't supported in "strict mode" in JS, is octal something we want to support in ReScript?
I did some more experimenting with JS octal escapes, and they're going to be annoying to parse because they can be variable number of characters, for instance both "\70" and "\125" are value (they correspond to "8" and "U" respectively).
The rescript-lang/syntax repo is obsolete and will be archived soon. If this issue is still relevant, please reopen in the compiler repo (https://github.com/rescript-lang/rescript-compiler) or comment here to ask for it to be moved. Thank you for your contributions.