syntax icon indicating copy to clipboard operation
syntax copied to clipboard

`rescript format` doesn't escaped bytes in strings correctly

Open kevinbarabash opened this issue 4 years ago • 6 comments

Given the following file, test.ml:

let square x = x * x
let a = "\000\000\246\255\247\255"

running ./darwin/bsc.exe test.ml results in:

// Generated by ReScript, PLEASE EDIT WITH CARE
'use strict';


function square(x) {
  return Math.imul(x, x);
}

var a = "\0\0\xf6\xff\xf7\xff";

exports.square = square;
exports.a = a;
/* No side effect */

while ./rescript format test.ml results in:

let square = x => x * x
let a = "\000\000\246\255\247\255"

"\000\000\246\255\247\255" is not a valid string in ReScript and should be "\0\0\xf6\xff\xf7\xff" instead.

kevinbarabash avatar Nov 28 '21 21:11 kevinbarabash

I tried calling Js_dump_string.escape_to_string from res_printer.ml, but napkin doesn't have access to anything from the core module since it appears first in bsc_libs in ninja.js. This is the error message I get when trying to build:

FAILED: ../darwin/bsc.exe
File "_none_", line 1:
Error: No implementations provided for the following modules:
         Js_dump_string referenced from napkin/napkin.cmxa(Res_printer)
rescript: [2/979] ../darwin/rescript.exe

I tried moving napkin to the bottom of bsc_libs, but got a different error:

FAILED: ../darwin/bsc.exe
File "_none_", line 1:
Error: No implementations provided for the following modules:
         Reactjs_jsx_ppx_v3 referenced from frontend/frontend.cmxa(Ppx_entry)
FAILED: cannot make progress due to previous errors.

Unfortunately, frontend uses Reactjs_jsx_ppx_v3 which lives in napkin so I'd have to move frontend to be below napkin which leads to more errors.

Is there a better way to give napkin/res_printer.ml access to Js_dump_string.escape_to_string?

kevinbarabash avatar Nov 28 '21 22:11 kevinbarabash

I ended up copying escape_to_string over to res_printer.ml. Unfortunately, "\000\000\246\255\247\255" hasn't been converted to bytes yet so calling escape_to_string does nothing. Looks like parseStringLiteral from res_core.ml might do the trick.

kevinbarabash avatar Nov 28 '21 23:11 kevinbarabash

I was able to get things working for ./rescript format, but copying a bunch of code into res_printer.ml isn't a viable solution. I still haven't figured out does the work of what code path leads ./darwin/bsc.exe get this right.

kevinbarabash avatar Nov 28 '21 23:11 kevinbarabash

This seems to be a purely a syntax issue, it should be fixed on the syntax repo

bobzhang avatar Dec 04 '21 03:12 bobzhang

Looking at parseStringLiteral in res_core.ml, it parses \123 as decimal and supports \o but in JS \123 is octal and \o isn't supported. I can change the string parser to match JS strings. Given that octal isn't supported in "strict mode" in JS, is octal something we want to support in ReScript?

kevinbarabash avatar Dec 04 '21 22:12 kevinbarabash

I did some more experimenting with JS octal escapes, and they're going to be annoying to parse because they can be variable number of characters, for instance both "\70" and "\125" are value (they correspond to "8" and "U" respectively).

kevinbarabash avatar Dec 05 '21 00:12 kevinbarabash

The rescript-lang/syntax repo is obsolete and will be archived soon. If this issue is still relevant, please reopen in the compiler repo (https://github.com/rescript-lang/rescript-compiler) or comment here to ask for it to be moved. Thank you for your contributions.

stale[bot] avatar May 28 '23 19:05 stale[bot]