Normalize numeric literals

Open MichaReiser opened this issue 4 years ago • 5 comments

Prettier normalizes numeric literals, e.g. by removing unnecessary trailing zeros, but Rome doesn't. Playground

Input

// Add 0
.1
// Remove .
1.

// B -> b, O -> o, X -> x
0B1;
0O1;
0X1;

// X -> x, HEX digits to lowercase
0X123abcdef456ABCDEF

// E -> e
1.1_0_1E1;
// Remove +
1e+1;

// Remove .
1.e1;
// To 0.1e1
.1e1;

// Remove leading 0
1.1e0010
// Remove +, add leading 0
.1e+0010
// Add leading 0, remove unnecessary leading 0
.1e-0010

// Simplify to 0.5
0.5e0;
0.5e00;
0.5e+0;
0.5e+00;
0.5e-0;
0.5e-00;

// Trim trailing zeros
1.00500;

// Add 0
.1_1;
// A 
0xa_1;
// X -> x, A -> a
0XA_1;

Prettier

// Add 0
0.1;
// Remove .
1;

// B -> b, O -> o, X -> x
0b1;
0o1;
0x1;

// X -> x, HEX digits to lowercase
0x123abcdef456abcdef;

// E -> e
1.1_0_1e1;
// Remove +
1e1;

// Remove .
1e1;
// To 0.1e1
0.1e1;

// Remove leading 0
1.1e10;
// Remove +, add leading 0
0.1e10;
// Add leading 0, remove unnecessary leading 0
0.1e-10;

// Simplify to 0.5
0.5;
0.5;
0.5;
0.5;
0.5;
0.5;

// Trim trailing zeros
1.005;

// Add 0
0.1_1;
// A
0xa_1;
// X -> x, A -> a
0xa_1;

Rome

// Add 0
.1;
// Remove .
1.;

// B -> b, O -> o, X -> x
0B1;
0O1;
0X1;

// X -> x, HEX digits to lowercase
0X123abcdef456ABCDEF;

// E -> e
1.1_0_1E1;
// Remove +
1e+1;

// Remove .
1.e1;
// To 0.1e1
.1e1;

// Remove leading 0
1.1e0010;
// Remove +, add leading 0
.1e+0010;
// Add leading 0, remove unnecessary leading 0
.1e-0010;

// Simplify to 0.5
0.5e0;
0.5e00;
0.5e+0;
0.5e+00;
0.5e-0;
0.5e-00;

// Trim trailing zeros
1.00500;

// Add 0
.1_1;
// A
0xa_1;
// X -> x, A -> a
0XA_1;

Expected

Rome should normalize binary, octal, and hex prefixes as well as exponential formats the same way Prettier does.

MichaReiser avatar Apr 14 '22 10:04 MichaReiser

This normalization is actually number-to-string from the spec.

I think we can use https://crates.io/crates/ryu-js, it's from https://github.com/boa-dev/boa

Ryū-js is a fork of the ryu crate adjusted to comply to the ECMAScript number-to-string algorithm.

Boshen avatar Apr 14 '22 10:04 Boshen

This normalization is actually number-to-string from the spec.

I think we can use https://crates.io/crates/ryu-js, it's from https://github.com/boa-dev/boa

Ryū-js is a fork of the ryu crate adjusted to comply to the ECMAScript number-to-string algorithm.

This library accepts a float and converts it to a string. The formatter, on the other hand, operates on a string input. The library also looks rather complicated; I was hoping we would get away with something less sophisticated. For example, all that Prettier does is run some regular expressions:

  return (
    rawNumber
      .toLowerCase()
      // Remove unnecessary plus and zeroes from scientific notation.
      .replace(/^([+-]?[\d.]+e)(?:\+|(-))?0*(\d)/, "$1$2$3")
      // Remove unnecessary scientific notation (1e0).
      .replace(/^([+-]?[\d.]+)e[+-]?0+$/, "$1")
      // Make sure numbers always start with a digit.
      .replace(/^([+-])?\./, "$10.")
      // Remove extraneous trailing decimal zeroes.
      .replace(/(\.\d+?)0+(?=e|$)/, "$1")
      // Remove trailing dot.
      .replace(/\.(?=e|$)/, "")
  );
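
To make the snippet easier to try out, here is a minimal runnable sketch that wraps the same replace chain in a function and feeds it a few of the inputs from this issue. The function name `normalizeNumber` is an assumption made for illustration; only the replace chain itself comes from the snippet above.

```javascript
// Hypothetical wrapper name; the replace chain is the one quoted above.
function normalizeNumber(rawNumber) {
  return (
    rawNumber
      .toLowerCase()
      // Remove unnecessary plus and zeroes from scientific notation.
      .replace(/^([+-]?[\d.]+e)(?:\+|(-))?0*(\d)/, "$1$2$3")
      // Remove unnecessary scientific notation (1e0).
      .replace(/^([+-]?[\d.]+)e[+-]?0+$/, "$1")
      // Make sure numbers always start with a digit.
      .replace(/^([+-])?\./, "$10.")
      // Remove extraneous trailing decimal zeroes.
      .replace(/(\.\d+?)0+(?=e|$)/, "$1")
      // Remove trailing dot.
      .replace(/\.(?=e|$)/, "")
  );
}

console.log(normalizeNumber(".1e+0010")); // 0.1e10
console.log(normalizeNumber("0.5e-00")); // 0.5
console.log(normalizeNumber("1.00500")); // 1.005
console.log(normalizeNumber("0XA_1")); // 0xa_1
```

One detail worth noting: the third pattern has a single capture group, so the replacement `"$10."` is read as `$1` followed by a literal `0.` rather than as a reference to a tenth group.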

MichaReiser avatar Apr 14 '22 11:04 MichaReiser

Do you think we could achieve that with only string manipulation? We can't use regex unfortunately (it's better not to)

ematipico avatar Apr 14 '22 11:04 ematipico

Hmm... number parsing hasn't been used anywhere yet, maybe this is the time? I'm afraid that any ad hoc algorithm will eventually end up similar to the number-to-string algorithm.

https://github.com/rome/tools/blob/a4702bacc973306453a01be4ad04a2c03104d376/crates/rome_js_syntax/src/expr_ext.rs#L366-L370

Boshen avatar Apr 14 '22 11:04 Boshen

Do you think we could achieve that with only string manipulation? We can't use regex unfortunately (it's better not to)

Sure. A RegEx is just a state machine. The only question is how much code it requires. But I haven't looked into the implementation yet. I'm only concerned with finding all discrepancies.
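
To give a rough idea of the amount of code involved, here is a sketch of the same normalization using only index lookups and slicing, with no regular expressions. It is written in JavaScript for easy comparison with the Prettier snippet and is not Rome's implementation; all function names are hypothetical. It is meant to cover the inputs listed in this issue and is not guaranteed to agree with Prettier on every literal.

```javascript
// Sketch only: all names here are hypothetical, not Rome or Prettier APIs.

// True when `s` is non-empty and every character is an ASCII digit.
function isAllDigits(s) {
  if (s.length === 0) return false;
  for (const ch of s) {
    if (ch < "0" || ch > "9") return false;
  }
  return true;
}

function normalizeMantissa(mantissa) {
  // ".1" -> "0.1"
  if (mantissa.startsWith(".")) mantissa = "0" + mantissa;

  const dot = mantissa.indexOf(".");
  if (dot === -1) return mantissa;

  const fraction = mantissa.slice(dot + 1);
  // "1." -> "1"
  if (fraction.length === 0) return mantissa.slice(0, dot);
  // Fractions containing numeric separators are left untouched.
  if (!isAllDigits(fraction)) return mantissa;

  // "1.00500" -> "1.005", keeping at least one fractional digit.
  let end = fraction.length;
  while (end > 1 && fraction[end - 1] === "0") end--;
  return mantissa.slice(0, dot + 1) + fraction.slice(0, end);
}

function normalizeExponent(exponent) {
  let sign = "";
  let digits = exponent;
  if (digits.startsWith("+")) {
    digits = digits.slice(1); // A "+" sign is redundant.
  } else if (digits.startsWith("-")) {
    sign = "-";
    digits = digits.slice(1);
  }

  // Strip leading zeros, but only while another digit follows.
  let start = 0;
  while (
    start + 1 < digits.length &&
    digits[start] === "0" &&
    isAllDigits(digits[start + 1])
  ) {
    start++;
  }
  digits = digits.slice(start);

  // "e0", "e+00", "e-0": the exponent has no effect, so drop it.
  if (digits === "0") return "";
  return "e" + sign + digits;
}

function normalizeWithoutRegex(raw) {
  const text = raw.toLowerCase();
  // Binary, octal, and hex literals only need lowercasing.
  if (text.startsWith("0b") || text.startsWith("0o") || text.startsWith("0x")) {
    return text;
  }
  const eIndex = text.indexOf("e");
  if (eIndex === -1) return normalizeMantissa(text);
  return (
    normalizeMantissa(text.slice(0, eIndex)) +
    normalizeExponent(text.slice(eIndex + 1))
  );
}

console.log(normalizeWithoutRegex(".1e-0010")); // 0.1e-10
console.log(normalizeWithoutRegex("1.00500")); // 1.005
```

Splitting the literal into mantissa and exponent first keeps each helper small, and checking the prefix up front avoids mistaking the hex digit `e` for an exponent marker.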

@Boshen I'm not sure if this is implementing the toString algorithm. For example:

0x01.toString() 

Returns 1 but Prettier prints it as 0x01 because that's closest to what the user wrote.
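
The difference shows up on other inputs as well. The sketch below parses a few literals from this issue to a number and prints them with the built-in number-to-string conversion, next to the Prettier output listed above. The helper name `viaNumberToString` is made up for this illustration.

```javascript
// Hypothetical helper: parse the literal text, then print the resulting
// value with the built-in number-to-string conversion.
function viaNumberToString(literal) {
  return Number(literal).toString();
}

// Pairs of [input literal, Prettier output as reported in this issue].
const samples = [
  ["0x01", "0x01"],
  ["1.1e0010", "1.1e10"],
  [".1e-0010", "0.1e-10"],
];

for (const [input, prettier] of samples) {
  console.log(
    `${input}: Prettier prints ${prettier}, number-to-string gives ${viaNumberToString(input)}`
  );
}
// 0x01: Prettier prints 0x01, number-to-string gives 1
// 1.1e0010: Prettier prints 1.1e10, number-to-string gives 11000000000
// .1e-0010: Prettier prints 0.1e-10, number-to-string gives 1e-11
```

Number-to-string works on the parsed value, so the radix prefix and the user's choice of notation are lost, whereas the text-based normalization only tidies the spelling the user wrote.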

MichaReiser avatar Apr 14 '22 12:04 MichaReiser

Duplicate of #3294

MichaReiser avatar Oct 11 '22 07:10 MichaReiser