neverwinter.nim icon indicating copy to clipboard operation
neverwinter.nim copied to clipboard

scriptcomp: add underscore character as valid separator for integer, …

Open tinygiant98 opened this issue 11 months ago • 11 comments
trafficstars

Look! I found something else that no one asked for and probably don't want, but here it is anyway. I did this before in a much "prettier" way, but I lost that code somewhere, so here's where it ended up.

Rules for use:

  • underscore is the only valid separator character
  • can be used as many times as necessary in any number sequence
  • cannot be used in succession (only one underscore at a time)
  • cannot be the first character in an integer/float number sequence
  • can be the first character after 0x, 0b, or 0o
  • cannot be the last character in any number sequence
  • cannot be adjacent to the . in a float number sequence
  • for negative numbers, cannot be used between the negative sign and first character in the number sequence

Testing

Tested against all valid number forms:

    int a = 469752;
    int b = 469_752;
    int c = 4_6_9_7_5_2;

    int binary = 0b_0111_0010_1010_1111_1000;
    int octal = 0o_0162_5370;
    int hex = 0x_72_A_F8;

    float f = 469_752.469_752;

image

Changelog

Added

  • scriptcomp: added underscore as number separator

Licence

  • [x] I am licencing my change under the project's MIT licence, including all changes to GPL-3.0 licenced parts of the codebase.

tinygiant98 avatar Dec 03 '24 06:12 tinygiant98

Please add these test cases to make sure it didn't regress anything:

int _123 = 123;
int _0x123 = 0x123;
int _0123 = 0123;
string _0xabcd = "_0xabcd"

Assert(_123 == 123);
Assert(_0x123 == 0x123);
Assert(_0123 == 0123);
Assert(_0xabcd == "_0xabcd");

mtijanic avatar Dec 03 '24 07:12 mtijanic

Kind of confusing visually with constants already using underscores as separator

Daztek avatar Dec 03 '24 08:12 Daztek

_ is a common separator used by e.g. C#, Python and nim. C++ uses ' (e.g. 1'000'000) but that's likely because _ (or really any [A-Za-z_]) is used for literal suffixes such as 100_km

mtijanic avatar Dec 03 '24 08:12 mtijanic

Please add these test cases to make sure it didn't regress anything:

Added in d9e31dc.

tinygiant98 avatar Dec 03 '24 12:12 tinygiant98

Kind of confusing visually with constants already using underscores as separator

Yup, can be visually cluttering if they're used excessively. I used them somewhat excessively in the testing example above. It's really just a convenience feature for us old people that can't count 9 zeroes next to each other. I'm also not wedded to this change, just threw it in because it was easy.

tinygiant98 avatar Dec 03 '24 12:12 tinygiant98

@mtijanic I tried using the apostrophe as an additional separator and it quite literally froze my computer when I attempted to compile my test script. Didn't bother to dig deep into the why, but probably shouldn't be too difficult to add.

tinygiant98 avatar Dec 03 '24 12:12 tinygiant98

IMO, when in doubt, nwscript should follow the c# syntax. The languages are syntactically very close and c# was designed with some foresight. C++ often does certain things out of necessity and support for legacy cruft, not because they're a good idea.

mtijanic avatar Dec 03 '24 12:12 mtijanic

For someone that tracks changes here for a different purpose so take it fwiw, maybe there are others like the vscode language server, I'd second @mtijanic in hedging towards C# syntax. It'd be a pity if changes made, limited the use of tools like code formatters, etc over syntactical things.

In this case, if you look at Annex A I think you can simplify the rules to (for nwscript's purposes):

A numeric separator ('_') must be

  • preceded by a digit or an integer base literal prefix, and
  • followed by a digit

where 'digit' is of the appropriate base.

jd28 avatar Dec 12 '24 06:12 jd28

@jd28 Well, that's worded much more succinctly than my comment. However the rule is the same, I think, except allowing it to be used right after the 'x', 'b' or 'o'. It must be used within the number sequence and not adjacent to a decimal. I admit, however, when comparing it to the way it's written in your reference, my set of rules appears unnecessarily complex, even though it ended up in the (almost) same implementation.

tinygiant98 avatar Dec 12 '24 11:12 tinygiant98

At the risk of sounding grumpy, but do we really want to add this to the language spec? Is this a desirable feature? You're saying yourself it's niche. Can we back it with feature requests, or something actionable, before we extend the language?

This is a benign feature IMO, but the question should be asked and answered all the same.

Edit: I do not believe nwscript should pattern after C# just because C# does it. This is a discrete/separate language with a different usecase. Keeping it simple and "doing it only one way" is preferable to extending the feature matrix, unless required by a usecase.

niv avatar Dec 12 '24 11:12 niv

@niv Nope! No feature request or necessity involved in this one, more of a "just cuz I already had the code lying around". Or is it laying around?

Perfectly happy whether it goes in or not.

tinygiant98 avatar Dec 12 '24 12:12 tinygiant98