Packages icon indicating copy to clipboard operation
Packages copied to clipboard

[AppleScript] sregex compatibility

Open michaelblyons opened this issue 10 months ago • 6 comments

See #481

There's lots more that could be done to improve this package, but I've killed the lookbehinds and added a couple of subscopes. Also, there's a test file for the first time.

michaelblyons avatar Feb 13 '25 21:02 michaelblyons

Thanks for working on this.

I still have some left-overs from rainy days from autumn with what I started with while going through this package. Should I send you a zip via Discord?

jrappen avatar Feb 13 '25 21:02 jrappen

Okay, now I'm doing more work on it.

michaelblyons avatar Feb 14 '25 17:02 michaelblyons

Not a full rebuild from spec and there are definitely some broken pieces, but it's lots better. Has some good building contexts if someone wants to push forward with Apple's crazy blocks.

michaelblyons avatar Feb 14 '25 19:02 michaelblyons

I noticed a few things:

  • Language constants (as an example, there are many others), are they actually \b(?i:false)\b or rather \b(?:[Ff]alse|FALSE)\b? Seemed odd at first glance over.
  • If you match a group of words, sometimes you use (one two), sometimes (one\s+two).
  • You could make use of:
    variables:
      leading_wspace: (?:^\s*)
    
    as you use it a lot.
  • You can just indent the lines you test a bit, makes writing the assertions on the following lines easier. Like:
      test
    # ^^^^ scope
    
    instead of:
    test
    # <- scope
    #^^^ scope
    

jrappen avatar Feb 14 '25 21:02 jrappen

  • Language constants (as an example, there are many others), are they actually \b(?i:false)\b or rather \b(?:[Ff]alse|FALSE)\b? Seemed odd at first glance over.

Most of those were inherited. I guess I could check them against the spec, but I did see a line in the spec saying that almost everything is case insensitive.

  • If you match a group of words, sometimes you use (one two), sometimes (one\s+two).

Most of this is inherited, too. The only part I remember with a space is in the raw Unicode part, which isn't fully spec-conforming yet anyway.

  • You could make use of:
    variables:
      leading_wspace: (?:^\s*)
    

I guess I could, but {{leading_wspace}} is longer than ^\s*.

  • You can just indent the lines you test a bit, makes writing the assertions on the following lines easier. Like:

      test
    # ^^^^ scope
    

    instead of:

    test
    # <- scope
    #^^^ scope
    

Good thought. Will change.

michaelblyons avatar Feb 15 '25 15:02 michaelblyons

  • Language constants (as an example, there are many others), are they actually \b(?i:false)\b or rather \b(?:[Ff]alse|FALSE)\b? Seemed odd at first glance over.

According to the spec, these are actually supposed to be lower case. Now the question: Do I follow the spec, or do I cajole a Mac user to try out mixed case to see if the case-insens version was determined experimentally?

michaelblyons avatar Feb 17 '25 05:02 michaelblyons

This already looks better than before. Are you going to add anything to this or can we review?

jrappen avatar Mar 13 '25 08:03 jrappen

I originally set out to reach sregex compatibility. I got a little carried away, but my aim was not to produce a perfect spec-conforming document.

There is specific handling for certain program's arguments in there. And I cannot test as I do not have MacOS.

Review away. 🙂

michaelblyons avatar Mar 13 '25 08:03 michaelblyons

image

According to the docs you linked, keywords are lower-case while identifiers are case-insensitive. There's a table with all reserved words, maybe double-check against that with the suggested changes as I found a (?i:false) somewhere in there and stopped checking.

jrappen avatar Mar 20 '25 14:03 jrappen

  • [x] If you use pop: 1 in the added lines, you should bump to v2 and replace remaining pop: true from unchanged lines. If tests fail, there's a list of differences between v1 and v2 at https://www.sublimetext.com/docs/syntax.html#compatibility, but you knew that.
  • [x] Were you going to add some of the completions I sent you? Some of the files are empty, and the ones that can be used might need a fix in trigger to remove spaces where there are some. But I tried to add as much to details as I could find from the offical docs, so that might be helpful. I can do it myself in a later PR too, whatever works for you.

Otherwise LGTM.

jrappen avatar Mar 24 '25 10:03 jrappen