
Currency symbols don't get highlighted if not followed by a number

Open the-solipsist opened this issue 3 years ago • 4 comments

In this example, I would expect both instances of ₹ to get highlighted, but only the one followed by the number gets highlighted. So negative numbers with the minus sign after the currency symbol, or currency symbols following numbers, don't get highlighted. [image: https://user-images.githubusercontent.com/3080824/204282726-35c61df7-9bc0-404b-8041-032b06e41921.png]

the-solipsist avatar Nov 28 '22 12:11 the-solipsist

It's probably the word boundary "\b" at the end of the 'currencies' regular expression. I think I mostly expect currency symbols to be after the number.

I've added a test case to reproduce this. Not sure how to fix it easily though. You're welcome to have a go at it?
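To illustrate the "\b" hypothesis above, here is a minimal JavaScript sketch. The pattern is a hypothetical, simplified stand-in (a currency symbol followed by a word boundary), not the extension's actual 'currencies' rule, and the extension's grammar may be evaluated by a different regex engine, so treat it only as an illustration of the mechanism.

```javascript
// Hypothetical, simplified stand-in for the 'currencies' pattern:
// a currency symbol followed by a word boundary. Not the real grammar rule.
const symbolThenBoundary = /₹\b/;

// \b only matches between a word character (letter, digit, underscore)
// and a non-word character. "₹" is a non-word character, so \b after it
// succeeds only when a word character such as a digit comes next.
const samples = ["₹100", "-₹100", "₹-100", "100 ₹"];
const results = samples.map((s) => [s, symbolThenBoundary.test(s)]);

for (const [text, matched] of results) {
  console.log(JSON.stringify(text), matched ? "matches" : "does not match");
}
```

Under this assumption, the symbol matches when a digit follows it, and fails when a minus sign or the end of the line follows it, which is consistent with the behaviour reported in the issue.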



mhansen avatar Dec 02 '22 11:12 mhansen

Here's a robust pattern that allows for rates in the form of @@ / @, and for comments using ; / # / *, as well. It tries to follow hledger's logic as far as I can tell.

In .NET/C#: (?<=\S+[ ]{2,}@?[^;#*\n]*)(""[^;\\\n]+""|[^-+.,@*;\t\n ""{}=\d]+)

In JS: (?<=\S+[ ]{2,}@?[^;#*\n]*)("[^;\\\n]+"|[^-+.,@*;\t\n "{}=\d]+)

Explanation:

  1. Lookbehind (?<=…) to check that there are:
     a. one or more non-whitespace characters \S+
     b. followed by at least 2 blank spaces [ ]{2,}
     c. optionally followed by an @ symbol @?
     d. followed by zero or more characters that aren't comment markers [^;#*\n]*
  2. Then match either:
     a. quoted text without prohibited characters: "[^;\\\n]+"
     b. or (|) a simple commodity symbol without any digits or other prohibited characters: [^-+.,@*;\t\n "{}=\d]+
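The breakdown above can be tried out with a short JavaScript sketch. The pattern is the JS version quoted above with only the global flag added; the helper name `findCommodities` and the sample posting lines are made up for illustration, and the sketch assumes a runtime that supports variable-length lookbehind (recent Node.js or a Chromium-based browser).

```javascript
// The JS pattern quoted above, copied verbatim, with the global flag added
// so that every commodity on a line is reported.
const commodityPattern =
  /(?<=\S+[ ]{2,}@?[^;#*\n]*)("[^;\\\n]+"|[^-+.,@*;\t\n "{}=\d]+)/g;

// Hypothetical helper: return every commodity-like match in one journal line.
function findCommodities(line) {
  return Array.from(line.matchAll(commodityPattern), (m) => m[0]);
}

// Made-up posting lines; account and amount are separated by 2+ spaces.
const lines = [
  "    expenses:food  ₹100",
  "    assets:cash    ₹-100",
  "    assets:bank  100 ₹",
  "    assets:stock  10 AAPL @ $150",
  '    assets:misc  3 "chocolate bar"',
  "    expenses:food  ₹100  ; paid in cash",
];

for (const line of lines) {
  console.log(JSON.stringify(line), "->", findCommodities(line));
}
```

On these samples the symbol is found whether it sits before the number, before a minus sign, or after the number, and the text after the ; comment marker is not matched. Note that the sample comment uses single spaces only; a comment containing two consecutive spaces may behave differently.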

I tried the JS version on my journal in VS Code via the find bar, and it runs within a split second. I'm not sure how to time it to the millisecond.

You could translate it to the appropriate regex engine.

https://regex101.com/r/fGhYrY/6

the-solipsist avatar Dec 03 '22 15:12 the-solipsist

Hmm. There is a philosophical question here: should we match only known, allowed currencies, or any string in the right place? Matching known currencies would catch more typos, which is one thing that syntax highlighting is useful for. On the other hand, hledger accepts arbitrary currencies, I suppose.

Could you explain / break down your regex into parts with comments? It’s very long

On Sun, 4 Dec 2022 at 02:36, Pranesh Prakash @.***> wrote:

Here's a robust pattern, that allows for @@ and @ as well: /(?<=\S+[ ]{2,}@?.)("[^;\\n]+"|[^@!,.:-\s\d]+)|(?<=\S[ ]{2,}@?.("[^;\\n]+"|[^@!,.:-\s\d]+)(?=\s?-?\d+))/g

https://regex101.com/r/2X1aWc/1


mhansen avatar Dec 03 '22 18:12 mhansen

Could you explain / break down your regex into parts with comments? It’s very long

I edited my reply to add an explanation: https://github.com/mhansen/hledger-vscode/issues/264#issuecomment-1336184272

the-solipsist avatar Dec 03 '22 18:12 the-solipsist