Currency symbols don't get highlighted if not followed by a number
In this example, I would expect both instances of ₹ to get highlighted, but only the one followed by the number gets highlighted. So negative numbers with the minus sign after the currency symbol, or currency symbols following numbers don't get highlighted.

It's probably the word boundary "\b" at the end of the 'currencies' regular expression. I think I mostly expect currency symbols to be after the number.
I've added a test case to reproduce this. Not sure how to fix it easily though. You're welcome to have a go at it?
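To illustrate the suspected "\b" problem, here is a minimal sketch. The pattern below is a hypothetical, simplified stand-in (the actual 'currencies' expression in the grammar may look different); it only assumes a currency symbol followed by a trailing word boundary.

```javascript
// Hypothetical stand-in for the grammar's 'currencies' pattern.
// Assumption: the real expression ends in \b, as suspected above.
const hypotheticalCurrencies = /[$€£₹]\b/;

// In JavaScript, \b matches only between a word character ([A-Za-z0-9_])
// and a non-word character (or the string edge). ₹ is a non-word character,
// so a boundary exists right after it only when a word character, such as
// a digit, follows immediately.
function isHighlighted(text) {
  return hypotheticalCurrencies.test(text);
}

console.log(isHighlighted("₹100"));  // true: a digit follows the symbol
console.log(isHighlighted("₹-100")); // false: "-" follows the symbol
console.log(isHighlighted("100 ₹")); // false: nothing follows the symbol
```

If the real pattern behaves like this sketch, both failing cases from the report come from the same cause: there is no word character directly after the symbol.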
On Mon, 28 Nov 2022 at 23:55, Pranesh Prakash @.***> wrote:
In this example, I would expect both instances of ₹ to get highlighted, but only the one followed by the number gets highlighted. So negative numbers with the minus sign after the currency symbol, or currency symbols following numbers don't get highlighted. [image: image] https://user-images.githubusercontent.com/3080824/204282726-35c61df7-9bc0-404b-8041-032b06e41921.png
— Reply to this email directly, view it on GitHub https://github.com/mhansen/hledger-vscode/issues/264, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAAZYOKCDD36YTWMLPIVK43WKSTUFANCNFSM6AAAAAASNJUDH4 . You are receiving this because you are subscribed to this thread.Message ID: @.***>
Here's a robust pattern that allows for rates in the form of @@ / @ and comments using ; / # / * as well. It tries to follow hledger's logic as far as I can tell.
In .NET/C#:
(?<=\S+[ ]{2,}@?[^;#*\n]*)(""[^;\\\n]+""|[^-+.,@*;\t\n ""{}=\d]+)
In JS:
(?<=\S+[ ]{2,}@?[^;#*\n]*)("[^;\\\n]+"|[^-+.,@*;\t\n "{}=\d]+)
Explanation:
- Lookbehind (?<=…) to check that there are:
  a. one or more non-whitespace characters: \S+
  b. followed by at least 2 blank spaces: [ ]{2,}
  c. optionally followed by an @ symbol: @?
  d. followed by zero or more characters that aren't comment markers: [^;#*\n]*
- Then match either:
  a. quoted text without prohibited characters: "[^;\\\n]+"
  b. or (|) a simple commodity symbol without any digits or other prohibited characters: [^-+.,@*;\t\n "{}=\d]+
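As a quick check of the breakdown above, here is a small sketch that runs the JS version of the pattern over a few posting lines. The sample lines and the helper name findCommodities are made up for illustration; they are not taken from the extension or from a real journal.

```javascript
// The JS pattern from this comment, with the global flag so every match is found.
const commodityPattern =
  /(?<=\S+[ ]{2,}@?[^;#*\n]*)("[^;\\\n]+"|[^-+.,@*;\t\n "{}=\d]+)/g;

// Hypothetical helper: returns every commodity-like token found in one line.
function findCommodities(line) {
  return [...line.matchAll(commodityPattern)].map((m) => m[0]);
}

// Made-up posting lines covering the cases discussed in this issue.
const samples = [
  "    assets:cash  ₹-100",             // minus sign after the symbol
  "    expenses:food  100 ₹",           // symbol after the number
  "    assets:usd  $10 @ ₹80",          // amount with a rate
  '    assets:shares  10 "ACME Inc"',   // quoted commodity
  "    assets:cash  ₹100 ; paid in ₹",  // symbol inside a comment
];

for (const line of samples) {
  console.log(JSON.stringify(line), "->", findCommodities(line));
}
```

On these lines the account names are skipped because nothing before them satisfies the "two or more spaces" part of the lookbehind, and the ₹ inside the comment is skipped because the lookbehind cannot cross the ";".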
I tried the JS version on my journal in VS Code via the find bar, and it runs within a split second. I'm not sure how to time it to the millisecond.
You could translate it to the appropriate regex engine.
https://regex101.com/r/fGhYrY/6
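On timing to the millisecond: one possible approach, sketched below, is to run the pattern outside the editor and measure it with performance.now(). This assumes a runtime where performance is available as a global (browsers and recent Node.js versions); the synthetic journal and the helper name timeCommodityMatches are made up for illustration, so the numbers say nothing about how the extension's own highlighter performs.

```javascript
// The JS pattern from this comment, with the global flag.
const commodityPattern =
  /(?<=\S+[ ]{2,}@?[^;#*\n]*)("[^;\\\n]+"|[^-+.,@*;\t\n "{}=\d]+)/g;

// Hypothetical helper: counts all matches in the text and reports how long
// the scan took, in milliseconds.
function timeCommodityMatches(journalText) {
  const start = performance.now();
  let count = 0;
  for (const _match of journalText.matchAll(commodityPattern)) {
    count += 1;
  }
  return { count, elapsedMs: performance.now() - start };
}

// Synthetic input: one made-up posting line repeated many times.
const syntheticJournal = "    assets:cash  ₹-100\n".repeat(10000);
const { count, elapsedMs } = timeCommodityMatches(syntheticJournal);
console.log(`${count} matches in ${elapsedMs.toFixed(2)} ms`);
```

To time a real journal, the synthetic string would be replaced with the file's contents; a single run is noisy, so repeating the measurement a few times gives a steadier figure.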
Hmm. There is a philosophical question here: should we match just known allowed currencies, or any string in the right place? Matching known currencies would catch more typos, which is one thing that syntax highlighting is useful for. Conversely, hledger accepts arbitrary currencies, I suppose.
Could you explain / break down your regex into parts with comments? It’s very long.
On Sun, 4 Dec 2022 at 02:36, Pranesh Prakash @.***> wrote:
Here's a robust pattern, that allows for @@ and @ as well: /(?<=\S+[ ]{2,}@?.)("[^;\\n]+"|[^@!,.:-\s\d]+)|(?<=\S[ ]{2,}@?.("[^;\\n]+"|[^@!,.:-\s\d]+)(?=\s?-?\d+))/g
https://regex101.com/r/2X1aWc/1
Could you explain / break down your regex into parts with comments? It’s very long.
I edited my reply to add an explanation: https://github.com/mhansen/hledger-vscode/issues/264#issuecomment-1336184272