opendylan doc: Update pygments syntax highlight for numbers

This issue is not caused by the project, but it is related to it. We need to update the regular expressions used by the Pygments library to account for the following changes in Open Dylan:

DEP#11, which allows underscore characters between any two digits.
Double float numbers (not handled until now).

The solution could be discussed here until a PR is proposed to the Pygments project.

Current regexp:

https://github.com/pygments/pygments/blob/edef94d66c2d70f05a86ac6098a69ab253b8d946/pygments/lexers/dylan.py#L140

The expressions below are case insensitive.

Feb 04 '25 07:02 fraya

Include _ in binary numbers:

#b[01]+(?:_[01]+)*

Or also require trailing white space (note that \s already covers the carriage return), or anchor the match at the end of the line:

#b[01]+(?:_[01]+)*(\s+|\r)
#b[01]+(?:_[01]+)*$
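
A quick way to try the first binary pattern outside of Pygments is a small Python check. This is only a sketch: the helper name `is_binary_literal` and the sample strings are made up for illustration, and `re.IGNORECASE` stands in for the case-insensitive flag mentioned at the top of the thread.

```python
import re

# Pattern copied from the comment above. re.IGNORECASE mirrors the note that
# the expressions are case insensitive, so "#B" is accepted as well as "#b".
BINARY = re.compile(r"#b[01]+(?:_[01]+)*", re.IGNORECASE)


def is_binary_literal(text: str) -> bool:
    """Return True when the whole string is a binary literal (hypothetical helper)."""
    return BINARY.fullmatch(text) is not None


if __name__ == "__main__":
    # Illustrative samples only; they are not taken from any test suite.
    samples = ["#b1010", "#B1111_0000", "#b1_0_1", "#b_101", "#b101_", "#b10__01", "#b102"]
    for sample in samples:
        print(f"{sample!r:16} {is_binary_literal(sample)}")
```

Using `fullmatch` here plays the role of the `$` variant: a leading, trailing, or doubled underscore makes the whole literal fail instead of matching a shorter prefix.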

Feb 07 '25 10:02 fraya

Include _ in octal numbers (similar to binary):

#o[0-7]+(?:_[0-7]+)*

or

#o[0-7]+(?:_[0-7]+)*(\s+|\r)
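
Since the binary and octal patterns differ only in the prefix letter and the digit class, they could be generated from one template. The sketch below assumes a hypothetical helper `radix_literal`; it is not part of Pygments, just a way to keep the variants consistent while experimenting.

```python
import re


def radix_literal(prefix: str, digits: str) -> re.Pattern:
    """Build '#<prefix><digits>+(?:_<digits>+)*' for one radix (hypothetical helper)."""
    return re.compile(rf"#{prefix}{digits}+(?:_{digits}+)*", re.IGNORECASE)


# Same shape for every radix; only the prefix and digit class change.
BINARY = radix_literal("b", "[01]")
OCTAL = radix_literal("o", "[0-7]")
HEX = radix_literal("x", "[0-9a-f]")

if __name__ == "__main__":
    print(OCTAL.pattern)
    for sample in ["#o17_77", "#O7", "#o18", "#o7__7"]:
        print(f"{sample!r:12} {OCTAL.fullmatch(sample) is not None}")
```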

Feb 07 '25 10:02 fraya

Hexadecimal numbers test cases:

#xff
#xdead_beef
#xdead_beef_
#xb_e_e_f
#xbe__ef
#x_beef_
#xh
#x + 1
#xff,#xff, #xff
(#xff)

#x[0-9a-f]+(?:_[0-9a-f]+)*$

or

#x[0-9a-f]+(?:_[0-9a-f]+)*($|[^0-9a-f_])
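
To compare the two hexadecimal variants, the test cases above can be run through Python's `re` module. This is a rough sketch: `first_match` is a hypothetical helper, and `re.search` is used so that cases like `(#xff)` are found even though the literal does not start the line. How Pygments itself applies the pattern may differ.

```python
import re

# The two candidate patterns from this comment, compiled case-insensitively.
HEX_EOL = re.compile(r"#x[0-9a-f]+(?:_[0-9a-f]+)*$", re.IGNORECASE)
HEX_DELIM = re.compile(r"#x[0-9a-f]+(?:_[0-9a-f]+)*($|[^0-9a-f_])", re.IGNORECASE)

# Test cases listed above, one per line.
CASES = [
    "#xff", "#xdead_beef", "#xdead_beef_", "#xb_e_e_f", "#xbe__ef",
    "#x_beef_", "#xh", "#x + 1", "#xff,#xff, #xff", "(#xff)",
]


def first_match(pattern: re.Pattern, text: str):
    """Return the text of the first match anywhere in the line, or None (hypothetical helper)."""
    found = pattern.search(text)
    return found.group(0) if found else None


if __name__ == "__main__":
    for case in CASES:
        print(f"{case!r:20} eol={first_match(HEX_EOL, case)!r:16} delim={first_match(HEX_DELIM, case)!r}")
```

One thing this makes visible: the `$` variant only accepts a literal that ends the line, while the delimiter variant also accepts literals followed by `,` or `)`, but the matched text then includes that delimiter character.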

Feb 14 '25 18:02 fraya

Nice. I'm not sure how Pygments invokes the regex, but just in case:

I assume it uses a case-insensitive test? It needs to work for #x, #X, and [A-F].
If it gives you more text than just the token it wants to match against, you might need to use something like this at the end instead of $: ($|[^0-9a-f_]) (More accurately, I guess it needs to check for $ or any delimiter characters (like comma, close paren, space) that could terminate the literal, but it doesn't have to be perfect so I don't know if it's worth enumerating those.)

Feb 14 '25 18:02 cgay

Yes, the regular expressions Pygments uses for Open Dylan are case insensitive (as seen in this line: https://github.com/pygments/pygments/blob/edef94d66c2d70f05a86ac6098a69ab253b8d946/pygments/lexers/dylan.py#L33). Just yesterday I added a note at the beginning saying that the regular expressions below are case insensitive (sorry, I should have put that there earlier).
The expression ($|[^0-9a-f_]) is more precise, although most of the other hexadecimal regular expressions I have looked at simply use $. I don't see any problem with using yours. I've added more test cases to show the difference.
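
One caveat with ($|[^0-9a-f_]) is that the group consumes the delimiter, so the comma or parenthesis would become part of the matched text. As far as I understand, a Pygments `RegexLexer` rule emits the whole match as the token, so a lookahead might be a better fit. The sketch below uses plain `re` rather than Pygments, and `token_at_start` is a hypothetical helper that imitates matching at the current lexer position.

```python
import re

# Variant from the discussion: the trailing group consumes the delimiter.
HEX_CONSUMING = re.compile(r"#x[0-9a-f]+(?:_[0-9a-f]+)*($|[^0-9a-f_])", re.IGNORECASE)

# Assumed alternative: a lookahead checks the delimiter without consuming it.
HEX_LOOKAHEAD = re.compile(r"#x[0-9a-f]+(?:_[0-9a-f]+)*(?=$|[^0-9a-f_])", re.IGNORECASE)


def token_at_start(pattern: re.Pattern, text: str):
    """Match only at the start of the text and return the matched text, or None."""
    found = pattern.match(text)
    return found.group(0) if found else None


if __name__ == "__main__":
    for sample in ["#xff, #x10", "#XDEAD_BEEF)", "#xdead_beef_"]:
        print(f"{sample!r:16} consuming={token_at_start(HEX_CONSUMING, sample)!r:16} "
              f"lookahead={token_at_start(HEX_LOOKAHEAD, sample)!r}")
```

Both variants reject `#xdead_beef_`; they differ only in whether the character after the literal ends up inside the match.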

Feb 15 '25 09:02 fraya

Floating-point numbers. Our current RE [-+]?(\d*\.\d+(e[-+]?\d+)?|\d+(\.\d*)?e[-+]?\d+) does not match some of the cases in the literal test suite:

Mar 22 '25 10:03 fraya