crystal
crystal copied to clipboard
RFC: E notation BigDecimal parser
Related to : #9580
This implements a first pass at this (not stack based parser, to avoid needing a stack, just a translation of the parse table into logic)
PR which is still a work in progress and would benefit from your inputs should we wish to progress with this
created with the productions below (syntax below works in http://hackingoff.com/compilers/ll-1-parser-generator)
Scientific -> OptionalSign Number
OptionalSign -> s |
Number -> Digits OptionalFractional OptionalExponent | Fractional OptionalExponent
OptionalFractional -> '.' OptionalDigits
Fractional -> '.' Digits
OptionalExponent -> e OptionalSign Digits |
OptionalDigits -> Digits |
Digits -> Digit RepeatDigit
RepeatDigit -> Digit |
Digit -> n
Out of curiosity, what are perf gains?
@sija things are slower at the moment:
require "benchmark"
Benchmark.ips do |x|
x.report("ex1 - long") do
BigDecimal.new("+0829756293456064502345702345.283457863284759365e-7364756823")
end
x.report("ex2 - long error") do
BigDecimal.new("138479783434.927347264629873647269348762e82347978e")
rescue
#
end
x.report("ex3 - short") do
BigDecimal.new("1.1")
end
x.report("ex3 - short error") do
BigDecimal.new("1.1-")
rescue
#
end
end
On current master:
MRc:crystal stephen$ ./bin/crystal run --release bdp.cr
ex1 - long 781.23k ( 1.28µs) (± 0.90%) 672B/op 4.19× slower
ex2 - long error 434.36k ( 2.30µs) (± 1.56%) 785B/op 7.54× slower
ex3 - short 3.27M (305.53ns) (± 1.15%) 464B/op fastest
ex3 - short error 578.83k ( 1.73µs) (± 1.59%) 464B/op 5.65× slower
this:
MRc:crystal stephen$ bin/crystal run --release bdp.cr
ex1 - long 398.11k ( 2.51µs) (± 1.12%) 1.2kB/op 4.99× slower
ex2 - long error 275.56k ( 3.63µs) (± 1.20%) 1.27kB/op 7.20× slower
ex3 - short 1.99M (503.74ns) (± 1.52%) 784B/op fastest
ex3 - short error 405.12k ( 2.47µs) (± 1.45%) 1.04kB/op 4.90× slower
Apologies for the long delay on this... I will review if still relevant and fix up PR if needed (else close)
Are there cases that this new parser can handle and the original parser can't?
@lbguilherme The original motivation was the current implementation seems somewhat ad-hoc and contained a number of parse bugs (see #9547) mainly around invalid notation. I fixed all the issues I could find in #9577 but thought maybe a refactored parser would be better. This one was created from a stricter grammar for E notation so potentially could lead to a more robust parser of the notation.
I see however that since this was created, #10641 has been discussed, so maybe these changes are not relevant anymore in the bigger picture of a rewrite of BigDecimal
.