numbers_in_words
Unexpected number parsing
Hi!
First of all, thanks for this cool gem. I needed exactly this: something that can parse a numeric code, as a human would say it, into a number.
I did notice some quirks of the number parsing in my application:
2.7.4 :001 > NumbersInWords.in_numbers 'one two three four'
=> 1234
2.7.4 :002 > NumbersInWords.in_numbers 'one two three four.'
=> 1234
2.7.4 :003 > NumbersInWords.in_numbers '1234.'
=> 12034
2.7.4 :004 > NumbersInWords.in_numbers '1234'
=> 1234.0
Maybe it's not designed to also recognize numeric input, but as I'm using it in combination with speech recognition, that's sometimes what I get.
For now I'm simply checking this myself and not passing these values into NumbersInWords:
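def parse_numeric_value(param)
  return if param.nil?

  # Trim whitespace and a trailing period added by speech recognition.
  cleansed_params = param.strip.chomp('.')
  if /\A\d+\z/.match?(cleansed_params)
    # Already digits: return as-is instead of passing it to NumbersInWords.
    cleansed_params
  else
    NumbersInWords.in_numbers(cleansed_params).to_s
  end
end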
How unexpected! I'm not sure what's going on there. I can take a look at some point, but will accept a PR too if you want to have a go. I suppose any solution would need a clearly defined set of expected behaviours and exhibited problem behaviours though rather than just solving a subset of simple cases.
I'm not sure what may be involved in supporting this type of input though.
It looks like, for your single specific example, Ruby's inbuilt String#to_f does the job.
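As a rough illustration of that point: String#to_f reads the leading numeric part of a string and ignores whatever follows, so the trailing period from the original report is harmless. The helper name `numeric_string_to_number` below is hypothetical and not part of the gem; it only uses Ruby core methods.

```ruby
# Hypothetical helper (not part of numbers_in_words): handle input that is
# already made of digits using only Ruby's core String methods.
def numeric_string_to_number(text)
  cleaned = text.strip
  # Accept plain digits with an optional trailing period or decimal part.
  return nil unless cleaned.match?(/\A\d+(\.\d*)?\z/)

  cleaned.to_f
end

numeric_string_to_number('1234.')   # => 1234.0
numeric_string_to_number('1234')    # => 1234.0
numeric_string_to_number('one two') # => nil
```

Anything that returns nil here would still need to go through NumbersInWords.in_numbers.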
Hey @markburns I came across another one:
2.7.5 :009 > NumbersInWords.in_numbers 'Zero 605.'
=> 5
2.7.5 :010 > NumbersInWords.in_numbers 'Zero 605'
=> 0
Again my use case might be a bit different than others, I'm basically trying to parse a 4-8 digit pin where I receive the "said" value from speech recognition (so you could argue the speech recognition should be better and I agree 😞 )
I am seeing a few problems here:
- period at the end making a weird difference (same as original issue, but I can crop that out easily beforehand)
- leading 0 will get lost, which sucks but is expected as we're talking numbers, i.e. 605 is the same as 0605 in a numeric sense but not for my 4 digit codes (could potentially fill it up with leading zeros)
- the leading "Zero" causes the rest to get lost
I will try to avoid codes with a leading 0 and also try to wrap NumbersInWords in some more edge case logic (though I'm not sure I'd want to upstream that into the gem as it seems a bit specific).
It's almost as if for number parsing we need some kind of mode, e.g. this would be a "PIN mode" where we return a string formatted number as it can have leading 0 and we know there shouldn't be any decimals either.
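To make the idea a bit more concrete, here is a minimal sketch of what such a "PIN mode" could look like outside the gem. Everything in it is an assumption for illustration: the `parse_pin` name, the `DIGIT_WORDS` table, and the choice to handle only single-digit words and raw digit runs. It does not call NumbersInWords at all.

```ruby
# Hypothetical "PIN mode" sketch: read each token as digits and return a
# String so that leading zeros survive. Not part of numbers_in_words.
DIGIT_WORDS = {
  'zero' => '0', 'one' => '1', 'two' => '2', 'three' => '3', 'four' => '4',
  'five' => '5', 'six' => '6', 'seven' => '7', 'eight' => '8', 'nine' => '9'
}.freeze

def parse_pin(spoken, length: 4..8)
  # Drop a trailing period, lowercase, and split on whitespace.
  tokens = spoken.strip.chomp('.').downcase.split
  digits = tokens.map do |token|
    if token.match?(/\A\d+\z/)
      token                # already digits, e.g. "605"
    else
      DIGIT_WORDS[token]   # nil for anything that is not a single-digit word
    end
  end
  return nil if digits.empty? || digits.any?(&:nil?)

  pin = digits.join
  length.cover?(pin.length) ? pin : nil
end

parse_pin('Zero 605.')          # => "0605"
parse_pin('one two three four') # => "1234"
parse_pin('twenty one')         # => nil (only single-digit words are handled)
```

Returning nil for anything unexpected keeps the caller in charge of asking the user to repeat the code, rather than guessing.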
Thanks for the extra info. I think one issue you're having here with leading zeros is considering these objects as "numbers".
It sounds like they're not but share a lot in common with numbers.
People have similar issues if they try to process credit card numbers as numeric values too.
I think that aspect of your problem is certainly outside the scope of this project as the way the words are parsed does require treating them mathematically as numbers.
It does sound like there's value in providing a layer perhaps on top of this gem that could solve some of the common problems people encounter when they work with voice input.
I think it could also make sense to extract some of the work @dimidd added related to parsing pairs of numbers like 1920 (nineteen twenty).
Then that would keep the core of this gem simpler and allow for adding more of the fuzzy stuff in an outer layer.
Perhaps the outputs of this proposed gem would provide a value object class for things like credit card numbers and pins.
They're a bit like numbers and a bit like strings and you'd probably be the best person to determine what the API should look like on these value objects.
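Just to give the brainstorm a starting point, a value object along those lines might look something like this. The `Pin` class name, the 4-8 digit rule (taken from the use case above), and the whole API are assumptions, not anything that exists in this gem.

```ruby
# Hypothetical value object for a PIN: digits are stored as a String and
# compared by value, never converted to an Integer.
class Pin
  attr_reader :digits

  def initialize(digits)
    digits = digits.to_s
    raise ArgumentError, 'PIN must be 4-8 digits' unless digits.match?(/\A\d{4,8}\z/)

    @digits = digits.dup.freeze
  end

  def to_s
    digits
  end

  def ==(other)
    other.is_a?(Pin) && digits == other.digits
  end
  alias eql? ==

  def hash
    digits.hash
  end
end

Pin.new('0605').to_s              # => "0605"
Pin.new('0605') == Pin.new('0605') # => true
# Pin.new(605) would raise ArgumentError, because "605" has only 3 digits.
```

Keeping the digits as a String is what preserves the leading zero; the validation is where any "PIN-like" rules would live.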
I'd be happy to brainstorm a little further sometime if you're interested.