number-parser
Feature to handle individual digits in parse_number
One use case I have in mind is parsing phone numbers or zip codes. These might be written as 'two three zero two five eight', with each digit spelled out. Using parse would return the space-separated string '2 3 0 2 5 8', while parse_number would give None. Neither gives the wanted output, 230258 (a number). (Of course, the user can do some additional processing on the parse output, which will work, but having the feature in the library itself might be better.)
We could add a parameter to parse_number, say relaxed, which when set to True builds the spelled-out digits up into one large number.
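To make the idea concrete, here is a minimal standalone sketch of what a relaxed mode could do. It does not use the library's internals: DIGIT_WORDS and join_spelled_digits are hypothetical names invented for this illustration, and only English digit words are covered.

```python
# Hypothetical stand-in for the library's language data: English digit words only.
DIGIT_WORDS = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
}


def join_spelled_digits(text):
    """Return the digits spelled out in `text` as one string, or None.

    Every whitespace-separated token must be a single digit word;
    anything else makes the whole input unparseable.
    """
    tokens = text.lower().split()
    if not tokens:
        return None
    digits = []
    for token in tokens:
        digit = DIGIT_WORDS.get(token)
        if digit is None:
            return None
        digits.append(digit)
    return "".join(digits)


print(join_spelled_digits("two three zero two five eight"))  # 230258
print(join_spelled_digits("zero two one three four"))        # 02134
print(join_spelled_digits("two hundred"))                    # None
```

The sketch returns a string rather than an int because zip codes and phone numbers can start with zero, and int() would drop that leading digit. Whether the real parameter should return an int or a string is an open design question.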
It sounds interesting.
We could also use a parameter like join_delimiter, or something similar, to choose how to join consecutive numbers (numbers that follow one another, ignoring spaces, commas, etc.).
Examples:
>>> parse('I have three numbers: one, two, three', join_delimiter='-')
'I have 3 numbers: 1-2-3'
>>> parse('I have three numbers: one, two, three', join_delimiter='')
'I have 3 numbers: 123'
>>> parse('I have three numbers: one, two, three', join_delimiter='/')
'I have 3 numbers: 1/2/3'
>>> parse('two three zero two five eight', join_delimiter='.')
'2.3.0.2.5.8'
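Until something like this exists in the library, the same effect can be approximated by post-processing a string in which the words have already been converted to digits. The sketch below is only an illustration: join_adjacent_numbers is a hypothetical helper, not part of number-parser, and it assumes the input already looks like 'I have 3 numbers: 1, 2, 3'.

```python
import re

# Spaces and/or commas that sit directly between two digits.
_SEPARATOR_BETWEEN_DIGITS = re.compile(r"(?<=\d)[\s,]+(?=\d)")


def join_adjacent_numbers(text, join_delimiter):
    """Replace the separators between adjacent numbers with `join_delimiter`."""
    return _SEPARATOR_BETWEEN_DIGITS.sub(join_delimiter, text)


print(join_adjacent_numbers("I have 3 numbers: 1, 2, 3", "-"))  # I have 3 numbers: 1-2-3
print(join_adjacent_numbers("I have 3 numbers: 1, 2, 3", ""))   # I have 3 numbers: 123
print(join_adjacent_numbers("2 3 0 2 5 8", "."))                # 2.3.0.2.5.8
```

Because it works on the final string, this approach cannot tell a digit that came from a word apart from one that was already in the text, so something like '1,000' would also be rewritten. Doing the join inside the parser avoids that problem.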
@noviluni sir, this is what I am thinking. What do you suggest?
From this code:
    myvalue = _build_number(tokens_taken, lang_data)
    for each_number in myvalue:
        current_sentence.append(each_number)
        current_sentence.append(" ")
To this code:
    if tokens_taken:
        myvalue = _build_number(tokens_taken, lang_data)
        for each_number in myvalue:
            if delimiter:
                current_sentence.append(each_number)
                current_sentence.append(delimiter)
            else:
                current_sentence.append(each_number)
                current_sentence.append(" ")
Here, I have added a new parameter, delimiter, to the parse function.
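One detail worth checking in the proposed change: a plain truthiness test on the delimiter treats the empty string as "not set", so a call that asks for no separator at all would still get spaces. A possible variant, sketched here as a standalone function, uses None as the default instead. append_numbers is a hypothetical name for illustration, and the sketch assumes the numbers arrive as a list of strings, which may not match what _build_number actually returns.

```python
def append_numbers(current_sentence, numbers, delimiter=None):
    """Append each number followed by a separator, mirroring the loop above.

    delimiter=None keeps the existing behaviour (a single space);
    any string, including "", is used exactly as given.
    """
    separator = " " if delimiter is None else delimiter
    for each_number in numbers:
        current_sentence.append(each_number)
        current_sentence.append(separator)
    return current_sentence


print(repr("".join(append_numbers([], ["1", "2", "3"]))))       # '1 2 3 '
print(repr("".join(append_numbers([], ["1", "2", "3"], ""))))   # '123'
print(repr("".join(append_numbers([], ["1", "2", "3"], "."))))  # '1.2.3.'
```

Note that the loop also appends the separator after the last number. A trailing space is easy to overlook, but a trailing '.' or '-' would be visible in the output, so the final separator would probably need to be dropped or replaced somewhere.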
Links for the code
https://github.com/scrapinghub/number-parser/blob/dab1f31c2fef1cd7e9881564136312d96de86385/number_parser/parser.py#L308
https://github.com/scrapinghub/number-parser/blob/dab1f31c2fef1cd7e9881564136312d96de86385/number_parser/parser.py#L287