
Does not segment units without numbers

Open: averbitsky opened this issue 2 years ago • 1 comment

Describe the bug

The quantulum3 Python library does not correctly segment units without numbers. It should be able to parse the unit with or without a number. When parsing a string with a unit but without a number, it returns an empty list. When parsing a string with a number and a unit but with incorrect formatting, it returns an incorrect result. The library only returns the expected result when parsing a string with the correct number and unit formatting.

To Reproduce

  1. Run the following code:
from quantulum3 import parser
quants = parser.parse('intake (g/day)')
print(quants)

Observe that the output is an empty list [].

  2. Run the following code:
quants = parser.parse('intake 2 (g/day)')
print(quants)

Observe that the output is incorrect: [Quantity(2, "Unit(name="dimensionless", entity=Entity("dimensionless"), uri=Dimensionless_quantity)")].

  3. Run the following code:
quants = parser.parse('intake (2 g/day)')
print(quants)

Observe that the output is as expected: [Quantity(2, "Unit(name="gram per day", entity=Entity("mass flow"), uri=None)")].

Expected behavior

The library should be able to parse units without numbers and return the correct result. In the first example, the expected output should be a quantity with the correct unit (e.g., gram per day) but without a specified number.

Screenshots

N/A

Additional information:

Python Version: 3.8.16
Classifier activated / sklearn installed: Yes
OS: macOS
Device: Mac Apple M1 Pro
quantulum3 Version: 0.8.1

Additional context

This issue occurs when trying to parse units without numbers or with incorrect formatting. The library should be more flexible and robust in handling such inputs.

averbitsky, Apr 10 '23 19:04

quants = parser.parse('intake 2 (g/day)')

This looks like an interesting case that we might want to handle correctly. Do you have a source document where you took this from?

quants = parser.parse('intake (g/day)')

This is currently not covered by quantulum and would be a bigger change. Quantulum currently focuses on numbers and, where possible, tries to attach a unit to each number to create a quantity. What you are proposing is that units without numbers could also be returned. This is certainly interesting.

nielstron, May 09 '23 11:05
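
Until the parser handles these inputs itself, one possible workaround is to rewrite the text before parsing so that it takes the form reported above as working, 'intake (2 g/day)'. The following is a minimal sketch under stated assumptions, not part of quantulum3: the helper name normalize_parenthesized_units, its regular expression, and the placeholder value "1" are all hypothetical. It moves a number that directly precedes a parenthesized unit inside the parentheses, and inserts a placeholder number where there is none, flagging those cases so the caller can discard the made-up value afterwards.

```python
import re

# Hypothetical pre-processing helper, not part of quantulum3.
# Matches an optional number followed by a parenthesized group that
# contains no digits and no nested parentheses, e.g. "2 (g/day)" or "(g/day)".
_PAREN_UNIT = re.compile(
    r"(?:(?P<num>\d+(?:\.\d+)?)\s*)?\((?P<unit>[^()\d]+)\)"
)


def normalize_parenthesized_units(text, placeholder="1"):
    """Rewrite 'N (unit)' as '(N unit)' and '(unit)' as '(<placeholder> unit)'.

    Returns the rewritten text and a list of booleans, one per rewritten
    group, that is True where the number is a placeholder rather than a
    value taken from the text.
    """
    placeholder_flags = []

    def _rewrite(match):
        number = match.group("num")
        # Remember whether this number was invented by the helper.
        placeholder_flags.append(number is None)
        unit = match.group("unit").strip()
        return "({} {})".format(number or placeholder, unit)

    return _PAREN_UNIT.sub(_rewrite, text), placeholder_flags


if __name__ == "__main__":
    for sample in ("intake (g/day)", "intake 2 (g/day)", "intake (2 g/day)"):
        print(sample, "->", normalize_parenthesized_units(sample))
```

The rewritten string can then be passed to parser.parse, and any quantity whose flag is True should be treated as unit-only. The pattern is deliberately naive: it also rewrites parenthesized prose that is not a unit, such as '(see above)', so it only suits text like table headers where parentheses are known to hold units.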