pint
"#" modifier cannot be used for magnitude formatting
Consider this:
import pint
ureg = pint.UnitRegistry()
x = 0.9999
q = x * ureg.meters
print(f"{x:#.2g}")
print(f"{q:#.2g}")
This outputs:
1.0
1e+03 millimeter
While I would expect:
1.0
1.0 meter
Being able to use something like format(value, f"#.{n}g") is extremely useful, as it ensures that value will be printed with n significant digits.
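To illustrate the point about significant digits, here is a minimal sketch that uses only built-in float formatting, with no pint involved. The helper name sig_digits is hypothetical and only serves the example.

```python
def sig_digits(value: float, n: int) -> str:
    """Format value with exactly n significant digits.

    The "#" flag keeps the trailing zeros that plain "g" would strip.
    """
    return format(value, f"#.{n}g")


print(sig_digits(0.9999, 2))   # 1.0
print(format(0.9999, ".2g"))   # 1  (trailing zero dropped without "#")
print(sig_digits(0.5, 3))      # 0.500
```

Without the "#" flag, the "g" presentation type removes insignificant trailing zeros, so the number of printed digits varies with the value.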
The issue here is that pint strips all occurrences of the # modifier here and here, as if it could only signal the intent to call .to_compact() on a quantity. Thus, it never makes its way into mspec, making it impossible to use for magnitude formatting.
I can see two solutions for this: either the modifier could be changed to something else (I've changed it to @ privately to keep working, but that would break backwards compatibility), or a more rigid ordering could be required. For example, the # modifier could be required not to precede the magnitude format specification and/or required to precede the unit format specification.
I think it might be easier to do the split into unit / magnitude formatters earlier:
mspec, uspec = split_quantity_format(spec)
then uspec can be modified without affecting mspec.
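As a rough sketch of what splitting early could look like: the code below is not pint's implementation. split_quantity_format is the hypothetical function named above, plan_format is a helper invented for this example, and the set of characters that start the unit part (~, P, L, H, C, D) is an assumption made for illustration.

```python
# Characters assumed (for this sketch only) to start the unit part of a spec.
UNIT_FLAG_CHARS = "~PLHCD"


def split_quantity_format(spec: str) -> tuple[str, str]:
    """Split spec into (mspec, uspec) at the first unit flag character."""
    for index, char in enumerate(spec):
        if char in UNIT_FLAG_CHARS:
            return spec[:index], spec[index:]
    return spec, ""


def plan_format(spec: str) -> tuple[str, str, bool]:
    """Return (mspec, uspec, compact); only a "#" in uspec requests compacting."""
    mspec, uspec = split_quantity_format(spec)
    compact = "#" in uspec
    # Strip "#" from the unit part only, so mspec keeps its own "#".
    return mspec, uspec.replace("#", ""), compact


print(plan_format("#.2g"))     # ('#.2g', '', False)
print(plan_format("#.2g~#P"))  # ('#.2g', '~P', True)
print(plan_format("#~P"))      # ('#', '~P', False)
```

The last call shows the cost of such a rule: a leading # is treated as a magnitude flag, so a spec like #~P would no longer trigger compacting. A fill character that happens to be one of the assumed unit flags would also be split incorrectly.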
We just have to keep in mind that there are plans to add quantity formats (i.e. ones that know how to format both the magnitude and the units), so we shouldn't make that more difficult to support.
It appears that with #1419, this has now changed: a # in a format specifier still calls to_compact, but a # in default_format is now passed through for magnitude formatting. In both cases the position of the # does not matter.
This unfortunately causes a bit of a nightmare for Decimal. Python silently chooses between two decimal implementations; '#' is supported by _pydecimal but not by cdecimal (the C implementation, _decimal), and cdecimal does not give a useful error message when it fails. As a result, a default format string that previously called to_compact will now break on some systems but not on others.
@cgevans can you provide an example?
Here, for example, '#' works explicitly but not by default:
import pint
ureg = pint.UnitRegistry()
q = ureg.Quantity(1000, "mm")
format(q, "#~P")
#[Out]# '1.0 m'
format(q, "")
#[Out]# '1000 millimeter'
ureg.default_format = "#~P"
format(q)
#[Out]# '1000 mm'
ureg.default_format = "~#P"
format(q)
#[Out]# '1000 mm'
ureg.default_format = "~P#"
format(q)
#[Out]# '1000 mm'
format(q, "#")
#[Out]# '1.0 m'
ureg.default_format = "#"
format(q)
#[Out]# '1000 millimeter'
format(q, "#")
#[Out]# '1.0 meter'
With Decimal, you can see clearly that the default # is being passed through to the magnitude, while the explicit one calls to_compact. This additionally causes problems with Python issue 46904.
import pint
from _pydecimal import Decimal as PyDecimal # Some systems use this for Decimal
from _decimal import Decimal as CDecimal # Others use this
ureg_py = pint.UnitRegistry(non_int_type=PyDecimal)
ureg_c = pint.UnitRegistry(non_int_type=CDecimal)
q_py = ureg_py("1000. mm")
q_c = ureg_c("1000. mm")
format(q_py) # No decimal.
#[Out]# '1000 millimeter'
format(q_c)
#[Out]# '1000 millimeter'
format(q_py, "#") # compact
#[Out]# '1.000 meter'
format(q_c, "#")
#[Out]# '1.000 meter'
ureg_py.default_format = "#"
ureg_c.default_format = "#"
format(q_py) # decimal point -> # is being passed to PyDecimal
#[Out]# '1000. millimeter'
format(q_py, "#") # decimal point and compact
#[Out]# '1.000 meter'
format(q_c) # CDecimal doesn't support #
#[Out]# => ValueError: invalid format string
Upon further consideration, it might be a good idea to rethink our string formatting mini-language: at the moment, we allow magnitude, unit, and quantity formatters to be mixed freely, but inevitably there will be character conflicts.
As such, how about we change the mini-language to:
mspec[pint_spec]
where mspec is the magnitude format and pint_spec the unit / quantity format.
For example, with an mspec of .03f# and a pint_spec of ~D, this would become
.03f#[~D]
That way, the intent would be very clear, and parsing the format would also be very easy.
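To show how simple the parsing could be, here is a minimal sketch of a parser for the proposed mspec[pint_spec] form. It is not part of pint; parse_spec and the regular expression are assumptions made for this example, and the sketch does not allow brackets inside either part.

```python
import re

# mspec is everything before an optional trailing "[pint_spec]".
_SPEC_RE = re.compile(r"(?P<mspec>[^\[\]]*)(?:\[(?P<pint_spec>[^\[\]]*)\])?")


def parse_spec(spec: str) -> tuple[str, str]:
    """Split "mspec[pint_spec]" into (mspec, pint_spec)."""
    match = _SPEC_RE.fullmatch(spec)
    if match is None:
        raise ValueError(f"invalid format spec: {spec!r}")
    # A missing bracket part is reported as an empty pint_spec.
    return match.group("mspec"), match.group("pint_spec") or ""


print(parse_spec(".03f#[~D]"))  # ('.03f#', '~D')
print(parse_spec("#.2g"))       # ('#.2g', '')
print(parse_spec("[~P]"))       # ('', '~P')
```

Because the two parts never share characters, a # inside the brackets could keep its current meaning while a # outside them goes to the magnitude untouched.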
Just to be sure, though, I'm not insisting on this particular format: any format that has the same effect would be fine with me.