Handling `float` strings
Question
I like to use msgspec to parse YAML configs for training models. When I define a struct with a field that is a `float` and then, in my YAML config, write a value like 0.00005 in scientific notation as `5e-5`, I get the error ``msgspec.ValidationError: Expected `float`, got `str` - at `$.my_float_parameter` ``.
What is the best way to avoid that error without having to convert every float parameter into a float after an object is parsed?
This is not a msgspec issue but a PyYaml one.
Basically, all that msgspec does for YAML is:
- use PyYaml to parse the YAML
- use `msgspec.convert` to convert the output of PyYaml to the desired type
Related piece of code: https://github.com/jcrist/msgspec/blob/0.19.0/msgspec/yaml.py#L136-L192
The issue is that PyYaml interprets `5e-5` as a string and therefore hands it to msgspec as a string before conversion.
This happens because PyYaml is a YAML 1.1 parser and does not (currently?) support YAML 1.2, which would treat `5e-5` as a float.
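To make the difference concrete, here is a rough stdlib-only sketch of the two float patterns. The YAML 1.1 pattern keeps only the decimal branches of the regex PyYaml uses for implicit float resolution (the sexagesimal, inf and nan branches are left out), and the YAML 1.2 pattern follows the core schema as I understand it. The function names are made up for illustration.

```python
import re

# Simplified YAML 1.1 float pattern (decimal branches only): a dot is
# mandatory, and an exponent, if present, must carry an explicit sign.
YAML11_FLOAT = re.compile(
    r"""^(?:
        [-+]?[0-9][0-9_]*\.[0-9_]*(?:[eE][-+][0-9]+)?
      | \.[0-9][0-9_]*(?:[eE][-+][0-9]+)?
    )$""",
    re.X,
)

# YAML 1.2 core schema float pattern (inf/nan omitted): the dot is optional.
YAML12_FLOAT = re.compile(
    r"^[-+]?(?:\.[0-9]+|[0-9]+(?:\.[0-9]*)?)(?:[eE][-+]?[0-9]+)?$"
)


def is_yaml11_float(text: str) -> bool:
    """Return True if the plain scalar would resolve to a float under YAML 1.1."""
    return YAML11_FLOAT.match(text) is not None


def is_yaml12_float(text: str) -> bool:
    """Return True if the plain scalar would resolve to a float under YAML 1.2."""
    return YAML12_FLOAT.match(text) is not None


for scalar in ("5e-5", "5.0e-5"):
    print(scalar, is_yaml11_float(scalar), is_yaml12_float(scalar))
```

With no dot in `5e-5`, the YAML 1.1 pattern does not match, so the scalar falls through to a plain string.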
With the current PyYaml (YAML 1.1), these values work and end up as floats (a few examples):
- `5.0e-5` (this one is a valid float in YAML 1.1)
- `!!float 5e-5` (this is explicitly tagged as a float)
- and more...
Examples of what you can do as a workaround:
- Use `strict=False` to coerce the `"5e-5"` string coming from PyYaml into a float during `convert` on msgspec's side
- Replace PyYaml with a YAML 1.2 compatible parser.