Handling `float` strings

Open • umarbutler opened this issue 4 months ago • 1 comment

Question

I like to use msgspec to parse YAML configs for training models. When I define a struct with a field that is a float and then, in my YAML config, write a value like 0.00005 in scientific notation as 5e-5, I get the error msgspec.ValidationError: Expected `float`, got `str` - at `$.my_float_parameter`.

What is the best way to avoid that error without having to convert every float parameter into a float after an object is parsed?

umarbutler • Aug 23 '25 06:08

This is not a msgspec issue but a PyYAML one.

Basically, all that msgspec does for YAML is:

  1. use PyYAML to parse the YAML
  2. use msgspec.convert to convert PyYAML's output to the desired type

Related piece of code: https://github.com/jcrist/msgspec/blob/0.19.0/msgspec/yaml.py#L136-L192

The issue is that PyYAML interprets 5e-5 as a string, so msgspec receives a string before conversion. This happens because PyYAML is a YAML 1.1 parser and it does not (currently?) support YAML 1.2, which would treat 5e-5 as a float. In YAML 1.1 a float needs a decimal point, and an exponent, if present, needs an explicit sign.

With current PyYAML (YAML 1.1), values like these work and end up as floats (a few examples):

  • 5.0e-5 (this one is a valid float in YAML 1.1)
  • !!float 5e-5 (this is explicitly tagged as a float)
  • and more...
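To see why one spelling resolves as a float and the other does not, here is a small stdlib-only sketch comparing the base-10 float patterns. The regexes are transcribed from the YAML 1.1 float type definition and the YAML 1.2 core schema; PyYAML's own resolver pattern differs in some details, so treat this as an approximation of its behaviour rather than its implementation. `is_float` is a hypothetical helper.

```python
import re

# Base-10 float pattern from the YAML 1.1 float type: a "." is mandatory,
# and an exponent must carry an explicit sign.
YAML_1_1_FLOAT = re.compile(r"[-+]?([0-9][0-9_]*)?\.[0-9.]*([eE][-+][0-9]+)?")

# Float pattern from the YAML 1.2 core schema: the "." and the exponent
# sign are both optional.
YAML_1_2_FLOAT = re.compile(r"[-+]?(\.[0-9]+|[0-9]+(\.[0-9]*)?)([eE][-+]?[0-9]+)?")


def is_float(pattern, scalar):
    """Return True if the whole scalar matches the given float pattern."""
    return pattern.fullmatch(scalar) is not None


for scalar in ("5e-5", "5.0e-5", "1.0e5"):
    print(scalar, is_float(YAML_1_1_FLOAT, scalar), is_float(YAML_1_2_FLOAT, scalar))
```

Under these patterns, 5e-5 fails the YAML 1.1 check for lack of a decimal point, and 1.0e5 fails it for lack of an exponent sign, while all three pass the YAML 1.2 check.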

Examples of possible workarounds:

  • Use strict=False to coerce the "5e-5" string coming from PyYAML into a float during convert on msgspec's side
  • Replace PyYAML with a YAML 1.2 compatible parser.

floxay • Sep 03 '25 19:09