Evaluate usefulness of PEP 593 for structured configs
"This PEP introduces a mechanism to extend the type annotations from PEP 484 with arbitrary metadata."
Some ideas below. Note that this is all pseudo code for now.
Documentation annotation
Doc : Annotated
num: Doc[int, "A number of things"] = 8
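For reference, a minimal runnable sketch of how the Doc idea could look with what typing already offers. The `Doc` class and the `field_docs` helper are hypothetical names made up for illustration; only `Annotated` and `get_type_hints(..., include_extras=True)` come from the standard library.

```python
from dataclasses import dataclass
from typing import Annotated, get_type_hints


class Doc:
    """Hypothetical marker carrying a docstring for a config field."""

    def __init__(self, text: str) -> None:
        self.text = text


@dataclass
class Config:
    num: Annotated[int, Doc("A number of things")] = 8


def field_docs(cls) -> dict:
    """Collect the Doc text attached to each annotated field, if any."""
    docs = {}
    # include_extras=True keeps the Annotated wrapper instead of stripping it.
    for name, hint in get_type_hints(cls, include_extras=True).items():
        for meta in getattr(hint, "__metadata__", ()):
            if isinstance(meta, Doc):
                docs[name] = meta.text
    return docs


print(field_docs(Config))
```

Static type checkers still see `num` as a plain `int`; the doc string is only visible to code that asks for the extras.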
Range annotation for valid input range.
NumRange(T, min, max) = Annotated[T, min, max]
StrRange(Choices) = Annotated[str, Choices]
# 0 <= num < 10
num : NumRange(int, 0, 10)
# 0.1 <= num < 0.2
num : NumRange(float, 0.1, 0.2)
# string must be one of "a", "b" or "c".
# Similar to current Enum support.
string : StrRange["a", "b", "c"] = "a"
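A rough sketch of how the range idea might be checked at runtime, assuming the constraints are carried as marker objects inside `Annotated`. `NumRange`, `Choices`, and `validate` are hypothetical names used here for illustration, not an existing API.

```python
from dataclasses import dataclass
from typing import Annotated, get_type_hints


class NumRange:
    """Hypothetical marker: valid values satisfy min <= value < max."""

    def __init__(self, min, max):
        self.min, self.max = min, max

    def check(self, value) -> bool:
        return self.min <= value < self.max


class Choices:
    """Hypothetical marker: value must be one of the given options."""

    def __init__(self, *options):
        self.options = options

    def check(self, value) -> bool:
        return value in self.options


@dataclass
class Config:
    num: Annotated[int, NumRange(0, 10)] = 8
    ratio: Annotated[float, NumRange(0.1, 0.2)] = 0.15
    string: Annotated[str, Choices("a", "b", "c")] = "a"


def validate(obj) -> list:
    """Return the names of fields whose value violates an attached constraint."""
    bad = []
    for name, hint in get_type_hints(type(obj), include_extras=True).items():
        for meta in getattr(hint, "__metadata__", ()):
            if hasattr(meta, "check") and not meta.check(getattr(obj, name)):
                bad.append(name)
    return bad


print(validate(Config()))        # no violations with the defaults
print(validate(Config(num=10)))  # 10 is outside the half-open range [0, 10)
```

Keeping the check on the marker object means new constraint kinds can be added without touching `validate`.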
Size unit annotation
Number can be converted from some common base unit or string representation to the specified unit.
Size[T] = Annotated[int, T]
# Static type error from string to int, but can work around with Any somehow
# Would convert "1kb" to 1024 (bytes as int).
bytes: Size[B] = "1kb"
# Would convert "1kb" to 1
bytes: Size[KB] = "1kb"
# Would convert to 0.001MB (float) or 0MB (int), depending on the underlying type. Not sure which.
bytes: Size[MB] = "1kb"
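A possible sketch of the conversion, under a few assumptions: units are binary multiples (1kb is 1024 bytes, as in the first example), the target unit is carried by a hypothetical `Size` marker, and the raw string is converted when the config is loaded instead of being assigned as a default, which sidesteps the static type error. `UNITS` and `convert_field` are also made-up names.

```python
import re
from typing import Annotated, get_type_hints

# Assumed binary multipliers (1kb == 1024 bytes).
UNITS = {"b": 1, "kb": 1024, "mb": 1024**2, "gb": 1024**3}


class Size:
    """Hypothetical marker naming the unit a field should be stored in."""

    def __init__(self, unit: str) -> None:
        self.unit = unit.lower()

    def convert(self, text: str) -> float:
        """Parse a string such as '1kb' and express it in self.unit."""
        match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([a-zA-Z]+)\s*", text)
        if match is None or match.group(2).lower() not in UNITS:
            raise ValueError(f"cannot parse size: {text!r}")
        n_bytes = float(match.group(1)) * UNITS[match.group(2).lower()]
        return n_bytes / UNITS[self.unit]


class Config:
    in_bytes: Annotated[int, Size("B")]
    in_kb: Annotated[int, Size("KB")]
    in_mb_int: Annotated[int, Size("MB")]
    in_mb_float: Annotated[float, Size("MB")]


def convert_field(cls, name: str, raw: str):
    """Convert raw to the unit and base type declared on the named field."""
    hint = get_type_hints(cls, include_extras=True)[name]
    base = hint.__origin__  # the T in Annotated[T, ...]
    size = next(m for m in hint.__metadata__ if isinstance(m, Size))
    return base(size.convert(raw))


print(convert_field(Config, "in_bytes", "1kb"))
print(convert_field(Config, "in_mb_float", "1kb"))
```

In this sketch the int-versus-float question is settled by the annotated base type: an `int` field truncates, a `float` field keeps the fraction. With binary multiples the float result for "1kb" in MB is 1/1024, slightly under the 0.001 mentioned above.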
File annotations
Not sure about this one; File (or pathlib.Path) should be a properly supported primitive first. But:
must_be_a_file : File[str, IsFile]
must_be_a_dir : File[str, IsDir]
must_exist : File[str, Exist]
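One way the file checks might be expressed, assuming the field stays a `str` and the constraint is a marker wrapping a `pathlib.Path` predicate. `PathCheck`, `check_paths`, and the three marker instances are hypothetical names for illustration.

```python
import tempfile
from pathlib import Path
from typing import Annotated, get_type_hints


class PathCheck:
    """Hypothetical marker wrapping a pathlib.Path predicate."""

    def __init__(self, predicate) -> None:
        self.predicate = predicate

    def check(self, value: str) -> bool:
        return self.predicate(Path(value))


IsFile = PathCheck(Path.is_file)
IsDir = PathCheck(Path.is_dir)
Exist = PathCheck(Path.exists)


class Config:
    must_be_a_file: Annotated[str, IsFile]
    must_be_a_dir: Annotated[str, IsDir]
    must_exist: Annotated[str, Exist]


def check_paths(cls, values: dict) -> dict:
    """Map each field name to whether its value passes all attached checks."""
    hints = get_type_hints(cls, include_extras=True)
    return {
        name: all(
            m.check(value)
            for m in hints[name].__metadata__
            if isinstance(m, PathCheck)
        )
        for name, value in values.items()
    }


# Exercise the checks against a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    file_path = Path(tmp) / "data.txt"
    file_path.write_text("x")
    result = check_paths(
        Config,
        {
            "must_be_a_file": str(file_path),
            "must_be_a_dir": tmp,
            "must_exist": str(Path(tmp) / "missing"),
        },
    )
print(result)
```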
@till-varoquaux, I would love to chat with you about this to see what is currently possible. I would like to start with the Doc annotation. I saw multiple suggestions on the relevant issue, but as far as I can tell many are not actually implemented, and the few that are implemented are not really satisfactory.
I wanted to mention that the field function defined in the dataclasses module has a metadata keyword. The keyword takes a user-defined mapping as input; it is provided "as a third-party extension mechanism".
from dataclasses import dataclass, field

@dataclass
class MySchema:
    num: int = field(metadata={"doc": "number of trials", "range": [0, 10]})
Using the PEP 593 annotations feels "cooler" than defining metadata via the field function, but I wanted to mention using field as an alternative. This could be more flexible than using Annotated: while Annotated allows the user to attach a sequence of data to a dataclass attribute, dataclasses.field allows the user to attach a mapping of data to a dataclass attribute.
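To make the field alternative concrete, here is a small sketch of reading that mapping back with `dataclasses.fields`. The `describe` helper and the `default=5` value are illustrative additions, not part of any existing API.

```python
from dataclasses import dataclass, field, fields


@dataclass
class MySchema:
    num: int = field(
        default=5,
        metadata={"doc": "number of trials", "range": [0, 10]},
    )


def describe(cls) -> dict:
    """Return the metadata mapping attached to each dataclass field."""
    # Field.metadata is a read-only mappingproxy; copy it into a plain dict.
    return {f.name: dict(f.metadata) for f in fields(cls)}


print(describe(MySchema))
```

Because the metadata is keyed, a consumer can look up `"doc"` or `"range"` by name instead of scanning a sequence for objects of a known type.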
attrs classes support this too (not just dataclasses).
I'm not sure what the status of this is, and I have never used omegaconf myself, but I found this issue through other issues and thought it might be relevant to mention https://github.com/annotated-types/annotated-types, since it actually implements some of the proposals here and is currently being used by Pydantic.
Thanks @adriangb.