pydantic-avro
pydantic-avro copied to clipboard
add avro doc field of a record
add avro doc field of a record using model's docstring
@timvancann You merged the branch where sharonyogev worked on to support pydantic v2 and force pushed. Is that a mistake? Or is there some other intent?
If you were thinking of merging together, please review this pr. 🙇
@sookyomkim I expect I made a booboo somewhere. Removing the pydantic-v2 changes was not intended. These changes are back so feel free to rebase on main. After that we can approve and merge this one :)
So I would like to give my 2 cents here. I am a bit hesitant to automatically add description based on docstring. What would happen for different docstring formats? Some people might add types and param description in docstring as well how would these then translate to decriptions in the avro file?
To still support description I would suggest perhaps:
- Add a dunder method to the basemodel for a model specific description ( like the docstring)
- For the individual Fields the native pydantic description could be used and supported
Take a look at this pydantic code. pydantic takes the docstring of the child class or the docstring of the parent class (if the child class has no docstring), calls cleandoc() to remove the white space, and writes it as a description. This looks no different than the code I wrote. Except for the part about getting the docstring of the parent class.
Shouldn't that be enough? Do we need to worry about the format of the docstring? Isn't it enough to see it as text and just put it in?
Initially, I implemented it by using pydantic json schema's description, but the reason for using a direct docstring is that pydantic description takes the docstring of the parent class if the child class does not have a docstring. So, if you don't write a docstring in your model, it will take the docstring from AvroBase. This is it. pydantic v2 doesn't have this problem.
Perhaps a good balance would be taking only the first section of a docstring, as the convention of what they should look like is fairly well defined: https://peps.python.org/pep-0257/#handling-docstring-indentation
class Complex(AvroBase):
"""Form a complex number
and do other stuff too.
Keyword arguments:
real -- the real part (default 0.0)
imag -- the imaginary part (default 0.0)
"""
doc = Complex.model_json_schema()['description']
split = doc.find('\n\n')
doc = doc[:split] if split != -1 else doc
doc == 'Form a complex number\nand do other stuff too.'
I hadn't seen this PR until after I made mine, and originally closed it. But, because it adds the reverse Avro --> Pydantic as well, I've taken some ideas from here and re-opened it.