pydantic-avro icon indicating copy to clipboard operation
pydantic-avro copied to clipboard

add avro doc field of a record

Open sookyomkim opened this issue 1 year ago • 6 comments

add avro doc field of a record using model's docstring

sookyomkim avatar Jul 31 '23 07:07 sookyomkim

@timvancann You merged the branch where sharonyogev worked on to support pydantic v2 and force pushed. Is that a mistake? Or is there some other intent?

If you were thinking of merging together, please review this pr. 🙇

sookyomkim avatar Aug 09 '23 05:08 sookyomkim

@sookyomkim I expect I made a booboo somewhere. Removing the pydantic-v2 changes was not intended. These changes are back so feel free to rebase on main. After that we can approve and merge this one :)

timvancann avatar Aug 09 '23 07:08 timvancann

So I would like to give my 2 cents here. I am a bit hesitant to automatically add description based on docstring. What would happen for different docstring formats? Some people might add types and param description in docstring as well how would these then translate to decriptions in the avro file?

To still support description I would suggest perhaps:

  1. Add a dunder method to the basemodel for a model specific description ( like the docstring)
  2. For the individual Fields the native pydantic description could be used and supported

DaanRademaker avatar Aug 09 '23 07:08 DaanRademaker

Take a look at this pydantic code. pydantic takes the docstring of the child class or the docstring of the parent class (if the child class has no docstring), calls cleandoc() to remove the white space, and writes it as a description. This looks no different than the code I wrote. Except for the part about getting the docstring of the parent class.

Shouldn't that be enough? Do we need to worry about the format of the docstring? Isn't it enough to see it as text and just put it in?

Initially, I implemented it by using pydantic json schema's description, but the reason for using a direct docstring is that pydantic description takes the docstring of the parent class if the child class does not have a docstring. So, if you don't write a docstring in your model, it will take the docstring from AvroBase. This is it. pydantic v2 doesn't have this problem.

sookyomkim avatar Aug 09 '23 08:08 sookyomkim

Perhaps a good balance would be taking only the first section of a docstring, as the convention of what they should look like is fairly well defined: https://peps.python.org/pep-0257/#handling-docstring-indentation

class Complex(AvroBase):
    """Form a complex number
    and do other stuff too.

    Keyword arguments:
    real -- the real part (default 0.0)
    imag -- the imaginary part (default 0.0)
    """

doc = Complex.model_json_schema()['description']
split = doc.find('\n\n')
doc = doc[:split] if split != -1 else doc

doc == 'Form a complex number\nand do other stuff too.'

BeRT2me avatar Aug 28 '23 20:08 BeRT2me

I hadn't seen this PR until after I made mine, and originally closed it. But, because it adds the reverse Avro --> Pydantic as well, I've taken some ideas from here and re-opened it.

BeRT2me avatar Aug 28 '23 20:08 BeRT2me