How about using @property instead of __getattr__ for Token class?

Open otariidae opened this issue 9 months ago • 0 comments

When using Janome, I expected the Token objects returned by Tokenizer.tokenize to have attributes such as surface, typed as str. But static analyzers like mypy cannot recognize that these attributes exist and treat them as Any.

from janome.tokenizer import Tokenizer

t = Tokenizer()
tokens = t.tokenize("メロスは激怒した")

for token in tokens:
  print(token.surface)  # <- Any

I think this happens because the Token class defines its attributes dynamically through the __getattr__ method here:

https://github.com/mocobeta/janome/blob/9d82248e0c0815e367b9604d83ef0de198e017bc/janome/tokenizer.py#L121-L139

I think Python's built-in @property can do the same thing as the current __getattr__ implementation, and it is friendlier to type checkers and auto-completion. For example:

    @property
    def surface(self):
        return self.node.surface

    @property
    def part_of_speech(self):
        return self.extra[0] if self.extra else self.node.part_of_speech
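
To make the difference concrete, here is a minimal, self-contained sketch that puts the two approaches side by side. It is not Janome's actual code: Node, DynamicToken, and PropertyToken are hypothetical stand-ins, and the str return annotations are my assumption about what the attributes hold.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class Node:
    """Hypothetical stand-in for Janome's internal node (not the real class)."""
    surface: str
    part_of_speech: str


class DynamicToken:
    """Resolves attributes at runtime, so static analyzers fall back to Any."""

    def __init__(self, node: Node, extra: Optional[List[str]] = None) -> None:
        self.node = node
        self.extra = extra

    def __getattr__(self, name: str) -> Any:
        # Called only when normal attribute lookup fails.
        if name == "surface":
            return self.node.surface
        if name == "part_of_speech":
            return self.extra[0] if self.extra else self.node.part_of_speech
        raise AttributeError(name)


class PropertyToken:
    """Declares each attribute explicitly, with a return type tools can read."""

    def __init__(self, node: Node, extra: Optional[List[str]] = None) -> None:
        self.node = node
        self.extra = extra

    @property
    def surface(self) -> str:
        return self.node.surface

    @property
    def part_of_speech(self) -> str:
        return self.extra[0] if self.extra else self.node.part_of_speech


node = Node(surface="メロス", part_of_speech="名詞,固有名詞,*,*")
dynamic = DynamicToken(node)
typed = PropertyToken(node)

# Both behave the same at runtime.
print(dynamic.surface == typed.surface)

# Only the property version is discoverable by introspection-based tools.
print("surface" in dir(PropertyToken), "surface" in dir(DynamicToken))
```

Runtime behavior is the same in both classes. The difference is that the property version exposes the attribute names and their return types on the class itself, which is what type checkers and editors rely on.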

Do you think that makes sense?

otariidae, Nov 25 '23 11:11