How about using @property instead of __getattr__ for the Token class?
When using Janome, I expected the Token objects returned by the Tokenizer.tokenize method to expose attributes such as surface, typed as str. However, static analyzers like mypy cannot see that these attributes exist and treat them as Any.
from janome.tokenizer import Tokenizer
t = Tokenizer()
tokens = t.tokenize("メロスは激怒した")
for token in tokens:
    print(token.surface)  # <- Any
I think this happens because the Token class defines its attributes dynamically through the __getattr__ method here:
https://github.com/mocobeta/janome/blob/9d82248e0c0815e367b9604d83ef0de198e017bc/janome/tokenizer.py#L121-L139
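To make the pattern concrete without quoting the linked code, here is a minimal sketch. DynamicToken and FakeNode are hypothetical stand-ins I made up for illustration, not Janome's actual classes, and the real implementation handles more attributes than this.

```python
class FakeNode:
    """Hypothetical stand-in for the node object a token wraps."""

    def __init__(self, surface: str) -> None:
        self.surface = surface


class DynamicToken:
    """Hypothetical token that resolves attributes at runtime."""

    def __init__(self, node: FakeNode) -> None:
        self.node = node

    def __getattr__(self, name: str):
        # Called only when normal attribute lookup fails.
        if name == "surface":
            return self.node.surface
        raise AttributeError(name)


token = DynamicToken(FakeNode("メロス"))
print(token.surface)                   # works at runtime
print("surface" in dir(DynamicToken))  # False: not declared on the class
```

Because surface is never declared on the class, tools that inspect the class definition have nothing to go on, which would explain the Any result.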
I think Python's built-in @property can do the same thing as the current __getattr__ implementation, while being friendlier to type checkers and auto-completion. For example:
@property
def surface(self) -> str:
    return self.node.surface

@property
def part_of_speech(self) -> str:
    return self.extra[0] if self.extra else self.node.part_of_speech
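A self-contained version of this idea might look like the sketch below. PropertyToken and FakeNode are hypothetical names for illustration only, and I am assuming that extra is either None or a list of strings.

```python
from typing import List, Optional


class FakeNode:
    """Hypothetical stand-in for the node object a token wraps."""

    def __init__(self, surface: str, part_of_speech: str) -> None:
        self.surface = surface
        self.part_of_speech = part_of_speech


class PropertyToken:
    """Hypothetical token that declares its attributes as properties."""

    def __init__(self, node: FakeNode, extra: Optional[List[str]] = None) -> None:
        self.node = node
        self.extra = extra  # assumption: None or a list of strings

    @property
    def surface(self) -> str:
        return self.node.surface

    @property
    def part_of_speech(self) -> str:
        # Prefer the first extra field when present, else fall back to the node.
        return self.extra[0] if self.extra else self.node.part_of_speech


token = PropertyToken(FakeNode("メロス", "名詞"))
print(token.surface)
print(isinstance(PropertyToken.surface, property))  # True
```

With the properties declared on the class and annotated, a checker such as mypy should be able to infer str for token.surface, and editors can offer the names for completion.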
Do you think that makes sense?