TextBlob
Feature: Could there be a much better way to analyze the text, with better performance?
A few days ago I suggested that a library called 'spaCy' implement this functionality; however, they didn't give it a moment's thought.
Their performance is bad, and they are not using the "re" module, which includes many useful searching and parsing methods.
The idea is to create a function that yields Python dictionary objects.
Each object includes all the attributes of a specific token and can be looked up by the token's (start, end) index.
For example:
text = 'Hello World'
blob = load(text)
for tok in blob:
    print(tok)
>>> {(0, 5): {'name': 'Hello', 'head': '', 'pos': ''}}
>>> {(6, 11): {'name': 'World', 'head': '', 'pos': ''}}
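A minimal sketch of what such a `load` function could look like, assuming tokens are simply runs of non-whitespace characters found with `re.finditer`. The `\S+` pattern is an assumption made for illustration; it is not part of TextBlob or spaCy, and a real tokenizer would need to handle punctuation.

```python
import re


def load(text):
    """Yield one dict per token, keyed by its (start, end) character span.

    Hypothetical helper: splits on whitespace only and leaves
    'head' and 'pos' empty until something fills them in.
    """
    for match in re.finditer(r"\S+", text):
        yield {match.span(): {"name": match.group(), "head": "", "pos": ""}}


for tok in load("Hello World"):
    print(tok)
# {(0, 5): {'name': 'Hello', 'head': '', 'pos': ''}}
# {(6, 11): {'name': 'World', 'head': '', 'pos': ''}}
```

Because `load` is a generator, no token is built until the caller asks for it.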
And it gets more interesting: any time the user requests the 'pos' or 'head' attribute, the program does the calculation only for that specific token and then adds the result to the dictionary, so it saves a lot of time.
tok[(0, 5)]['pos']
--> computes the 'pos' attribute of the token at key (0, 5) and stores it
in the 'pos' field.
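One way this on-demand behaviour could be sketched is with a `dict` subclass that uses `__missing__` to compute an attribute the first time it is requested and then caches it. The names `LazyToken` and `toy_pos` are hypothetical, and `toy_pos` is only a placeholder, not a real part-of-speech tagger. Unlike the example above, the attribute is absent (rather than an empty string) until it is first requested.

```python
class LazyToken(dict):
    """Token attributes computed on first access, then stored in the dict."""

    def __init__(self, name, computers):
        super().__init__(name=name)
        # Maps an attribute name to a function that takes the token text.
        self._computers = computers

    def __missing__(self, key):
        # dict calls this only when `key` is absent, so each attribute
        # is computed at most once per token.
        if key not in self._computers:
            raise KeyError(key)
        value = self._computers[key](self["name"])
        self[key] = value
        return value


calls = []  # records how often the placeholder tagger actually runs


def toy_pos(word):
    # Placeholder logic for illustration only.
    calls.append(word)
    return "NUM" if word.isdigit() else "WORD"


tok = {(0, 5): LazyToken("Hello", {"pos": toy_pos})}
tok[(0, 5)]["pos"]  # computed now and stored
tok[(0, 5)]["pos"]  # served from the dict, no recomputation
```

After both lookups, `calls` contains a single entry, which shows the calculation ran once for that token only.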
By Tamir Globus