
Feature: Could there be a much better way to analyze text, with better performance?

Open ghost opened this issue 5 years ago • 0 comments

Could there be a much better way to analyze text, with better performance?

A few days ago I suggested this functionality to the maintainers of a library called 'spaCy'; however, they didn't even stop to consider it.

Their performance is bad; moreover, they are not using the "re" module, which includes many useful searching/parsing methods.

The idea is to create a function that yields Python dictionary objects.

Each object includes all the attributes of a specific token and can be looked up by that token's index span.

For example:

text = 'Hello World'

# load() is the proposed function; it does not exist yet
blob = load(text)
for tok in blob:
    print(tok)

# Output:
{(0, 5): {'name': 'Hello', 'head': '', 'pos': ''}}
{(6, 11): {'name': 'World', 'head': '', 'pos': ''}}
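
To make the idea concrete, here is a minimal sketch of how such a function might be built on the "re" module. The name `load` comes from the example above; the `TOKEN_RE` pattern and its tokenization rule (runs of word characters, or single punctuation marks) are my own assumptions, not an existing TextBlob or spaCy API.

```python
import re

# Assumption: a token is a run of word characters or one punctuation mark.
TOKEN_RE = re.compile(r"\w+|[^\w\s]")


def load(text):
    """Yield one dict per token, keyed by its (start, end) character span."""
    for match in TOKEN_RE.finditer(text):
        # 'head' and 'pos' start empty; nothing is calculated at this stage.
        yield {match.span(): {"name": match.group(), "head": "", "pos": ""}}


for tok in load("Hello World"):
    print(tok)
```

Because `load` is a generator, tokens are produced one at a time and the whole text never has to be tokenized up front.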

It gets more interesting: whenever the user requests the 'pos' or 'head' attribute, the program would run the calculation for that specific token only and then add the result to the dictionary, which would save a lot of time.

tok[(0,5)]['pos']
--> computes the 'pos' attribute of the token at key (0, 5) and stores it in that token's 'pos' field.
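
One possible way to get this compute-on-request behavior is a dict subclass that fills in an empty attribute the first time it is read. This is only a sketch: `LazyToken`, `calculators`, and `toy_pos` are hypothetical names, `toy_pos` is a placeholder rule and not a real part-of-speech tagger, and treating an empty string as "not calculated yet" is an assumption taken from the example output above.

```python
def toy_pos(token):
    # Placeholder rule standing in for a real tagger (assumption only).
    return "NUM" if token["name"].isdigit() else "WORD"


class LazyToken(dict):
    """Token attributes that are calculated on first access, then stored."""

    # Hypothetical mapping from attribute name to the function that computes it.
    calculators = {"pos": toy_pos}

    def __getitem__(self, key):
        value = super().__getitem__(key)
        if value == "" and key in self.calculators:
            value = self.calculators[key](self)
            self[key] = value  # store it so the calculation runs only once
        return value


tok = {(0, 5): LazyToken(name="Hello", head="", pos="")}
print(tok[(0, 5)]["pos"])  # first access runs toy_pos and stores the result
print(dict(tok[(0, 5)]))   # 'pos' is now filled in; 'head' is still empty
```

Note that only square-bracket access triggers the calculation here; `dict.get` does not go through `__getitem__`, so a fuller design would need to cover it as well.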

By Tamir Globus

ghost, Oct 13 '19 18:10