Eric Kafe
Before getting too alarmed, we may want to wait for a sober analysis of this vulnerability, bearing in mind that it has been known for several years, without any known...
> Now that it has a public NIST CVE, people will exploit it. This isn't something you "wait" on. This is something you address right away... @nicolaschaillan, before even starting...
I inspected a typical pickle in each series, looking at their type() and, where present, their \_\_dict\_\_. The _tagsets_ and _averaged\_perceptron\_tagger_ packages contain only simple data structures, and can easily be translated...
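As a rough illustration of that kind of inspection, here is a minimal sketch that reports the type and attribute names of unpickled objects. The `describe()` helper and the toy samples are hypothetical stand-ins built in memory, not the actual NLTK data packages, and only self-created pickles are loaded.

```python
import pickle
from types import SimpleNamespace


def describe(obj):
    """Return the type name and the sorted attribute names (empty if no __dict__)."""
    attrs = sorted(vars(obj)) if hasattr(obj, "__dict__") else []
    return type(obj).__name__, attrs


# Toy pickles created here for the example (assumption: they only mimic the
# difference between a plain data structure and an object carrying attributes).
# Only ever unpickle data that you created yourself or fully trust.
samples = {
    "plain_dict": pickle.dumps({"NN": "noun"}),
    "toy_object": pickle.dumps(SimpleNamespace(weights={"bias": 1.0}, classes={"NN", "VB"})),
}

report = {name: describe(pickle.loads(blob)) for name, blob in samples.items()}
for name, (type_name, attrs) in report.items():
    print(name, type_name, attrs)
```

An object that reports an empty attribute list and a builtin type is the easy case: it holds only plain data and needs no class definition to be rebuilt.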
@alvations, it is great to hear that you believe in the "deep clean" solution, where the pickles are completely removed forever. There is also excellent news from gpt-4o, suggesting to...
> joblib uses pickle under the hood as well, so this wouldn't solve the issue.

Thanks @Dunedan! Instead, there seems to be a possibility to translate the pickles into Protobuf...
The old _punkt_ package is deprecated. Recent NLTK versions use _punkt\_tab_ instead.
Thanks again @Sion1225, your contributions are very welcome! To accommodate ML, I guess that support for multiprocessing is actually needed, but it could wait for a future PR. What...
It is also possible to shorten the function even further. Some professors don't appreciate this concise style, but it tends to be popular in nltk. In my opinion, concise code...
@Sion1225, there is no reason to worry about your dictionary, because it is equivalent to the following shorter helper function, which can be defined inside the lemmatize_text() function: ``` def...
I have started to measure the accuracy of this approach against the gold standard tags in the whole Brown Corpus, but I still need to tweak the tag2pos() function in...
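For readers who want to try a similar measurement, here is a minimal sketch of scoring a tag-to-POS mapping against gold labels. Everything in it is an assumption made for illustration: `tag2pos_sketch()` is a hypothetical first-letter mapping and not the tag2pos() function discussed here, and the gold pairs are hand-made toy data, not taken from the Brown Corpus.

```python
def tag2pos_sketch(tag):
    """Hypothetical mapping from a tag's first letter to a WordNet-style POS letter."""
    return {"J": "a", "V": "v", "R": "r"}.get(tag[:1], "n")


def accuracy(pairs, mapper):
    """Fraction of (tag, gold_pos) pairs for which mapper(tag) equals gold_pos."""
    if not pairs:
        return 0.0
    hits = sum(1 for tag, gold in pairs if mapper(tag) == gold)
    return hits / len(pairs)


# Hand-made toy pairs (assumption: illustrative only, not real corpus data).
# The last pair is chosen so that a first-letter rule gets it wrong.
toy_gold = [("NN", "n"), ("VBD", "v"), ("JJ", "a"), ("RB", "r"), ("RP", "n")]

score = accuracy(toy_gold, tag2pos_sketch)
print(f"accuracy: {score:.2f}")
```

Listing the mismatched pairs, rather than only the score, is usually the quickest way to see which tags the mapping still needs to handle.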