Bruce W. Lee

25 comments by Bruce W. Lee

"Automated assessment of written text" is a great paper. More recent examples would be [SOTA, non-neural 2016](https://www.aclweb.org/anthology/W16-0502/) and [SOTA, neural 2020](https://www.aclweb.org/anthology/2020.bea-1.1). Readability assessment is traditionally a very handcrafted feature-dependent...

Furthermore, in an application like this:

```python
import spacy
from ewiser.spacy.disambiguate import Disambiguator
from spacy.language import Language
import utils

nlp = spacy.load("en_core_web_sm", disable=['parser', 'ner'])

@Language.factory('wsd')
def wsd_engine(nlp, name):
    return Disambiguator('ewiser/ewiser.semcor+wngt.pt',...
```

Thanks for the comment. I'll make the pull request with appropriate modifications.

> > > i have the same problem, help pls, i need to end my final qualifying work in one week :(
> > >
> > > Hi @whk6688...

Just another user passing by. UTF-8 and ASCII definitely don't work; cp-1252 is your best bet. Delete the WNL Scarlett file before you run the code. But keep in mind that cp-1252...
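For anyone hitting the same decoding error, here is a minimal sketch of reading a file as cp-1252 in Python. The helper name `read_cp1252` and the `errors` switch are my own illustration, not part of the repo's code, so adapt it to wherever the file is actually opened.

```python
from pathlib import Path


def read_cp1252(path, errors="strict"):
    """Read a text file as Windows-1252 (cp1252). Hypothetical helper."""
    # errors="strict" raises UnicodeDecodeError on bytes cp1252 leaves undefined;
    # errors="replace" substitutes U+FFFD for those bytes instead of raising.
    return Path(path).read_text(encoding="cp1252", errors=errors)
```

Calling `read_cp1252("some_file.txt")` (path is a placeholder) returns the decoded string. Passing `errors="replace"` trades a hard failure for replacement characters, which may or may not be acceptable for your data.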

On TextsSeparatedByReading Level, try this script. This will give the output files that you need. @mnbucher

```python
from glob import glob
import csv
import os
import pandas as pd
import...
```

@mnbucher No. In my experience, the corpus size is the same as stated in the paper. In the above script, see `txt2csv` instead of `pairwise_txt2csv`. The latter creates pairwise instances.
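To show why the two outputs differ in size, here is a rough sketch of the distinction. It is not the code of `txt2csv` or `pairwise_txt2csv`; the function names, the dict layout, and the assumption that "pairwise" means pairing documents across reading levels are all mine.

```python
from itertools import combinations, product


def single_rows(texts_by_level):
    """One (text, level) row per document, so the row count equals the corpus size."""
    return [(text, level)
            for level, texts in texts_by_level.items()
            for text in texts]


def pairwise_rows(texts_by_level):
    """One row per cross-level document pair (assumed meaning of 'pairwise')."""
    rows = []
    # Iterate over every unordered pair of levels, then every document pair across them.
    for level_a, level_b in combinations(texts_by_level, 2):
        for text_a, text_b in product(texts_by_level[level_a], texts_by_level[level_b]):
            rows.append((text_a, level_a, text_b, level_b))
    return rows
```

Under that assumption, the pairwise output grows with the product of the level sizes, so its row count would not match the corpus size reported in the paper.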

This is interesting, but what use is an LLM if you are using word frequency to guess one word out of two?

Thank you for open-sourcing this repo! It's helping a lot with my research. Regarding the Stanza migration, unless you have a tight deadline, I could help. However, I doubt the accuracy...

Oh, so do you mean adding an option to use Stanza? Hmm, I'm familiar with both Stanza and spaCy, but the biggest trouble for me would be dealing with Spanish...