Kenneth Benoit comments

Results: 308 comments of Kenneth Benoit

Add spacy_install_langmodel()

This is a really good idea that would offer a solution for #180.

Error in py_run_file_impl(file, local, convert) : ModuleNotFoundError: No module named 'spacy'

Closing but if you are still having trouble, add a note.

Provide option to change nlp.max_length

Agreed! Perhaps we could add a set of spacyr_options either via `options()` or something like `quanteda_options()` that define defaults that can be redefined by the user, or reset prior...

Incorporation of pre-trained word embeddings functionality

Just experimented with this. A few comments on the branch. Since this is looking up the tokens from the language model using `Token.vector()`, we don't really need to do this...

Incorporation of pre-trained word embeddings functionality

How about
```r
# works on a spacyr parsed object
wordvectors_get.spacyr_parsed(x, model)
# works on a named list of characters, such as from spacy_tokenize()
wordvectors_get.list(x, model)
```
to return a...

Allow local models to be used in spacy_initialize()

Great idea! Should be pretty straightforward to implement.

Allow local models to be used in spacy_initialize()

We don't plan to provide tools for modifying or training language models, but if a user has custom language models, we agree that spacyr should allow these to be used....

spacyr wishlist

Not a bad idea. @amatsuo maybe add:
```r
spacy_tokenize(x, what = c("word", "sentence"),
               remove_numbers = FALSE, remove_punct = FALSE,
               remove_symbols = FALSE, remove_separators = TRUE,
               remove_twitter = FALSE, remove_hyphens =...
```

spacyr wishlist

@cecilialee No, for training a new language model you would need to do that in Python using the spaCy instructions. We are unlikely to add this facility to **spacyr** in the...

spacyr wishlist

@aourednik that would be
```r
devtools::install_github("quanteda/spacyr", ref = "tokenize-function")
```