Kenneth Benoit

Results 308 comments of Kenneth Benoit

This is a really good idea that would offer a solution for #180.

Agreed! Perhaps we could add a set of spacyr_options, either via `options()` or something like `quanteda_options()`, that define defaults that can be redefined by the user, or reset prior...
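A minimal sketch of how such a defaults mechanism might look when built on base R `options()`. The function name `spacyr_options()`, the option names `model` and `verbose`, and their default values are assumptions made for illustration; nothing here reflects an existing spacyr API.

```r
# Hypothetical defaults; the names and values are illustrative only.
spacyr_defaults <- list(model = "en", verbose = TRUE)

# Set, query, or reset the hypothetical package options.
# Values are stored in base R options() under a "spacyr." prefix.
spacyr_options <- function(..., reset = FALSE) {
  new <- if (reset) spacyr_defaults else list(...)
  if (length(new) > 0) {
    # every argument must be named and must match a known option
    if (is.null(names(new)) || any(names(new) == "")) {
      stop("all options must be named")
    }
    unknown <- setdiff(names(new), names(spacyr_defaults))
    if (length(unknown) > 0) {
      stop("unknown option(s): ", paste(unknown, collapse = ", "))
    }
    names(new) <- paste0("spacyr.", names(new))
    options(new)
  }
  # report current values, falling back to the defaults if unset
  current <- lapply(names(spacyr_defaults), function(nm) {
    getOption(paste0("spacyr.", nm), default = spacyr_defaults[[nm]])
  })
  names(current) <- names(spacyr_defaults)
  invisible(current)
}
```

Under these assumptions, `spacyr_options(model = "de")` would redefine a default, `spacyr_options()` would return the current values, and `spacyr_options(reset = TRUE)` would restore the originals.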

Just experimented with this. A few comments on the branch. Since this is looking up the tokens from the language model using `Token.vector()`, we don't really need to do this...

How about
```r
# works on a spacyr parsed object
wordvectors_get.spacyr_parsed(x, model)
# works on a named list of characters, such as from spacy_tokenize()
wordvectors_get.list(x, model)
```
to return a...
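The two method names proposed above fit R's S3 dispatch, which could be sketched roughly as follows. Only the names `wordvectors_get.spacyr_parsed` and `wordvectors_get.list` come from the comment. The generic, the helper `lookup_tokens()`, the `token` column, and the class vector of the toy object are assumptions. Because the comment is cut off before it says what is returned, the placeholder helper only returns the tokens that would be looked up.

```r
# Hypothetical generic that dispatches on the class of x.
wordvectors_get <- function(x, model, ...) {
  UseMethod("wordvectors_get")
}

# Method for a parsed object; assumes a data.frame with a `token` column.
wordvectors_get.spacyr_parsed <- function(x, model, ...) {
  lookup_tokens(unique(x$token), model)
}

# Method for a named list of character vectors.
wordvectors_get.list <- function(x, model, ...) {
  lookup_tokens(unique(unlist(x, use.names = FALSE)), model)
}

# Placeholder for the real lookup: reports which tokens would be queried.
lookup_tokens <- function(tokens, model) {
  list(model = model, tokens = tokens)
}

# Toy input standing in for parser output (assumed class vector).
parsed <- structure(
  data.frame(doc_id = "d1", token = c("a", "b", "a"), stringsAsFactors = FALSE),
  class = c("spacyr_parsed", "data.frame")
)

wordvectors_get(parsed, model = "en")
wordvectors_get(list(d1 = c("x", "y"), d2 = c("y", "z")), model = "en")
```

Both calls go through the same generic, so the user-facing function stays the same whichever input type is supplied.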

Great idea! Should be pretty straightforward to implement.

We don't plan to provide tools for modifying or training language models, but if a user has custom language models, we agree that spacyr should allow these to be used....

Not a bad idea. @amatsuo maybe add: ```r spacy_tokenize(x, what = c("word", "sentence"), remove_numbers = FALSE, remove_punct = FALSE, remove_symbols = FALSE, remove_separators = TRUE, remove_twitter = FALSE, remove_hyphens =...

@cecilialee No, for training a new language model you would need to do that in Python using the spaCy instructions. We are unlikely to add this facility to **spacyr** in the...

@aourednik that would be
```r
devtools::install_github("quanteda/spacyr", ref = "tokenize-function")
```