Kohei Watanabe

Results 81 issues of Kohei Watanabe

I am developing a website by customizing the Hugo Universal theme: https://quanteda.netlify.com/ The theme is amazing, but we noticed that the carousel and testimonial in both my website and the demo site...

If `tokens_compound()` is used, it should be recorded in the meta field, probably in `ngram`. ``` > attr(tokens_compound(tokens("aa bb cc"), phrase("aa bb")), "meta")$object $unit [1] "documents" $what [1] "word" $ngram...

Following the discussion on #2102, I created the [dev-skipgram2](https://github.com/quanteda/quanteda/blob/dev-skipgram2/src/tokens_compound_mt.cpp) branch to add `skip` to `tokens_compound()`. I managed to make it possible to generate skipgrams, but removing original tokens of compounds...

I am often frustrated because I can only segment documents into sentences on a corpus, but I came up with an idea to make it possible on tokens with a boundary marker....

We have `textstat_summary()` now, so it is time to make `summary.corpus()` as simple as `base::summary()`. We should also retire `add_summary_metadata()`.

documentation

Tokens are recompiled every time changes are made by `tokens_select()`, `tokens_subset()`, `tokens_compound()` etc., which adds a significant amount of execution time (20-30%). It is especially inefficient when multiple `tokens_select()` are...

performance

Users should stem tokens before forming ngrams, so we do not need `wordstem_ngrams()` anymore. https://github.com/quanteda/quanteda/blob/1d515f2d379647a873ed9b2c8c98504165c3bdb0/R/wordstem.R#L29-L48

It seems that users face problems when they work with ngrams: https://stackoverflow.com/questions/46685498/remove-ngrams-with-leading-and-trailing-stopwords Then, how about making an `ngram` wrapper similar to `phrase` that basically does this for users: ```r pattern...

enhancement
design

Cloning the remote repo takes a very long time because its size is now 500MB...

meta

We can slice the top level of dictionary objects with `[]`, but it is difficult to do in nested levels. With a dictionary ```r dict

dictionary