Kohei Watanabe

Results 164 comments of Kohei Watanabe

Thanks for the suggestion. I am planning to [redesign the kwic() functions](https://github.com/quanteda/quanteda/issues/1840). It is a good idea to move `window` to a printing function.

Removing stopwords before forming ngrams is best, but I would do post-selection like this: ```r txt

My proposed approach to selecting or stemming ngrams is to apply tokens functions to types via `as.phrase()`. ``` r require(quanteda) #> Loading required package: quanteda #> Package version: 2.9.9000 #> Unicode...

I think I know why it happened. Someone (could be me) committed and pushed large Rmd cache files that were automatically generated for a vignette. If this is the case, we have to remove...

We can definitely remove files under `text2vec_cache`. Data files can also be moved to online storage as RDS using `quanteda.corpora::download()`.

We should forget about our dirty past and start a new life with existing files only. For that, we only need to shallow clone and push to a new `quanteda/quanteda` repo. Old...

We have `types.tokens()` to extract vocabulary, so we will just pass it to `tokens()` to index words in new texts in the same way.

This is only for experimental purposes as it is untested and inefficient, but we could do ```r text1

I think there were not many functions for tokens when #127 was posted. Removal of `wordstem_ngrams()` is in line with our recommendation that users do whatever is possible on tokens, before...

This functionality is redundant because users can achieve the same result by combining other functions properly: apply `tokens_wordstem()` before `tokens_ngrams()`.
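
A minimal sketch of that order of operations, using quanteda's exported `tokens_wordstem()` and `tokens_ngrams()`. The example sentence and the object names (`txt`, `toks`, `toks_stem`, `toks_ngram`) are made up for illustration and do not come from the original thread:

```r
library(quanteda)

# Illustrative input text (an assumption, not from the original discussion)
txt <- c(d1 = "The runners were running quickly through the parks")

# Tokenize first, so that every later step works on a tokens object
toks <- tokens(txt, remove_punct = TRUE)

# Stem the individual tokens before any ngrams are formed
toks_stem <- tokens_wordstem(toks, language = "english")

# Form bigrams from the already-stemmed tokens
toks_ngram <- tokens_ngrams(toks_stem, n = 2)

print(toks_ngram)
```

Because stemming happens on unigrams, every component of each bigram is already stemmed, so no separate ngram-aware stemmer is needed.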