Kohei Watanabe comments

Results: 164 comments of Kohei Watanabe

A function to change windows of quanteda functions

Thanks for the suggestion. I am planning on [redesigning kwic() functions](https://github.com/quanteda/quanteda/issues/1840). It is a good idea to move `window` to a printing function.

Is there a better way to remove ngrams containing a stopword?

Removing stopwords before forming ngrams is best, but I would do post-selection like this: ```r txt
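For the first option mentioned in this comment, removing stopwords before forming ngrams, a minimal sketch might look like the following. It is not the post-selection code the comment goes on to show, which is cut off here. The example text and object names are made up for illustration, and the use of `padding = TRUE` is an assumption about how one would keep words on either side of a removed stopword from being joined into an ngram.

```r
library(quanteda)

# Hypothetical example text
txt <- c(d1 = "the cat sat on the mat")
toks <- tokens(txt)

# Remove stopwords but leave empty pads in their positions, so that
# words separated by a removed stopword are not treated as adjacent
toks_nostop <- tokens_remove(toks, pattern = stopwords("en"), padding = TRUE)

# Form bigrams from the remaining tokens; pads are not joined into ngrams
toks_bigram <- tokens_ngrams(toks_nostop, n = 2)
bigrams <- as.list(toks_bigram)$d1
print(bigrams)
```

Without `padding = TRUE`, the remaining words would close up and bigrams such as `sat_mat` could be formed from words that were never adjacent in the text.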

Is there a better way to remove ngrams containing a stopword?

My proposed approach to select or stem ngrams is to apply tokens functions to types via `as.phrase()`. ``` r require(quanteda) #> Loading required package: quanteda #> Package version: 2.9.9000 #> Unicode...

Shrink the size of repo

I think I know why it happened. Someone (could be me) committed and pushed large Rmd cache files automatically generated for the vignettes. If this is the case, we have to remove...

Shrink the size of repo

We can definitely remove the files under `text2vec_cache`. Data files can also be moved to online storage as RDS using `quanteda.corpora::download()`.

Shrink the size of repo

We should forget about our dirty past and start a new life with existing files only. For that, we only need to make a shallow clone and push to a new `quanteda/quanteda` repo. Old...

Add convert(x, to = "kerasR") functionality

We have `types.tokens()` to extract vocabulary, so we will just pass it to `tokens()` to index words in new texts in the same way.
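One way to picture the indexing described here is the sketch below. It is an illustration under assumptions, not the planned `convert()` implementation: the texts and object names are invented, and base R `match()` is used as a stand-in for whatever mechanism the comment has in mind for mapping new tokens onto the existing vocabulary.

```r
library(quanteda)

# Hypothetical training texts; their types form the vocabulary
toks_train <- tokens(c(d1 = "cats chase mice", d2 = "dogs chase cats"))
vocab <- types(toks_train)

# Hypothetical new text to be indexed against the same vocabulary
toks_new <- tokens(c(n1 = "mice chase birds"))

# Map each token to its position in the vocabulary;
# words not seen in training become NA
index <- lapply(as.list(toks_new), match, table = vocab)
print(index)
```

Words outside the vocabulary come back as `NA`, so a real conversion would still need a rule for handling them, such as dropping them or mapping them to a reserved index.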

Add convert(x, to = "kerasR") functionality

This is only for experimental purposes as it is untested and inefficient, but we could do ```r text1

Remove wordstem_ngrams()

I think there were not many functions for tokens when #127 was posted. Removal of `wordstem_ngrams()` is in line with our recommendation for users to do whatever is possible on tokens, before...

Remove wordstem_ngrams()

This functionality is redundant because users can achieve the same result by combining other functions properly: apply `tokens_wordstem()` before `tokens_ngrams()`.
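The stem-then-ngram order described here can be sketched as follows. The example text is invented, and the function names are the ones exported by released quanteda versions as far as I know (`tokens_wordstem()` for stemming and `tokens_ngrams()` for ngram formation).

```r
library(quanteda)

# Hypothetical example text
toks <- tokens(c(d1 = "running dogs barked loudly"))

# Stem the individual tokens first
toks_stem <- tokens_wordstem(toks, language = "english")

# Then form bigrams from the stemmed tokens
toks_ngram <- tokens_ngrams(toks_stem, n = 2)
stemmed_bigrams <- as.list(toks_ngram)$d1
print(stemmed_bigrams)
```

Because stemming happens on single tokens, every component of each ngram is stemmed, which is the effect a dedicated ngram stemmer would otherwise have to provide.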