Kohei Watanabe
created by @hirokokinoshita
"タイ*" for Thailand produces a lot of false matches. For example, "タイヤ" (tire), "タイム" (time), "タイミング" (timing), "タイプ" (type), "タイトル" (title), "タイガー" (tiger). This is a good reminder that we...
Let's imagine we need to make artificial paragraphs by combining 2 adjacent sentences, run some operation on them, and then restore the original documents. This might be a rare case, but raises...
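A minimal base R sketch of that round trip, assuming sentences are stored one per row with a document identifier. The data and all column names are hypothetical, and this is not how any particular package implements it.

```r
# Hypothetical input: one row per sentence, tagged with its source document.
sents <- data.frame(
  doc = c("d1", "d1", "d1", "d2", "d2"),
  text = c("A one.", "A two.", "A three.", "B one.", "B two."),
  stringsAsFactors = FALSE
)

# Number sentences within each document, then pair neighbours: (1,2), (3,4), ...
pos <- ave(seq_along(sents$doc), sents$doc, FUN = seq_along)
para <- (pos + 1) %/% 2

# Index each (document, paragraph) pair in order of first appearance.
key <- paste(sents$doc, para, sep = "\r")
grp <- match(key, unique(key))

# Artificial paragraphs; the document id is kept so the grouping can be undone.
paras <- data.frame(
  doc = sents$doc[!duplicated(grp)],
  text = as.vector(tapply(sents$text, grp, paste, collapse = " ")),
  stringsAsFactors = FALSE
)

# ... run some operation on paras$text here ...

# Restore the original documents by concatenating paragraphs per document.
docs <- tapply(paras$text, paras$doc, paste, collapse = " ")
```

Keeping the document identifier on every paragraph is what makes the restoration step possible; a document with an odd number of sentences simply ends with a one-sentence paragraph.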
`word2vec.list()` works with a list of character vectors. It serializes the tokens from character vectors to integer vectors in R. This can be done differently, but this is the simplest approach. We...
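The idea of serializing tokens to integers can be illustrated in base R. This is a sketch of the general technique under assumed inputs, not the actual code inside `word2vec.list()`.

```r
# Hypothetical tokenized documents: a list of character vectors.
toks <- list(c("a", "b", "a"), c("c", "b"))

# Vocabulary in order of first appearance.
vocab <- unique(unlist(toks, use.names = FALSE))

# Replace each token by its integer position in the vocabulary.
ids <- lapply(toks, match, table = vocab)

# The mapping is reversible by indexing back into the vocabulary.
restored <- lapply(ids, function(i) vocab[i])
```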
@LungtaSEKI created a Turkish dictionary for newsmap.
Add a function to delete cache files so that disks do not become full. It will be called `remove_cache(time = 3600 * 24 * 7)` or something similar and remove cache files older than...
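One way such a function could look is sketched below. The `dir` argument is hypothetical, and `time` is assumed to be a maximum file age in seconds (so the default would be one week); the issue text does not confirm either.

```r
# Sketch only: `dir` is a hypothetical argument, and `time` is assumed
# to be the maximum age of a cache file in seconds.
remove_cache <- function(dir, time = 3600 * 24 * 7) {
  files <- list.files(dir, full.names = TRUE)
  # Age of each file based on its last modification time.
  age <- as.numeric(difftime(Sys.time(), file.mtime(files), units = "secs"))
  old <- files[!is.na(age) & age > time]
  unlink(old)
  invisible(old)  # return the removed paths without printing them
}
```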
Should all the beta values be `NA` when there are no seed words?
If the cache files are broken, overwrite them with a new cache in `cache_glove()` and `cache_svd()`. It could be as simple as `try(readRDS(x))`.
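A minimal sketch of the `try(readRDS(x))` fallback follows. The helper name `read_cache()` and its `build` argument are hypothetical and stand in for whatever `cache_glove()` and `cache_svd()` do to recompute their results.

```r
# Sketch: `read_cache()` and `build` are hypothetical names for illustration.
read_cache <- function(file, build) {
  if (file.exists(file)) {
    # A broken file makes readRDS() fail; try() turns that into a value.
    result <- try(readRDS(file), silent = TRUE)
    if (!inherits(result, "try-error")) return(result)
  }
  # Missing or unreadable cache: rebuild and overwrite the file.
  result <- build()
  saveRDS(result, file)
  result
}
```

For example, `read_cache(path, function() expensive_fit())` would reuse a valid cache and silently replace a corrupted one, where `expensive_fit()` is a placeholder for the real computation.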
The values are saved in an attribute of the seed words so that they can be passed to `textmodel_lss()`.