LSX
Semi-supervised algorithm for document scaling
Add a function to delete cache files so that disks do not fill up. It would be called `remove_cache(time = 3600 * 24 * 7)` or something similar, and would remove cache files older than...
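A minimal sketch of what such a cleanup function could look like, using only base R. The `dir` argument, its default value `"lss_cache"`, and the `.RDS` file pattern are assumptions made for illustration; the issue only proposes the function name and the `time` argument (in seconds).

```r
# Hypothetical sketch, not the package's implementation.
# `dir` and the ".RDS" pattern are assumed; only `time` comes from the proposal.
remove_cache <- function(time = 3600 * 24 * 7, dir = "lss_cache") {
    files <- list.files(dir, pattern = "\\.RDS$", full.names = TRUE,
                        ignore.case = TRUE)
    if (length(files) == 0)
        return(invisible(character()))
    # Age of each file in seconds, based on its last modification time
    age <- as.numeric(difftime(Sys.time(), file.mtime(files), units = "secs"))
    old <- files[age > time]
    unlink(old)
    # Return the paths of the deleted files invisibly
    invisible(old)
}
```

Calling `remove_cache()` with the default would delete cache files last modified more than seven days ago; returning the deleted paths makes the function easy to check in a unit test.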
Should all the beta values be `NA` when there are no seed words?
If the cache files are broken, `cache_glove()` and `cache_svd()` should overwrite them with a new cache. The check could be as simple as `try(readRDS(x))`.
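The `try(readRDS(x))` idea could be sketched as follows. The helper name `read_cache_or_rebuild()` and its `build` argument are hypothetical and only stand in for whatever `cache_glove()` and `cache_svd()` do to compute a fresh result.

```r
# Hypothetical helper; the name and the `build` argument are illustrative only.
# `x` is the path to a cache file, `build` is a function that recomputes the value.
read_cache_or_rebuild <- function(x, build) {
    # try() turns a read error from a broken file into a "try-error" object
    result <- if (file.exists(x)) try(readRDS(x), silent = TRUE) else NULL
    if (is.null(result) || inherits(result, "try-error")) {
        result <- build()
        saveRDS(result, x)  # overwrite the missing or broken cache
    }
    result
}
```

With this shape, a corrupted file is treated the same way as a missing one, so the caller never has to delete broken caches by hand.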
The values are saved as an attribute of the seed words so that they can be passed to `textmodel_lss()`.
DFMs whose text unit is the sentence often become too large for a regular laptop, so we have to reduce memory usage by using off-memory matrices, for example via [**bigstatsr**](https://github.com/privefl/bigstatsr). `big_randomSVD()` seems very...
Unit tests needed for:
- [ ] seed values (character vector, named-numeric vector, dictionary, and invalid values)
- [ ] `predict`, `coef` and `summary` methods