Victor Lafargue
rerun tests
@adityak74 this is indeed the issue. Could work on a fix for this: https://github.com/rapidsai/cuml/pull/5971.
If you want to scale well across multiple GPUs, I would recommend using the `HashingVectorizer` as a replacement for the `CountVectorizer`. It should yield good results while being stateless /...
An alternative solution is simply to reduce the number of features (`n_features`, default `2**20`). I recommend using cuML's `HashingVectorizer` (instead of DaskML) for GPU support,...
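To illustrate why a hashing vectorizer needs no shared vocabulary and how `n_features` bounds the output width, here is a minimal pure-Python sketch of the hashing trick. It is not cuML's implementation: the helper name `hashed_counts` and the use of MD5 are assumptions made for illustration only, and library implementations typically use a different hash function.

```python
import hashlib
from collections import Counter


def hashed_counts(doc, n_features=2**20):
    """Toy hashing trick: map a document to a sparse {column: count} dict.

    Hypothetical helper for illustration; not the cuML API.
    """
    counts = Counter()
    for token in doc.lower().split():
        # A stable hash (unlike Python's salted built-in hash()) so that
        # every worker maps the same token to the same column.
        digest = hashlib.md5(token.encode("utf-8")).digest()
        col = int.from_bytes(digest[:8], "little") % n_features
        counts[col] += 1
    return dict(counts)


# Two independent "workers" vectorize separate partitions without
# exchanging a vocabulary; a smaller n_features narrows the output.
part_a = hashed_counts("gpu gpu dask", n_features=2**10)
part_b = hashed_counts("dask cluster", n_features=2**10)
```

Because the column index depends only on the token and `n_features`, each partition can be processed in isolation. The trade-off of lowering `n_features` is that more distinct tokens may collide in the same column.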
Are we planning to fix spectral initialization already or should I open a PR to update the documentation regarding this limitation for now? cc @cjnolet @dantegd