Nick Becker comments

Results: 180 comments by Nick Becker

Does BorutaPy work with cuML RandomForestClassifier?

Thanks for linking that issue, @Wuuzzaa! @lindeberg25, we'd love to learn more about your use case and the performance impact of using cuML's Random Forest vs. scikit-learn's RF. Let's...

Results of umap-learn and UMAP by cuML are different

@hectorpatino, have you confirmed that the data going into the UMAP calls is the same? The notebooks are slightly different. I recommend filing a [cuML issue](https://github.com/rapidsai/cuml/issues/) that includes a minimal,...

[FEA] Increase maximum characters in strings columns

This is incredibly exciting! More than any individual string operation, one of the most common pain points I see in workflows is the inability to bring strings along as a...

Q28 fails in automated nightly runs

I cannot consistently reproduce this (though others have seen it as well). There may be something subtle happening with the Naive Bayes classifier.

Q28 fails in automated nightly runs

Lost the logs from the failure in the automated nightly run, unfortunately. Could not reproduce this with 100 consecutive runs of Q28. Will be triggering a few long-running tests to...

[DISCUSSION] Consider increasing default host memory limit per dask-cuda-worker

Thanks for the bump, John. Anecdotally, we find that the most effective setup includes setting the host memory limit as the maximum available system memory (`(free -m | awk '/^Mem:/{print...

[DISCUSSION] Consider increasing default host memory limit per dask-cuda-worker

> IMO, this would be too dangerous for a default. ... I agree with Peter here. What's most effective for a given workflow doesn't necessarily translate to what's most effective...

[BUG]: Error: Kernel dies when creating a random forest model with use_gpu=True.

@RumiAllbert are you also only experiencing this using WSL2?

[BUG]: Error: Kernel dies when creating a random forest model with use_gpu=True.

Can you provide details about your environment, library versions, and Python version?

[FEA] Reduce peak memory requirements of HDBSCAN fit with large min_samples

I believe this is actually driven by `min_samples` (which defaults to None and is then set to the value of `min_cluster_size`), as this is what determines the minimum number of neighbors...
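
The defaulting rule mentioned in this comment can be illustrated with a minimal sketch. The helper name `resolve_min_samples` is hypothetical and is not part of cuML or any HDBSCAN library; it only mirrors the behavior described above, where an unset `min_samples` takes the value of `min_cluster_size`.

```python
from typing import Optional


def resolve_min_samples(min_cluster_size: int, min_samples: Optional[int] = None) -> int:
    """Hypothetical helper (not a library API): return the effective min_samples.

    Mirrors the rule described in the comment: when min_samples is None,
    it falls back to min_cluster_size.
    """
    if min_samples is None:
        return min_cluster_size
    return min_samples


# With min_samples left unset, a large min_cluster_size also implies
# a large effective min_samples.
effective_default = resolve_min_samples(min_cluster_size=500)

# Setting min_samples explicitly decouples it from min_cluster_size.
effective_explicit = resolve_min_samples(min_cluster_size=500, min_samples=10)

print(effective_default, effective_explicit)
```

Under this assumption, passing a small explicit `min_samples` is one way to keep the neighbor count low while still using a large `min_cluster_size`.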