Thomas Li
I'm planning to implement this as part of porting the Parquet reader to pylibcudf.
Can you try using the cudf.pandas.profile magic? https://docs.rapids.ai/api/cudf/stable/cudf_pandas/usage/#understanding-performance-the-cudf-pandas-profiler I think this should tell you which operations are running on the GPU and which are running on CPU.
Thanks for the reviews! I'll probably put this back in draft while your cudf.polars PRs are getting merged. I think I'm already hitting some conflicts with your PRs, so probably best...
This needs to wait until compat with polars 1.1 is achieved.
While LDA was a bit hard to port over to dask, PCA worked perfectly out of the box! (which I think is a pretty big win for the array API)...
Re: performance. **Testing out of core** for dask generated by dask-ml with parameters `n_samples=100_000`, `n_classes=2`, `n_informative=5` on a Gitpod machine with 2 cores and 8 GB RAM, I...
FYI, array-api-compat fixes are ongoing here https://github.com/data-apis/array-api-compat/pull/110
Since array-api-compat 1.5.1 came out and CI is green here, I'm going to be marking this PR as ready for review. The only other change I'm planning right now is...
> > The only other change I'm planning right now is splitting out the LDA changes, since that requires a patch to dask itself. > > Sounds good, however in...