[BUG] Dask PCA

Open Intron7 opened this issue 9 months ago • 2 comments

Describe the bug

The value returned by transform has no concrete shape (its size is unknown), so it can't be integrated into an existing data structure without first calling compute_chunk_sizes() on it.

Steps/Code to reproduce bug

from dask_cuda import LocalCUDACluster
from dask.distributed import Client, wait
import cupy as cp
from cuml.dask.decomposition import PCA
from cuml.dask.datasets import make_blobs

cluster = LocalCUDACluster(threads_per_worker=1)
client = Client(cluster)

nrows = 6
ncols = 3
n_parts = 2

X_cudf, _ = make_blobs(n_samples=nrows, n_features=ncols,
                       centers=1, n_parts=n_parts,
                       cluster_std=0.01, random_state=10,
                       dtype=cp.float32)


cumlModel = PCA(n_components=1, whiten=False)
XT = cumlModel.fit_transform(X_cudf)
# Reported symptom: the result has no usable shape at this point
print(XT.shape)
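
For readers without a GPU, the symptom and the compute_chunk_sizes() workaround can be sketched with plain Dask on the CPU. This is only an illustration under assumptions: the boolean-mask step is a stand-in used to produce an array with unknown chunk sizes, not what cuML does internally, and the names x, xt and unknown_before are hypothetical.

```python
import numpy as np
import dask.array as da

# Stand-in for the distributed PCA output (assumption, not cuML code):
# boolean masking yields a Dask array whose chunk sizes along axis 0
# are unknown until the graph is evaluated.
x = da.from_array(np.arange(18, dtype=np.float32).reshape(6, 3), chunks=(3, 3))
xt = x[x[:, 0] >= 0]  # keeps every row, but Dask cannot know that lazily

# With unknown chunks, the row count in .shape is NaN.
unknown_before = bool(np.isnan(xt.shape[0]))

# compute_chunk_sizes() evaluates the graph to fill in the missing
# chunk sizes and returns the array with a concrete shape.
xt = xt.compute_chunk_sizes()
print(unknown_before, xt.shape)
```

Note that compute_chunk_sizes() triggers computation, so on a large array it is not free; it is a workaround rather than a fix for the missing shape.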

Expected behavior

A clear and concise description of what you expected to happen.

Environment details (please complete the following information):

  • Environment location: Bare-metal
  • Linux Distro/Architecture: Ubuntu 22.04 amd64
  • GPU Model/Driver: [3090 and driver 550.54.15]
  • CUDA: [11.8]

Intron7, May 03 '24 14:05

@Intron7 thanks for reporting this! I'll try to repro and find a fix as soon as we can.

dantegd, May 03 '24 17:05

@dantegd I have an easy workaround for this; #5555 is way more important imo.

Intron7, May 03 '24 18:05