cuml
[BUG] Dask PCA
Describe the bug The value returned by transform has no known shape, so it can't be integrated into an existing data structure without first calling compute_chunk_sizes() on it.
Steps/Code to reproduce bug
from dask_cuda import LocalCUDACluster
from dask.distributed import Client, wait
import cupy as cp
from cuml.dask.decomposition import PCA
from cuml.dask.datasets import make_blobs
cluster = LocalCUDACluster(threads_per_worker=1)
client = Client(cluster)
nrows = 6
ncols = 3
n_parts = 2
X_cudf, _ = make_blobs(n_samples=nrows, n_features=ncols,
                       centers=1, n_parts=n_parts,
                       cluster_std=0.01, random_state=10,
                       dtype=cp.float32)
cumlModel = PCA(n_components=1, whiten=False)
XT = cumlModel.fit_transform(X_cudf)
print(XT.shape)
Expected behavior A clear and concise description of what you expected to happen.
Environment details (please complete the following information):
- Environment location: Bare-metal
- Linux Distro/Architecture: Ubuntu 22.04 amd64
- GPU Model/Driver: 3090 and driver 550.54.15
- CUDA: 11.8
@Intron7 thanks for reporting this! I'll try to repro and find a fix as soon as we can.
@dantegd I have an easy workaround for this, so #5555 is way more important imo.
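The workaround is not spelled out in the thread. As a rough sketch of what the report describes, and assuming the transform output is a Dask array with unknown chunk sizes, the snippet below uses a CPU-only stand-in (a boolean-masked NumPy-backed array, not the cuML output) to show how `compute_chunk_sizes()` from `dask.array` resolves the shape:

```python
import numpy as np
import dask.array as da

# Stand-in data (assumption): 6 rows, 2 columns, split into 2 row chunks.
x = da.from_array(np.arange(12, dtype=np.float32).reshape(6, 2), chunks=(3, 2))

# Boolean masking leaves the row count of each chunk unknown,
# which is used here to mimic an array whose shape is not known up front.
y = x[x[:, 0] >= 0]
unknown_shape = y.shape  # first dimension is NaN at this point

# compute_chunk_sizes() evaluates the chunks and updates the array in place.
y.compute_chunk_sizes()
known_shape = y.shape

print(unknown_shape, known_shape)
```

Note that `compute_chunk_sizes()` triggers computation of the array, so on large data it is not free.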