Tamas Bela Feher comments

Results: 82 comments of Tamas Bela Feher

Optimize euclidean distance in host refine phase

/ok to test 2de39ef

Optimize euclidean distance in host refine phase

/ok to test 18fe20f

Optimize euclidean distance in host refine phase

/merge

Add option to enable "sve" optimization level on armv9

/ok to test ca55b77

chore: Bump Rollup

I think it is not exactly the same issue. In the example above, I carefully create the input data so that we have a single partition per worker. There is...

Yes, it is row major, [as expected by the c++ layer](https://github.com/rapidsai/cuml/blob/6342914f3ad8460da70c2b53cdedd84657531a0b/python/cuml/cluster/kmeans_mg.pyx#L114)
```
print(dataset_gpu.blocks.ravel()[0].compute().flags)
C_CONTIGUOUS : True
F_CONTIGUOUS : False
OWNDATA : True
```
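For readers unfamiliar with the flags shown above, here is a minimal sketch of the same kind of layout check on a plain NumPy array. It is an illustration only: the array `block` and its shape are hypothetical stand-ins, not the dataset from the issue, and the sketch does not involve Dask or the GPU.

```python
import numpy as np

# Hypothetical stand-in for one block of a dataset; the shape is arbitrary.
block = np.arange(12, dtype=np.float32).reshape(3, 4)

# NumPy creates row-major (C order) arrays by default.
row_major = block.flags["C_CONTIGUOUS"]
col_major = block.flags["F_CONTIGUOUS"]

# A column-major (Fortran order) copy holds the same values in a different layout.
block_f = np.asfortranarray(block)

# ascontiguousarray returns a row-major array, copying only if needed.
block_c = np.ascontiguousarray(block_f)

print(block.flags)
```

If a library expects row-major input, converting with `np.ascontiguousarray` before handing the data over is one way to guarantee the layout; note that the conversion itself may allocate a copy.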

chore: Bump Rollup

The memory duplication seems to happen just around the time when the clustering finishes. Here is a time trace of the memory usage ``` nvidia-smi -i 0 --query-gpu=index,timestamp,memory.used --format=csv...

chore: Bump Rollup

Thanks @dantegd for the fix! After this I believe there is still a duplication happening as I [describe above](https://github.com/rapidsai/cuml/issues/5936#issuecomment-2168846570). We have a suspicion that this is related to how we...

chore: Bump Rollup

@achirkin, above you write:
> predict enter: free = 17011.441664 MB #

chore: Bump Rollup

> Indeed, in this case it allocates another copy of the dataset on GPU for a short time during call to `predict`, but Dante [has already added the fix you...