oneDAL
KMeans Init Sparsity Support
Add sparsity support to KMeans Init and fix a few bugs in DAAL sparse KMeans++ init, oneDAL KMeans++ init, and KMeans infer. Specific changes planned or made in this PR:
- [x] Fix distance calculation for sparse data in DAAL KMeans++
- [x] Allow oneDAL KMeans++ init to take `n_trials`, same as DAAL and scikit-learn
- [x] Fix difference between DAAL KMeans++ dense and sparse results
- [x] Implement KMeans init sparse support for CPU (calls the DAAL CPU implementation)
- [x] Fix oneDAL KMeans sparse infer on GPU
- [x] Update KMeans infer for sparse data to allow the same result options as dense
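
For reviewers who want a reference point for the sparse distance fix, here is a minimal sketch of computing the squared Euclidean distance between one CSR row and a dense centroid so that it matches the dense result. It is not the DAAL code; the function names `dense_sq_dist` and `csr_row_sq_dist` are hypothetical and the CSR row is assumed to be given as its nonzero values plus their column indices.

```python
def dense_sq_dist(row, centroid):
    """Squared Euclidean distance between two dense vectors."""
    return sum((x - c) ** 2 for x, c in zip(row, centroid))


def csr_row_sq_dist(values, col_idx, centroid, centroid_sq_norm=None):
    """Squared Euclidean distance between one CSR row and a dense centroid.

    values / col_idx hold the row's nonzero entries and their columns.
    Implicit zeros contribute c_j**2 each, so start from ||c||^2 and
    correct only the columns where the row has a stored value.
    """
    if centroid_sq_norm is None:
        centroid_sq_norm = sum(c * c for c in centroid)
    dist = centroid_sq_norm
    for v, j in zip(values, col_idx):
        c = centroid[j]
        # Replace the implicit-zero term c**2 with the true term (v - c)**2.
        dist += (v - c) ** 2 - c * c
    return dist


# The same row in dense form and in CSR form.
row = [0.0, 3.0, 0.0, 4.0]
values, col_idx = [3.0, 4.0], [1, 3]
centroid = [1.0, 1.0, 1.0, 1.0]

assert csr_row_sq_dist(values, col_idx, centroid) == dense_sq_dist(row, centroid)
```

A common way for sparse and dense results to diverge is to sum only over the stored nonzeros and drop the centroid contribution at implicit zeros; the sketch keeps that contribution through the `centroid_sq_norm` term.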
I have verified that:
- [x] DAAL KMeans init results are the same for sparse and dense data
- [x] oneDAL KMeans init results are the same for sparse and dense data
- [ ] oneDAL KMeans init results are the same on CPU and GPU
  - Not the same for dense data unless the initial centroids for dense GPU are computed with the CPU implementation
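
As background for the `n_trials` item, here is a rough sketch of greedy KMeans++ seeding, assuming `n_trials` plays the same role as scikit-learn's `n_local_trials`: for each new centroid, several candidates are sampled and the one that lowers the total potential the most is kept. This is not the DAAL or oneDAL implementation; `kmeans_pp_init` and `sq_dist` are hypothetical names, and the sampling uses Python's `random` module, so results will not match either library bit for bit.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two dense points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_pp_init(data, n_clusters, n_trials, seed):
    """Greedy KMeans++ seeding with n_trials candidates per centroid."""
    rng = random.Random(seed)
    # First centroid: uniform random pick.
    centroids = [data[rng.randrange(len(data))]]
    # closest[i] = squared distance from point i to its nearest centroid.
    closest = [sq_dist(x, centroids[0]) for x in data]

    while len(centroids) < n_clusters:
        total = sum(closest)
        best_pot, best_idx, best_closest = None, None, None
        for _ in range(n_trials):
            # Sample a candidate with probability proportional to closest[i].
            r = rng.random() * total
            acc = 0.0
            idx = len(data) - 1  # fallback if rounding prevents a hit
            for i, d in enumerate(closest):
                acc += d
                if acc > r:
                    idx = i
                    break
            # Potential if this candidate were added as a centroid.
            new_closest = [min(d, sq_dist(x, data[idx]))
                           for d, x in zip(closest, data)]
            pot = sum(new_closest)
            if best_pot is None or pot < best_pot:
                best_pot, best_idx, best_closest = pot, idx, new_closest
        centroids.append(data[best_idx])
        closest = best_closest
    return centroids


points = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0], [20.0, 0.0]]
init = kmeans_pp_init(points, n_clusters=3, n_trials=2, seed=0)
```

Because every step is driven by the random stream, two backends agree on the initial centroids only if they consume random numbers in the same order and compute identical distances, which is consistent with the CPU/GPU mismatch noted in the last checklist item.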