
Compatibility of ep.pp.knn_impute with different array types

Open eroell opened this issue 1 year ago • 2 comments

Description of feature

This function operates on .X or .layers, so 3D support, array type, and encoding are all relevant.

3D

If 3D allowed:

  • [ ] function handles 3D data
  • [ ] the test of the function tests both the expected 2D and 3D implementation

If 3D not allowed:

  • [ ] only_2D decorator
  • [ ] the test of the function tests for failure if 3D is passed
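
As an illustration of the "3D not allowed" path, here is a minimal sketch of what a 2D-only guard and its failure test could look like. The names `only_2d_sketch` and `column_mean_impute` are hypothetical and are not ehrapy's actual `only_2D` decorator or `ep.pp.knn_impute`; the guard is assumed to work on a plain array rather than on the data object ehrapy uses.

```python
import functools

import numpy as np


def only_2d_sketch(func):
    """Hypothetical stand-in for a 2D-only guard (not ehrapy's real decorator)."""

    @functools.wraps(func)
    def wrapper(arr, *args, **kwargs):
        # Reject anything that is not a 2D (observations x variables) array.
        if arr.ndim != 2:
            raise ValueError(
                f"{func.__name__} only supports 2D data, got {arr.ndim}D"
            )
        return func(arr, *args, **kwargs)

    return wrapper


@only_2d_sketch
def column_mean_impute(arr):
    """Toy imputer used only for this sketch: fill NaNs with column means."""
    col_means = np.nanmean(arr, axis=0)
    return np.where(np.isnan(arr), col_means, arr)


# 2D input is imputed, 3D input fails loudly.
imputed = column_mean_impute(np.array([[1.0, np.nan], [3.0, 4.0]]))
try:
    column_mean_impute(np.zeros((2, 2, 2)))
except ValueError as err:
    message_3d = str(err)
```

A test for the second checkbox would then assert that passing a 3D array raises, for example with `pytest.raises(ValueError)`.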

array-type

The available array types are np.array, dask.array, and scipy.sparse matrices.

  • [ ] function is single-dispatched, with potential not_implemented errors being raised
  • [ ] test is parametrized to test the different array-types
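
The single-dispatch item could be sketched roughly as follows, using `functools.singledispatch` from the standard library. Everything here is an assumption for illustration: `impute_sketch` is a hypothetical name, the NumPy branch uses a column-mean fill as a placeholder rather than real KNN imputation, and the sparse branch simply raises to show the "not implemented" path. Unregistered types (for example a dask array) would hit the generic fallback.

```python
from functools import singledispatch

import numpy as np
from scipy import sparse


@singledispatch
def impute_sketch(X):
    """Fallback for array types without a registered implementation."""
    raise NotImplementedError(
        f"imputation is not implemented for array type {type(X).__name__}"
    )


@impute_sketch.register(np.ndarray)
def _(X):
    # Placeholder logic (NOT KNN): fill NaNs with column means so the
    # dispatch mechanism can be demonstrated without extra dependencies.
    col_means = np.nanmean(X, axis=0)
    return np.where(np.isnan(X), col_means, X)


@impute_sketch.register(sparse.spmatrix)
def _(X):
    # Explicitly unsupported in this sketch: raise instead of densifying.
    raise NotImplementedError("imputation is not implemented for sparse matrices")


dense_result = impute_sketch(np.array([[1.0, np.nan], [3.0, 4.0]]))
```

A parametrized test would then loop over the array types, expecting a result for the supported ones and `NotImplementedError` for the others, for example via `pytest.mark.parametrize`.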

eroell avatar Jan 08 '25 22:01 eroell

At first glance, dask-ml does not have a K-neighbors imputer like sklearn's.

Having one here would be a significant addition. It is a non-trivial effort, though, and this scaling improvement will require some engineering.

eroell avatar Jan 08 '25 22:01 eroell

This might also require work in https://github.com/Zethson/fknni/. Let me know if one of you needs access and wants to work on that.

Zethson avatar Sep 25 '25 16:09 Zethson