polars
polars copied to clipboard
`Series.set_at_idx` requires NumPy
Originally discussed in #3890
Polars no longer has NumPy as a dependency. But interop with NumPy obviously still requires NumPy to be installed.
There is one more case where NumPy is required, which is for Series.set_at_idx, and by extension, Series.__setitem__. It would be great if we can adjust this function to work without NumPy.
Ritchie's left some pointers for this in the original thread:
We must update the function at the rust side to accept a
Series. That series should then be cast on toIdxSizeand checked for null values. If null values present we should throw aValueError. We could update the macro. Instead ofidx: &[u32]we can accept aPySeries.macro_rules! impl_set_at_idx { ($name:ident, $native:ty, $cast:ident, $variant:ident) => { fn $name(series: &Series, idx: &[u32], value: Option<$native>) -> Result<Series> { let ca = series.$cast()?; let new = ca.set_at_idx(idx.iter().map(|idx| *idx as usize), value)?; Ok(new.into_series()) } #[pymethods] impl PySeries { pub fn $name(&self, idx: &PyArray1<u32>, value: Option<$native>) -> PyResult<Self> { let idx = unsafe { idx.as_slice().unwrap() }; let series = $name(&self.series, &idx, value).map_err(PyPolarsErr::from)?; Ok(Self::new(series)) } } }; }Eventually I want to remove this macro and re-implement
settingwith our new mutable API and proper python conversions, but that is not an easy "first issue", so I'd recommend to first make it work with the old approach.Originally posted by @ritchie46 in https://github.com/pola-rs/polars/issues/3890#issuecomment-1175984672
For arrays backed by numeric types (numbers and logical types like dates), this has been done in #4121.
The changes in the dispatch to the implementation is another pointer of how we should achieve this for the other datatypes. The FFI dictionary lookup should not be used anymore. That's an archaic solution dating from the time I did not understand the pyo3 lib good enough.
scatter no longer uses numpy