Views with non-fixed steps

Open nilgoyette opened this issue 6 years ago • 8 comments

I work with big 4D images, often iterating on Axis(3) to modify it or create a new image. In some cases, I need to iterate only on some indices:

let dwi = Array4::whatever; // Loaded from disk
let dwi_indices = vec![2, 3, 10, 13, 17, 21 ...]; // Calculated from dwi
let (nx, ny, nz, _) = dwi.dim();
let mut output = Array4::zeros((nx, ny, nz, dwi_indices.len()));
// Pair each selected index with the output volume it fills
Zip::from(&dwi_indices).and(output.axis_iter_mut(Axis(3))).apply(|&i, mut volume| {
    let dwi_volume = dwi.index_axis(Axis(3), i);
    volume.assign(...);
});

This method works and I can keep it, but is there a way to ask for a view with the right indices? select is nice, but it makes a copy of the data. s! asks for a step, which I can't give because the indices are "random". Is there anything like let dwi = dwi.slice(s![.., .., .., &self.dwi_indices]);?

nilgoyette avatar Mar 19 '19 15:03 nilgoyette

I think this is roughly equivalent to array indexing, am I understanding it correctly?

LukeMathWalker avatar Mar 21 '19 20:03 LukeMathWalker

Yeah, well, numpy is immensely powerful and features like Indexing Multi-dimensional arrays are rarely used, I think. To make it short, yes.

nilgoyette avatar Mar 22 '19 14:03 nilgoyette

It would also make functions like sum, mean, etc. usable again. I have a use case where I want the mean of [.., .., .., &[0, 1, 4, etc.]], and I'm forced to calculate it by hand even though the function exists in ndarray.
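
For illustration, the by-hand computation can be sketched without copying the selected data first. This is a plain-Rust stand-in that assumes a row-major buffer whose last axis is contiguous; mean_last_axis_at is a hypothetical helper, not an ndarray function.

```rust
/// Mean along the last axis, restricted to `indices`, for a row-major buffer
/// whose last axis has length `last_len` (a stand-in for an ndarray array).
/// Hypothetical helper: returns `None` if `indices` is empty, `last_len` is
/// zero, the buffer is not a whole number of lanes, or an index is out of bounds.
fn mean_last_axis_at(data: &[f64], last_len: usize, indices: &[usize]) -> Option<Vec<f64>> {
    if indices.is_empty() || last_len == 0 || data.len() % last_len != 0 {
        return None;
    }
    if indices.iter().any(|&i| i >= last_len) {
        return None;
    }
    let n = indices.len() as f64;
    Some(
        data.chunks_exact(last_len) // one lane of the last axis at a time
            .map(|lane| indices.iter().map(|&i| lane[i]).sum::<f64>() / n)
            .collect(),
    )
}
```

Each lane is read in place, so the only allocation is the output, which has one element per lane.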

nilgoyette avatar Mar 28 '19 18:03 nilgoyette

A few thoughts:

  • We can't add support for variable strides directly to ArrayBase as it's currently defined because its current representation only includes a data buffer, pointer, shape, and strides. Additionally, all of the existing methods are currently implemented under the assumption that elements are accessed by moving the pointer according to the shape and strides.

    It's helpful to look at NumPy for comparison, since NumPy's arrays are so similar to ArrayBase. NumPy has the same limitation in that indexing with an "index array" creates a copy of the data in a new array, although it does allow assignment with array[index_array] = something since it overloads __setitem__(). ndarray doesn't provide an equivalent to this combined "(index with index array) + (assignment)" operation, but we could add a method that does this.

    As far as I'm aware, the closest feature to what you're looking for that NumPy provides is masked arrays, but that requires a boolean mask of the whole array instead of just a list of indices.

  • You could create a wrapper for ArrayBase that implements this functionality.

  • I know you'd like to avoid using ArrayBase::select(), but if you're iterating over the same indices frequently, it may make sense to use select to copy the data you're interested in into a new array and then work with that new array. Doing so would most likely speed up iteration by (1) placing the data closer together in memory and (2) avoiding the overhead of jumping between various indices.

  • I've been working on a new crate intended to replace ndarray::NdProducer that allows you to write things like:

    use ndarray::prelude::*;
    use nditer::{ArrayBaseExt, NdProducer};
    
    fn main() {
        let dwi = Array4::<u8>::zeros((5, 3, 7, 30)); // Loaded from disk
        let dwi_indices = array![2, 3, 10, 13, 17, 21]; // Calculated from dwi
        let output = dwi
            .producer()
            .select_indices_axis(Axis(3), &dwi_indices)
            .map(|x| x + 1)   // or whatever operation you want to perform
            .collect_array();
        println!("{}", output);
    }
    

    or

    use ndarray::prelude::*;
    use nditer::{ArrayBaseExt, NdProducer};
    
    fn main() {
        let dwi = Array4::<u8>::zeros((5, 3, 7, 30)); // Loaded from disk
        let dwi_indices = array![2, 3, 10, 13, 17, 21]; // Calculated from dwi
        let sum = dwi
            .producer()
            .select_indices_axis(Axis(3), &dwi_indices)
            .fold(0, |acc, x| acc + x);
        println!("{}", sum);
    }
    

    If you wanted to, you could use the result of select_indices_axis (which has type SelectIndicesAxis<ArrayBaseProducer<ViewRepr<&u8>, Ix4>> in this case) as a wrapper type around ArrayView instead of always immediately consuming it. (This type is cheap to Clone if you need to consume it multiple times.)

    It's not ready for production use yet, but it's getting there.
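
As a rough illustration of the wrapper idea from the second bullet, here is a minimal sketch in plain Rust. It uses a row-major 2-D slice as a stand-in for an ArrayView and selects columns by an index list. ColumnSelection and all of its methods are hypothetical names; nothing here is part of ndarray or nditer.

```rust
/// Hypothetical wrapper: a read-only "view" of selected columns of a
/// row-major 2-D buffer. It borrows the data and the index list; no
/// elements are copied.
struct ColumnSelection<'a, T> {
    data: &'a [T],
    ncols: usize,
    indices: &'a [usize],
}

impl<'a, T> ColumnSelection<'a, T> {
    /// Returns `None` if the buffer is not a whole number of rows or an
    /// index is out of bounds, so later accesses cannot panic.
    fn new(data: &'a [T], ncols: usize, indices: &'a [usize]) -> Option<Self> {
        if ncols == 0 || data.len() % ncols != 0 || indices.iter().any(|&i| i >= ncols) {
            return None;
        }
        Some(Self { data, ncols, indices })
    }

    /// Shape of the selection: (rows, selected columns).
    fn dim(&self) -> (usize, usize) {
        (self.data.len() / self.ncols, self.indices.len())
    }

    /// Element at (row, k), where `k` indexes into the selection rather than
    /// the original columns. The lookup through `indices` is the extra
    /// indirection that a shape-and-strides representation cannot express.
    fn get(&self, row: usize, k: usize) -> Option<&'a T> {
        let (nrows, _) = self.dim();
        if row >= nrows {
            return None;
        }
        let col = *self.indices.get(k)?;
        self.data.get(row * self.ncols + col)
    }

    /// Iterates over the selected elements in row-major order, without copying.
    fn iter(&self) -> impl Iterator<Item = &'a T> {
        // Copy the references out so the iterator does not borrow `self`.
        let data = self.data;
        let indices = self.indices;
        data.chunks_exact(self.ncols)
            .flat_map(move |row| indices.iter().map(move |&c| &row[c]))
    }
}
```

Reductions such as a sum or mean can then run over iter() directly. The trade-off noted in the third bullet still applies: every access goes through the index list, so repeated passes may be slower than working on a copied, contiguous array.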

jturner314 avatar Mar 29 '19 01:03 jturner314

Related to this: do you think it'd be reasonable to relax the Copy bound on ArrayBase::select() down to just Clone?

abreis avatar Apr 18 '19 11:04 abreis

Yes, that is quite doable even without something like the new nditer crate. See #269.

jturner314 avatar Apr 19 '19 20:04 jturner314

@abreis It should be possible; it actually depends on stack. Copy was used since it's easier to implement efficiently (Clone code needs to account for correct and safe behaviour during unwinding).
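
To illustrate the unwinding concern, here is a sketch of a copy-out selection for Clone types on a plain slice. It is not ndarray's implementation, and select_cloned is a hypothetical name; it only shows one way to stay safe if clone() panics partway through.

```rust
/// Hypothetical copy-out selection for `T: Clone` on a plain slice.
/// Returns `None` if any index is out of bounds.
fn select_cloned<T: Clone>(data: &[T], indices: &[usize]) -> Option<Vec<T>> {
    if indices.iter().any(|&i| i >= data.len()) {
        return None;
    }
    let mut out = Vec::with_capacity(indices.len());
    for &i in indices {
        // The clone is evaluated before `push` grows the vector, so if
        // `clone()` panics, `out` holds only fully initialized elements and
        // unwinding drops exactly those.
        out.push(data[i].clone());
    }
    Some(out)
}
```

With Copy elements there is nothing to drop and copying cannot panic, so an implementation is free to fill an uninitialized buffer and set the length once at the end. With Clone, that shortcut would need extra care to avoid dropping uninitialized memory during unwinding.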

bluss avatar Sep 07 '19 20:09 bluss

I think this is the issue most relevant to my discussion post #1050. Have any decisions been made about the best step forward? Some of the options proposed are:

  • Integrate this nditer crate into core
  • Implement masked arrays
  • Use select() a bunch. Although ideally I'd like to see a multi-axis select() to avoid copying the array for each dimension.
  • I proposed here that we make a copyslice! macro that works like slice but copies the array and allows arbitrary indices.
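
To make the multi-axis select() idea concrete, here is a sketch on a row-major 2-D slice. select_rows_cols is a hypothetical function, not an existing or planned ndarray API. It picks every combination of the given rows and columns (an outer-product selection) in one pass, with a single output allocation.

```rust
/// Hypothetical multi-axis select on a row-major 2-D buffer with `ncols`
/// columns. Returns the selected elements in row-major order, or `None` if
/// the buffer is not a whole number of rows or an index is out of bounds.
fn select_rows_cols<T: Clone>(
    data: &[T],
    ncols: usize,
    rows: &[usize],
    cols: &[usize],
) -> Option<Vec<T>> {
    if ncols == 0 || data.len() % ncols != 0 {
        return None;
    }
    let nrows = data.len() / ncols;
    if rows.iter().any(|&r| r >= nrows) || cols.iter().any(|&c| c >= ncols) {
        return None;
    }
    // One allocation for the final result; no per-axis intermediate array.
    let mut out = Vec::with_capacity(rows.len() * cols.len());
    for &r in rows {
        let row = &data[r * ncols..(r + 1) * ncols];
        out.extend(cols.iter().map(|&c| row[c].clone()));
    }
    Some(out)
}
```

Chaining a per-axis select() would instead copy all columns of the chosen rows first and then copy again to pick the columns.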

multimeric avatar Aug 03 '21 15:08 multimeric