Are we ndarray yet?

Open LukeMathWalker opened this issue 6 years ago • 13 comments
Purpose

The idea behind this collection is to provide an index to easily navigate all currently open ndarray issues that are immediately actionable. This is meant to be a good starting point for new contributors (e.g. what should I work on?) and it can also help existing contributors to identify trends and hot areas. I have pinned it using GitHub's new issue-pinning feature, so that it doesn't get lost (and stale).

Given that we have ~100 open issues (and more are opened every day), you are very welcome to contribute to this taxonomy effort, either by commenting on this issue or by editing it directly (if you have permissions to do so). I am only adding to this tracker things I can easily understand or where enough context is provided in the issue. If I left something out along the way, feel free to add it and to provide more info on it.

New functionality

Documentation

  • [ ] Guidelines on how to use ndarray's types in a public API (Similar to Vec<T> vs &[T] considerations)
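For readers who have not met the Vec<T> vs &[T] question before, here is a minimal std-only sketch of the trade-off the item alludes to. The function names `mean` and `into_sorted` are made up for illustration. In ndarray terms the analogous choice would presumably be between an owned array and a borrowed view, but the guidelines themselves are still to be written.

```rust
/// Borrows its input: callers can pass a `Vec<f64>`, an array, or a
/// sub-slice without giving up ownership or copying the data.
fn mean(values: &[f64]) -> Option<f64> {
    if values.is_empty() {
        return None;
    }
    Some(values.iter().sum::<f64>() / values.len() as f64)
}

/// Takes ownership: reasonable when the function keeps or consumes the
/// buffer, at the cost of forcing callers to hand it over (or clone it).
fn into_sorted(mut values: Vec<f64>) -> Vec<f64> {
    values.sort_by(|a, b| a.total_cmp(b));
    values
}
```

A public API that only reads data is usually friendlier when it borrows, since the caller decides where the data lives.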

Feature parity

  • [ ] Equivalent of numpy.where or numpy.nonzero (Issue: https://github.com/rust-ndarray/ndarray/issues/466)
  • [ ] Kronecker product (or tensor product) (Reference: np.kron) (Issue: https://github.com/rust-ndarray/ndarray/issues/652)(Ongoing PR: #690)
  • [x] Scalar versions of standard deviation and variance (Issue: https://github.com/rust-ndarray/ndarray/issues/655)
  • [x] Add dstack, vstack, and hstack (Issue: https://github.com/rust-ndarray/ndarray/issues/667)
  • [ ] Sorting (Issue: https://github.com/rust-ndarray/ndarray/issues/195)
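To make the numpy.nonzero item concrete: for a 2D input, NumPy returns one index array per dimension, listing the positions of the non-zero elements in row-major order. The sketch below reproduces that behaviour on a plain row-major buffer rather than on ndarray's types. `nonzero_2d` is a hypothetical name and not a proposal for the actual API.

```rust
/// Returns `(row_indices, col_indices)` of the non-zero entries of a
/// row-major `rows x cols` buffer, in row-major order.
fn nonzero_2d(data: &[f64], rows: usize, cols: usize) -> (Vec<usize>, Vec<usize>) {
    assert_eq!(data.len(), rows * cols, "buffer length must match shape");
    let mut row_idx = Vec::new();
    let mut col_idx = Vec::new();
    for (flat, &value) in data.iter().enumerate() {
        if value != 0.0 {
            // Convert the flat offset back into a (row, column) pair.
            row_idx.push(flat / cols);
            col_idx.push(flat % cols);
        }
    }
    (row_idx, col_idx)
}
```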

Interop / Finer-grained control

  • [x] ~Implement ascontiguousarray() or contiguous() method (Issue: https://github.com/rust-ndarray/ndarray/issues/532)~
  • [ ] Add shrink_to_fit method (Issue: https://github.com/rust-ndarray/ndarray/issues/427)

Ergonomics

  • [x] Implement multislice_axis! macro (Issue: https://github.com/rust-ndarray/ndarray/issues/593)
  • [x] ~New constructor method for 2D arrays from an iterator of 1D arrays/vectors (Issue: https://github.com/rust-ndarray/ndarray/issues/539)~ (https://github.com/rust-ndarray/ndarray/issues/609)
  • [ ] ArrayView as custom Dynamically Sized Type (Issue: https://github.com/rust-ndarray/ndarray/issues/538)
  • [x] Use #[track_caller] to improve panic info #972

Quality of life

  • [ ] Implement proptest's Arbitrary trait for Array (Issue: https://github.com/rust-ndarray/ndarray/issues/596)
  • [x] Add new type aliases: ArcArray1 and ArcArray2 (Issue: https://github.com/rust-ndarray/ndarray/issues/661)
  • [x] ~Run rustfmt on the project and add it to the CI pipeline (PR: https://github.com/rust-ndarray/ndarray/pull/608)~
  • [x] ~Run clippy on the project and take care of the linter warnings (PR: https://github.com/rust-ndarray/ndarray/pull/642)~

Other

  • [ ] Add in-place variants of dimension-changing operations for dynamic-dimensional arrays (Issue: https://github.com/rust-ndarray/ndarray/issues/428)
  • [x] Support Clone elements in stack and select (Issue: #269)

Improvements

Documentation

  • [ ] Add a new example to ndarray-examples
  • [x] Provide more details on Axis NewType pattern rationale (Issue: https://github.com/rust-ndarray/ndarray/issues/564)
  • [ ] Document ndarray's equivalent to NumPy's astype (Issues: https://github.com/rust-ndarray/ndarray/issues/493 , https://github.com/rust-ndarray/ndarray/issues/525)
  • [ ] Improve doc examples for Zip/azip with failing examples (Issue: https://github.com/rust-ndarray/ndarray/issues/453)

Error messages / Debugging

  • [ ] Better messages for incompatible shapes errors (Issue: https://github.com/rust-ndarray/ndarray/issues/449).
  • [x] ~Better formatting with Debug for arrays (Issue: https://github.com/rust-ndarray/ndarray/issues/398, PR: https://github.com/rust-ndarray/ndarray/pull/606)~

Sharp API edges/corner cases

  • [x] ~Avoid panicking for zero-length axis in map_axis/map_axis_mut (Issue: https://github.com/rust-ndarray/ndarray/issues/579)~
  • [ ] Refactor all dimension-related traits (Issues: https://github.com/rust-ndarray/ndarray/issues/519 https://github.com/rust-ndarray/ndarray/issues/367)

Core

  • [x] ~Change ArrayBase.ptr to NonNull type (Issue: https://github.com/rust-ndarray/ndarray/issues/434)(Ongoing PR: #683)~
  • [ ] Provide more direct mutable access to shape, strides, and owned data (Issues: https://github.com/rust-ndarray/ndarray/issues/429 https://github.com/rust-ndarray/ndarray/issues/592)

Performance

  • [ ] Have a look at sum_3_azip (Issue: https://github.com/rust-ndarray/ndarray/issues/561)
  • [ ] Faster, arbitrary-order iterators (Issue: https://github.com/rust-ndarray/ndarray/issues/469)
  • [ ] Co-broadcasting/two-sided broadcasting performance fixes #936

LukeMathWalker avatar Mar 17 '19 21:03 LukeMathWalker

Going through all of these issues, I have started to think about broader challenges which should probably fall under ndarray's umbrella or are relevant to the project:

  • masked arrays
  • zero-cost interop with other scientific stacks using the Apache Arrow project
  • numpy.einsum equivalent
  • consolidating all currently maintained and mature ndarray-* crates into the rust-ndarray organization, harmonizing interfaces and integrating docs where appropriate

LukeMathWalker avatar Mar 17 '19 22:03 LukeMathWalker

I've started taking a crack at einsum here. The implementation I have there has multiple issues (performance and otherwise) and is not at all ready for production, but is apparently correct. I'm actively working on improving the implementation. There's a web frontend that uses the crate as a WASM module deployed here.

oracleofnj avatar May 01 '19 19:05 oracleofnj

The front-end is what I dreamed I could have when I started to use np.einsum back in the day - quite cool @oracleofnj! Parsing the output correctly is definitely the first step there - then it comes down to properly optimizing the computation path based on the inputs and the specified contractions. What is your attack plan @oracleofnj?

LukeMathWalker avatar May 02 '19 19:05 LukeMathWalker

After reading through the implementations/documentation in numpy and opt_einsum, I'm writing the base cases to handle a single operand or a pair of operands and then I'll write a function that takes the general case along with a pre-specified path and iterates along the path using the base cases. Last will come an independent function (or functions) to optimize the path given the operand sizes.
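The plan above (a pair base case, then a driver that walks a pre-specified path) can be sketched roughly as follows. This is not the crate's code: it is a std-only illustration restricted to matrices and the single contraction "ij,jk->ik", and `Matrix`, `contract_pair` and `contract_along_path` are hypothetical names. Each path step names two positions in the current operand list; both operands are removed and their product is appended, which is my understanding of the convention used by numpy.einsum_path. Path optimization is left out.

```rust
/// A dense row-major matrix, standing in for a general n-dimensional operand.
#[derive(Clone, Debug, PartialEq)]
struct Matrix {
    rows: usize,
    cols: usize,
    data: Vec<f64>,
}

/// Pair base case: the contraction "ij,jk->ik".
fn contract_pair(a: &Matrix, b: &Matrix) -> Matrix {
    assert_eq!(a.cols, b.rows, "contracted dimension must match");
    let mut data = vec![0.0; a.rows * b.cols];
    for i in 0..a.rows {
        for j in 0..a.cols {
            let a_ij = a.data[i * a.cols + j];
            for k in 0..b.cols {
                data[i * b.cols + k] += a_ij * b.data[j * b.cols + k];
            }
        }
    }
    Matrix { rows: a.rows, cols: b.cols, data }
}

/// General case: iterate along a pre-specified path using the pair base case.
/// A step `(left, right)` contracts `operands[left]` with `operands[right]`.
fn contract_along_path(mut operands: Vec<Matrix>, path: &[(usize, usize)]) -> Matrix {
    for &(left, right) in path {
        assert_ne!(left, right, "a step must name two distinct operands");
        // Remove the higher position first so the lower one stays valid.
        let (hi, lo) = if left > right { (left, right) } else { (right, left) };
        let hi_op = operands.remove(hi);
        let lo_op = operands.remove(lo);
        let (a, b) = if left < right { (lo_op, hi_op) } else { (hi_op, lo_op) };
        operands.push(contract_pair(&a, &b));
    }
    assert_eq!(operands.len(), 1, "path must reduce the operands to one result");
    operands.pop().unwrap()
}
```

Different paths give the same result but can differ a lot in cost, which is what the independent path-optimization step would exploit.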

oracleofnj avatar May 02 '19 20:05 oracleofnj

I published a beta version of my crate to crates.io. It still has some issues but it's far enough along that you are welcome to give it a spin. There is a minimal example (and more in the tests/benches) at the crate repo where you should feel free to open any issues - we can move the discussion there.

oracleofnj avatar May 24 '19 20:05 oracleofnj

Just came across some missing functionality that might want to be tracked here: https://github.com/rust-ndarray/ndarray/issues/865. Equivalent numpy feature: slicing on a variable number of indices.

TheButlah avatar Dec 19 '20 01:12 TheButlah