stdlib icon indicating copy to clipboard operation
stdlib copied to clipboard

[RFC]: achieve ndarray API parity with built-in JavaScript arrays

Open kgryte opened this issue 1 year ago • 31 comments

Description

This RFC proposes achieving ndarray API parity with built-in JavaScript arrays. Built-in JavaScript Array and TypedArray objects have a number of methods for searching, manipulating, sorting, and transforming array data.

The goal of this RFC is to add functional APIs providing equivalent functionality for ndarrays. By providing these APIs, stdlib can offer a powerful toolset using a similar vocabulary and interface design as existing art for working with ndarrays. This should help reduce the barrier to ndarray adoption and encourage their more widespread use.

Note, however, that ndarrays have considerable additional complexity due to their multi-dimensional nature, and, in particular, element-wise iteration requires specialized kernels for handling non-contiguous underlying data.

There does exist precedent in stdlib for such kernels (e.g., ndarray/base/assign, ndarray/base/nullary, and ndarray/base/unary). Those packages also provide C APIs which may or may not be relevant to the functional APIs proposed in this RFC.

What follows is an initial list of Array.prototype.* methods and notes regarding whether an equivalent already exists or what constraints we need to consider when designing ndarray equivalent packages.

Top-level: ndarray/*

  • [x] Array.prototype.at

    • ndarray/at
  • [x] Array.prototype.entries

    • ndarray/iter/entries
  • [x] Array.prototype.every

    • ndarray/every
  • [x] Array.prototype.forEach

    • ndarray/for-each
  • [x] Array.prototype.map

    • ndarray/map
  • [x] Array.prototype.filter

    • ndarray/filter
  • [x] Array.prototype.slice

    • ndarray/slice
  • [x] Array.prototype.keys

    • ndarray/iter/indices
  • [x] Array.prototype.values

    • ndarray/iter/values
  • [x] Array.prototype.includes

    • ndarray/includes
  • [x] Array.prototype.indexOf

    • blas/ext/index-of
  • [x] Array.prototype.find

    • ndarray/base/find
    • ndarray/find
  • [x] Array.prototype.findIndex

    • blas/ext/find-index
  • [x] Array.prototype.lastIndexOf

    • blas/ext/last-index-of
  • [x] Array.prototype.some

    • ndarray/any
  • [ ] Array.prototype.findLast(WIP)

    • reduction; specify one or more axes
  • [x] Array.prototype.findLastIndex

    • blas/ext/find-last-index
  • [x] Array.prototype.reverse

    • ndarray/reverse
  • [x] Array.prototype.toReversed

    • ndarray/base/to-reversed
    • ndarray/reversed
  • [x] Array.prototype.sort

    • multiple functions for different sorting algorithms.
    • blas/ext/sorthp.
  • [x] Array.prototype.toSorted

    • multiple functions for different sorting algorithms.
    • blas/ext/to-sortedhp
  • [x] Array.prototype.fill

    • ndarray/fill and ndarray/fill-by
    • ndarray/fill-slice
  • [x] Array.prototype.flat

    • ndarray/flatten
  • [ ] Array.prototype.reduce

    • reduction; specify one or more axes
  • [x] Array.prototype.concat

    • ndarray/concat
  • [ ] Array.prototype.copyWithin

    • specify axis
    • how would this work for strided arrays?
    • for ideal case, would prefer delegating to array prototype method, as likely faster; for non-ideal case, would need to take special care to avoid overwriting accessed values.
    • Slice + assign?
    • Temporary buffer for when there is overlap.
  • [ ] Array.prototype.reduceRight

    • reduction; specify one or more axes
    • iterate in reverse direction
  • [x] Array.prototype.flatMap

    • ndarray/flatten-by
  • [ ] Array.prototype.join

    • needs R&D
    • how would dimension separators work? a different separator for each dimension? configurable?
  • [x] Array.prototype.pop

    • ndarray/pop
  • [x] Array.prototype.push

    • ndarray/concat1d
  • [x] Array.prototype.shift

    • ndarray/shift
  • [ ] Array.prototype.splice

    • needs R&D
    • would involving slicing, concatenation, and data copy
  • [ ] Array.prototype.toLocaleString

    • needs R&D
  • [ ] Array.prototype.toSpliced

    • needs R&D
  • [ ] Array.prototype.toString

    • just call ndarray.toString()?
  • [ ] Array.prototype.unshift

    • this would essentially be concat along an axis
    • support for broadcasting?
    • would require a data copy, no mutation
  • [x] Array.prototype.with

    • ndarray/with

Base: ndarray/base/*

  • [x] Array.prototype.fill

    • ndarray/base/fill
  • [x] Array.prototype.forEach

    • ndarray/base/for-each
  • [x] Array.prototype.map

    • ndarray/base/map
  • [x] Array.prototype.reverse

    • ndarray/base/reverse
  • [x] Array.prototype.slice

    • ndarray/base/slice
  • [x] Array.prototype.toReversed

    • ndarray/base/to-reversed
  • [x] Array.prototype.every

    • ndarray/base/every-by
  • [x] Array.prototype.some

    • ndarray/base/any
  • [x] Array.prototype.find

    • ndarray/base/find
  • [x] Array.prototype.pop

    • ndarray/base/pop
  • [x] Array.prototype.shift

    • ndarray/base/shift

Related Issues

None.

Questions

No.

Other

No.

Checklist

  • [X] I have read and understood the Code of Conduct.
  • [X] Searched for existing issues and pull requests.
  • [X] The issue name begins with RFC:.

kgryte avatar Jul 24 '24 09:07 kgryte

cc @headlessNode

kgryte avatar Jul 24 '24 09:07 kgryte

I want to work on this issue @kgryte and could you elaborate on the specific goals of aligning ndarray APIs with JavaScript arrays? How will this benefit users and developers?

SarthakPaandey avatar Jul 26 '24 07:07 SarthakPaandey

How will this benefit users and developers?

This is already addressed in the OP. Second paragraph.

kgryte avatar Jul 26 '24 07:07 kgryte

Yup! Starting work on this!

SarthakPaandey avatar Jul 26 '24 07:07 SarthakPaandey

@SarthakPaandey Before you begin, it would be best to communicate what you're planning to work on. Some of the above routines are easier than others and should be addressed first. Reductions require R&D and should only be worked on once we've determined the best approach.

kgryte avatar Jul 26 '24 09:07 kgryte

"Understood, @kgryte. I'll prioritize the routines that are comparatively easier and communicate my plan before starting. I'll defer work on reductions until we've established the best approach. Thanks for the guidance!

SarthakPaandey avatar Jul 26 '24 09:07 SarthakPaandey

@kgryte I have started to work on the map method. I plan to submit PR for it when I've implemented the kernels for it up to 3d. This way, it would be possible to get feedback from you earlier and it would be easier for me to work on the feedback. Does that seem reasonable?

headlessNode avatar Jul 26 '24 10:07 headlessNode

@headlessNode That makes sense. I am currently working on for-each, which is related.

kgryte avatar Jul 26 '24 10:07 kgryte

@headlessNode I suggest working a "base" implementation of map first. For @stdlib/ndarray/base/map, we can assume that an output array is provided, similar to ndarray/base/unary. In fact, unary is a good reference for map. The main difference being the support of a thisArg and no C implementation.

kgryte avatar Jul 26 '24 10:07 kgryte

@kgryte The thisArg in our case would be similar to how it is handled in the Array.prototype.map right? e.g.

//Array.prototype.map
let numbers = [1, 2, 3];
let obj = { multiplier: 2 };

numbers.map(function(num) {
  return num * this.multiplier;
}, obj);```

headlessNode avatar Jul 26 '24 10:07 headlessNode

@headlessNode Yes. As an example, see https://github.com/stdlib-js/stdlib/tree/develop/lib/node_modules/%40stdlib/array/base/any-by.

kgryte avatar Jul 26 '24 16:07 kgryte

@headlessNode Pushed up a POC implementation of for-each: https://github.com/stdlib-js/stdlib/tree/develop/lib/node_modules/%40stdlib/ndarray/base/for-each

kgryte avatar Jul 27 '24 00:07 kgryte

Thanks!

headlessNode avatar Jul 27 '24 07:07 headlessNode

Heads up. I have started to work on ndarray/base/fill

headlessNode avatar Aug 20 '24 14:08 headlessNode

Sounds good, @headlessNode! Thanks for the heads up!

kgryte avatar Aug 20 '24 16:08 kgryte

Hey it looks like i joined the conversation late but is there some pending work which i can contribute on?

DhruvArvindSingh avatar Nov 18 '24 20:11 DhruvArvindSingh

@headlessNode Do you currently have anything in progress?

@DhruvArvindSingh If this is your first time contributing to stdlib, you may want to first knock out a few "Good First Issues" in order to get acquainted with the project. The ndarray packages are typically pretty involved.

kgryte avatar Nov 19 '24 08:11 kgryte

@kgryte yeah, working towards adding the top-level implementations. Should be ready for initial review in a few days.

headlessNode avatar Nov 19 '24 08:11 headlessNode

@kgryte, Are there any tasks or functionalities that still need to be implemented? Do you have any suggestions on how I can get started?

gururaj1512 avatar Dec 20 '24 19:12 gururaj1512

@gururaj1512 If you are interested in working on this, I suggest the two lowest hanging fruit are fill and with, as they can wrap base/assign, which is already implemented. We can skip adding the native add-on for now.

kgryte avatar Dec 30 '24 20:12 kgryte

@gururaj1512 And I suggest prioritizing with first, as fill will require optional slice support (e.g., fill( x, value[, MultiSlice])).

kgryte avatar Dec 30 '24 20:12 kgryte

I see that base/fill has been added (ref: https://github.com/stdlib-js/stdlib/blob/develop/lib/node_modules/%40stdlib/ndarray/base/fill/lib/main.js). However, we may want to revisit that package in order to add slice support.

kgryte avatar Dec 30 '24 20:12 kgryte

Got it, I'll on it..

gururaj1512 avatar Dec 30 '24 20:12 gururaj1512

IIRC, we may want separate packages for filling and assigning only selected axis of the input ndarray.

headlessNode avatar Dec 30 '24 20:12 headlessNode

@headlessNode Yeah, I don't recall the context or outcome of those prior discussions. 😔

kgryte avatar Dec 30 '24 21:12 kgryte

@kgryte We'd have slice-fill(value, start, end) where start = index to start filling from and end = index to end filling at. Achieving which would, I think, require setting-up the kernels.

So, I think, a base implementation of slice-fill and then in top-level fill implementation we could offer more control to the end-user?

headlessNode avatar Dec 30 '24 21:12 headlessNode

@headlessNode Yeah, I suppose we could add fill-slice; however, start and end don't make sense, IMO. Instead, we'd just require a user provide a MultiSlice slice argument, which would define the range of values to update. In terms of kernels, we already have base/slice-assign which should fulfill this purpose, as it wraps base/assign.

kgryte avatar Dec 30 '24 23:12 kgryte

Hey @kgryte, I am really interested on working on this issue, I have read the whole conversation here and I see that this issue is broken into chunks and tackled and assigned to the interested ones. May I know where can I get started related to this.

ShabiShett07 avatar Jan 04 '25 06:01 ShabiShett07

I am working on Array.prototype.every is it ok?

ShabiShett07 avatar Jan 06 '25 08:01 ShabiShett07

@ShabiShett07 This is already being worked on: https://github.com/stdlib-js/stdlib/pull/4356. And also, see the discussion at https://github.com/stdlib-js/stdlib/pull/4398#issuecomment-2565875736. Namely, before we work on reductions, we need to determine how to write the reduction kernels supporting an arbitrary subset of axes.

kgryte avatar Jan 06 '25 08:01 kgryte