stdlib icon indicating copy to clipboard operation
stdlib copied to clipboard

Ndarray with column-major ordering cuts across array-of-arrays input

Open rreusser opened this issue 4 years ago • 3 comments

Checklist

Please ensure the following tasks are completed before filing a bug report.

  • [x] Read and understood the Code of Conduct.
  • [x] Searched for existing issues and pull requests.

Description

Description of the issue.

The following code produces an array whose shape does not match what I'd expect:

x = stdlib.array([[1, 2], [3, 4], [5, 6]], { order: 'column-major' })
x.shape
// => [3, 2]

As a matrix, it looks like:

[ 1 4 ]
[ 2 5 ]
[ 3 6 ]

The resulting matrix seems to cut across the array-of-arrays input. If you simply interpret this as a 2 x 3 array instead of 3 x 2, then it matches what I'd expect. I suspect but haven't confirmed that reversing the order of the inferred shape may solve the issue (for 2D and higher dimensionalities). See example and demonstration that reversing may fix here

Related Issues

Does this issue have any related issues?

No.

Questions

Any questions for reviewers?

No.

Other

Any other information relevant to this issue? This may include screenshots, references, stack traces, sample output, and/or implementation notes.

Demo

If relevant, provide a link to a live demo.

For a live demo of the issue, see

  • https://observablehq.com/d/a7023e9c48cf703b

Reproduction

What steps are required to reproduce the unexpected output?

See above for code.

Expected Results

What are the expected results?

I expect the inner arrays of array-of-arrays input to show up in the constructed array as either a row or a column, not to be split and spread across multiple columns.

Actual Results

What are the actual results?

shape seems reversed in a manner that spreads inner arrays across multiple columns.

Environments

What environments are affected (e.g., Node v0.4.x, Chrome, IE 11)? If Node.js, include the npm version, operating system, and any other potentially relevant platform information.

The following environments are affected:

  • Usage in Chrome with stdlib-tree.

rreusser avatar Sep 21 '19 22:09 rreusser

@rreusser Thanks for filing this issue. Will look into producing a fix shortly.

kgryte avatar Sep 21 '19 23:09 kgryte

@rreusser Okay, so the current behavior is intentional. As documented in the package documentation, when you specify an order, you are saying what format the data is already in, not what order you want the data to be.

It is not clear, at least in my mind, what a user means when s/he provides a nested array and then declares that the data "order" is column-major, at least in the "descriptive" sense, as the inferred shape based on the provided argument is at odds with, as you show in your examples, the shape as might be expected for data provided as column-major.

Maybe it makes more sense to throw an error here, instead of trying to guess what a user intends?

kgryte avatar Sep 23 '19 22:09 kgryte

Based on Gitter discussion, consensus seems to be that the README documentation should include an example explicitly showing how order specification is descriptive, not prescriptive, and highlight potential "gotchas", such as specifying "column-major" does not mean that a provided nested array should be flattened to column-major order, but rather that the data is provided in column-major order already.

kgryte avatar Sep 23 '19 23:09 kgryte

Following up on this discussion, the behavior documented in this issue was, in fact, a bug, IMO. I've resolved this bug in https://github.com/stdlib-js/stdlib/commit/61348590284d468ea52a07c51a68219407588757.

With https://github.com/stdlib-js/stdlib/commit/61348590284d468ea52a07c51a68219407588757, the ndarray/array behavior matches NumPy flattening behavior. Namely, that, when provided a nested array, the function should flatten the nested arrays to match the specified storage order. Note that this differs from the OP of this issue where it was suggested to flip the dimensions. Instead, a set of nested arrays is presumed to match the conventional visual order (and lexicographic indexing conventions).

In NumPy,

In [1]: x = np.asarray( [[1,2], [3,4], [5,6]], order="C")

In [2]: x.shape
Out[2]: (3, 2)

In [3]: x[1,1]
Out[3]: 4

In [4]: x[0,1]
Out[4]: 2

In [5]: x[1,0]
Out[5]: 3

In [6]: x = np.asarray( [[1,2], [3,4], [5,6]], order="F")

In [7]: x.shape
Out[7]: (3, 2)

In [8]: x[1,1]
Out[8]: 4

In [9]: x[0,1]
Out[9]: 2

In [10]: x[1,0]
Out[10]: 3

Now, in stdlib with https://github.com/stdlib-js/stdlib/commit/61348590284d468ea52a07c51a68219407588757,

In [1]: var x = array( [[1,2], [3,4], [5,6]], { 'order': 'row-major' } );

In [2]: x.data
Out[2]: Float64Array(6) [ 1, 2, 3, 4, 5, 6 ]

In [3]: x.shape
Out[3]: [ 3, 2 ]

In [4]: x.get(1,1)
Out[4]: 4

In [5]: x.get(0,1)
Out[5]: 2

In [6]: x.get(1,0)
Out[6]: 3

In [7]: var x = array( [[1,2], [3,4], [5,6]], { 'order': 'column-major' } );

In [8]: x.data
Out[8]: Float64Array(6) [ 1, 3, 5, 2, 4, 6 ]

In [9]: x.shape
Out[9]: [ 3, 2 ]

In [10]: x.get(1,1)
Out[10]: 4

In [11]: x.get(0,1)
Out[11]: 2

In [12]: x.get(1,0)
Out[12]: 3

As this issue is currently resolved in the main branch and will be included in the next release, will go ahead and close this issue.

Thanks, @rreusser!

kgryte avatar Nov 14 '23 00:11 kgryte