Add numpy-like helper `hstack`

Open ricardoV94 opened this issue 1 year ago • 6 comments

Description

https://numpy.org/doc/stable/reference/generated/numpy.hstack.html

Others like vstack, column_stack and so on are probably missing as well. The first step is to list all of them and confirm they are missing in pytensor.tensor.

This should not require a new Op; just repurpose the existing concatenate and sprinkle expand_dims / atleast_nd as needed.
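As a rough illustration of the "concatenate plus dimension helpers" idea, here is a minimal sketch written against NumPy rather than pytensor.tensor. The name hstack_sketch is hypothetical and is not part of either library; a pytensor version would presumably swap in the corresponding pytensor.tensor functions.

```python
import numpy as np


def hstack_sketch(arrays):
    """Hypothetical helper: hstack built only from atleast_1d and concatenate."""
    # Promote scalars to 1-D so every input has at least one axis.
    arrays = [np.atleast_1d(a) for a in arrays]
    # 1-D inputs are joined along axis 0, everything else along axis 1.
    axis = 0 if arrays[0].ndim == 1 else 1
    return np.concatenate(arrays, axis=axis)


a = np.array([0.1, 0.2, 0.3], dtype="float32")
b = np.array([0.7, 0.8, 0.9], dtype="float32")

result_1d = hstack_sketch([a, b])                    # two 1-D inputs
result_2d = hstack_sketch([a[None, :], b[None, :]])  # two (1, 3) inputs
```

The axis choice depends only on the number of dimensions of the first input, not on concrete shape values.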

ricardoV94 avatar Jan 11 '24 09:01 ricardoV94

I would like to work on this issue

HarshvirSandhu avatar Jan 31 '24 10:01 HarshvirSandhu

Thanks. Let us know if you have any questions

ricardoV94 avatar Jan 31 '24 11:01 ricardoV94

@ricardoV94 I noticed that functions such as pt.horizontal_stack and pt.vertical_stack are already present. I will also implement column_stack and dstack.

Also, I just want to confirm: do the arguments of these functions still need to have at least 2 dimensions? (Please see the code comments linked below for reference.) https://github.com/pymc-devs/pytensor/blob/082081ae5f864e622551a70fe164822b4bef064c/pytensor/tensor/basic.py#L2761-L2768

HarshvirSandhu avatar Feb 01 '24 16:02 HarshvirSandhu

That comment is 15+ years old. Maybe numpy is less crazy these days? Unless there is some technical reason why we can't replicate numpy behavior (e.g., it depends on static shapes), we should stick to exactly what numpy does.

ricardoV94 avatar Feb 01 '24 18:02 ricardoV94

@ricardoV94 I observed unexpected behaviour while using np.column_stack with 1-D arrays.

Consider the code below:

import numpy as np

a = np.array([0.1, 0.2, 0.3], dtype="float32") # 1D array
b = np.array([0.7, 0.8, 0.9], dtype="float32") # 1D array
print(np.column_stack([a,b]).shape) # prints (3, 2)

np.column_stack gives a shape of (3, 2), whereas np.hstack gives the shape (6,). For 2D arrays it works the same as hstack:

a = np.array([[0.1, 0.2, 0.3]], dtype="float32") # 2D array
b = np.array([[0.7, 0.8, 0.9]], dtype="float32") # 2D array
print(np.column_stack([a,b]).shape) # prints (1, 6)

np.column_stack turns each 1D array into a column of shape (N, 1) before stacking them horizontally.
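The behaviour described above can be reproduced with a few lines. This is only a sketch in NumPy, with the hypothetical name column_stack_sketch; it is not the actual NumPy or pytensor implementation.

```python
import numpy as np


def column_stack_sketch(arrays):
    """Hypothetical helper: column_stack built from indexing and concatenate."""
    columns = []
    for a in arrays:
        a = np.asarray(a)
        if a.ndim < 2:
            # Scalars and 1-D inputs become columns: (N,) -> (N, 1).
            a = np.atleast_1d(a)[:, None]
        columns.append(a)
    # Inputs that are already 2-D are stacked unchanged along axis 1.
    return np.concatenate(columns, axis=1)


a = np.array([0.1, 0.2, 0.3], dtype="float32")
b = np.array([0.7, 0.8, 0.9], dtype="float32")

stacked_1d = column_stack_sketch([a, b])                    # 1-D inputs
stacked_2d = column_stack_sketch([a[None, :], b[None, :]])  # (1, 3) inputs
```

Only the number of dimensions of each input is inspected, so the branch does not rely on knowing the concrete length N.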

Should we still stick to what numpy does?

HarshvirSandhu avatar Feb 03 '24 11:02 HarshvirSandhu

Seems well documented so we should stick with what they do. The advantage of that is we offload design choices to numpy.

ricardoV94 avatar Feb 03 '24 12:02 ricardoV94