Fix and standardize input and output shapes of random variable methods
The output shapes of `Normal.pdf` are inconsistent (I haven't checked other RVs); see this 2-d example:
>>> n = Normal(np.zeros((2,)), np.eye(2))
>>> x = np.random.randn(2,)
>>> n.pdf(x)
0.05358174006777249
>>> x = np.random.randn(2,1)
>>> n.pdf(x)
array([0.04105293, 0.11377797])
>>> x = np.random.randn(1,2)
>>> n.pdf(x)
0.11294177981074723
>>> x = np.random.randn(3,2)
>>> n.pdf(x)
array([0.12783197, 0.02004086, 0.127179 ])
The second call should raise an error (it does raise one if the second axis of x is longer than 1).
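For context, the wrong (2, 1) result is consistent with NumPy silently broadcasting the column vector against the mean. A minimal sketch of the suspected mechanism, independent of the actual implementation:

```python
import numpy as np

mean = np.zeros(2)
x = np.random.randn(2, 1)  # a column vector, not a valid batch of 2-d points

# NumPy broadcasts (2, 1) - (2,) to shape (2, 2), so a pdf computed from
# `x - mean` silently evaluates two unintended points instead of raising.
diff = x - mean
assert diff.shape == (2, 2)
```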
Talking about shapes, there's another problem with 1d RVs:
>>> n = Normal(np.zeros((1,)), np.eye(1))
>>> x = np.random.randn(10,1)
>>> n.pdf(x).shape
(10,)
>>> x = np.random.randn(1,1)
>>> n.pdf(x).shape
()
but:
>>> n = Normal(0,1)
>>> x = np.random.randn(10,1)
>>> n.pdf(x).shape
(10, 1)
>>> x = np.random.randn(1,1)
>>> n.pdf(x).shape
(1, 1)
These should be the same, no matter how n was defined.
I'm in favor of squeezing axes, but for x it might sometimes make sense to keep them around, given that they mean something: (num_data, input_dim).
Oh damn, let's see if I understand the problem:
> >>> n = Normal(np.zeros((2,)), np.eye(2))
> >>> x = np.random.randn(2,)
> >>> n.pdf(x)
> 0.05358174006777249
This is intended behavior.
> >>> x = np.random.randn(2,1)
> >>> n.pdf(x)
> array([0.04105293, 0.11377797])
I agree, this should raise an error.
> >>> x = np.random.randn(1,2)
> >>> n.pdf(x)
> 0.11294177981074723
This should have shape (1,)
> >>> x = np.random.randn(3,2)
> >>> n.pdf(x)
> array([0.12783197, 0.02004086, 0.127179 ])
This is intended behavior.
> Talking about shapes, there's another problem with 1d RVs:
> >>> n = Normal(np.zeros((1,)), np.eye(1))
> >>> x = np.random.randn(10,1)
> >>> n.pdf(x).shape
> (10,)
This is intended behavior.
> >>> x = np.random.randn(1,1)
> >>> n.pdf(x).shape
> ()
Here we should not squeeze, i.e. the output shape should be (1,).
> but:
> >>> n = Normal(0,1)
> >>> x = np.random.randn(10,1)
> >>> n.pdf(x).shape
> (10, 1)
> >>> x = np.random.randn(1,1)
> >>> n.pdf(x).shape
> (1, 1)
> These should be the same, no matter how n was defined.
These two are also intended behavior, since the shape of the random variable is actually () as opposed to (1,) in the example before. So I don't think they should be the same, since the shape of n should matter.
> I'm in favor of squeezing axes, but for x it might sometimes make sense to keep them around, given that they mean something: (num_data, input_dim).
I'm not sure if I understood you correctly, but I think we should not squeeze axes. It makes sense to distinguish between random variable shapes () and (1,), since we want to make random variables "equivalent" to numpy arrays.
However, I do agree that, to resolve this issue, we should require the inputs to n.pdf to have shape batch_shape + rv_shape, and the output should then have shape batch_shape.
This would mean that
>>> n = Normal(0,1)
>>> x = np.random.randn(10,1)
>>> n.pdf(x).shape
(10, 1)
>>> x = np.random.randn(1,1)
>>> n.pdf(x).shape
(1, 1)
and
>>> n = Normal(np.zeros((1,)), np.eye(1))
>>> x = np.random.randn(10,1)
>>> n.pdf(x).shape
(10,)
>>> x = np.random.randn(1,1)
>>> n.pdf(x).shape
(1,)
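As a sketch of how that contract could be enforced, consider a small shape helper (`pdf_output_shape` is a hypothetical name, not existing API):

```python
def pdf_output_shape(input_shape, rv_shape):
    """Split an input shape into batch_shape + rv_shape and return batch_shape.

    Raises if the trailing axes do not match the random variable's shape.
    """
    ndim = len(rv_shape)
    if ndim > 0 and tuple(input_shape[-ndim:]) != tuple(rv_shape):
        raise ValueError(
            f"input shape {input_shape} does not end in rv shape {rv_shape}"
        )
    return tuple(input_shape[: len(input_shape) - ndim])

# Normal(0, 1) has rv shape (), so nothing is stripped:
assert pdf_output_shape((10, 1), ()) == (10, 1)
# Normal(np.zeros((1,)), np.eye(1)) has rv shape (1,):
assert pdf_output_shape((10, 1), (1,)) == (10,)
assert pdf_output_shape((1, 1), (1,)) == (1,)
```

This also makes the buggy case above an error: for rv shape (2,), an input of shape (2, 1) does not end in (2,) and is rejected.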
What do you think @alpiges @JonathanWenger?
I agree with your assessment @marvinpfoertner. This behaviour is particularly nasty:
>>> x = np.random.randn(2,1)
>>> n.pdf(x)
array([0.04105293, 0.11377797])
I guess there is some implicit NumPy broadcasting that's going on.
> >>> x = np.random.randn(2,1)
> >>> n.pdf(x)
> array([0.04105293, 0.11377797])
True, this is a broadcasting bug and needs to be fixed. All the others are nitpicky (because basically inconsequential) shape inconsistencies (or maybe not even that).
Re

> >>> n = Normal(0,1)
> >>> x = np.random.randn(10,1)
> >>> n.pdf(x).shape
> (10, 1)
> >>> x = np.random.randn(1,1)
> >>> n.pdf(x).shape
> (1, 1)
I get your point, but I considered the definition of n as a lazy way to define a one-dimensional RV. That means I would expect the last axis to be squeezed due to, effectively, a product over dimensions. But true, that's ambiguous, because otherwise there is no way to tell whether the user wants a batch of shape (N1, N2) or inputs of shape (N, D). On the other hand, I don't see why 1d should be treated differently from multi-D, and then you'd expect inputs like (N1, N2, D) --- if at all.
> ```python
> >>> x = np.random.randn(2,1)
> >>> n.pdf(x)
> array([0.04105293, 0.11377797])
> ```
>
> True, this is a broadcasting bug and needs to be fixed.
Giving this some more thought: maybe we even want to allow broadcasting here, in case one actually intends to evaluate at a point where all coordinates are equal. This would mean that this is not a bug that needs fixing, but rather a feature that needs documentation and testing.
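Concretely (a sketch, assuming an rv of shape (2,)), broadcasting a (2, 1) input along the last axis would mean evaluating at points whose coordinates are all equal:

```python
import numpy as np

x = np.array([[0.5], [1.5]])  # shape (2, 1)
# Broadcasting against an rv of shape (2,) expands the last axis, so each
# row becomes a point with all coordinates equal: [0.5, 0.5] and [1.5, 1.5].
pts = np.broadcast_to(x, (2, 2))
assert np.array_equal(pts, [[0.5, 0.5], [1.5, 1.5]])
```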
> On the other hand, I don't see why 1d should be treated differently from multi-D, and then you'd expect inputs like (N1, N2, D) --- if at all.
One good reason for keeping the 1-dimensions around is that (N, 1) + (N', N, K) broadcasts differently than (N,) + (N', N, K) if K = N.
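The difference is easy to verify: when K = N the result shapes coincide, but the element pairing does not:

```python
import numpy as np

b = np.zeros((2, 3, 3))             # shape (N', N, K) with K = N = 3
a_col = np.arange(3).reshape(3, 1)  # shape (N, 1)
a_flat = np.arange(3)               # shape (N,)

r_col = a_col + b    # a_col varies along axis -2 (the N axis)
r_flat = a_flat + b  # a_flat varies along axis -1 (the K axis)
assert r_col.shape == r_flat.shape == (2, 3, 3)
assert not np.array_equal(r_col, r_flat)
```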
Broadcasting is exactly the reason why I think squeezing user input is not a good idea. IMHO, we should try to stick to input shape = output shape.
The intended behavior for all methods (.pdf, .cdf, etc.) is that they take arguments of shape (n_0, n_1, ..., d_0, d_1, ...), where the n_i determine the batch axes and the d_i the dimensions of the space of the random variable. However, note that scalar random variables have shape (). In particular we obtain:
```python
# Scalar
rv  # shape=()
x = np.arange(5)  # shape=(5,)
y = rv.pdf(x)  # shape=(5,)

# Different case:
x = np.random.randn(10, 1)  # shape=(10, 1)
z = rv.pdf(x)  # shape=(10, 1)

# Vector
rv  # shape=(2,)
x = np.random.randn(5, 2)  # shape=(5, 2)
y = rv.pdf(x)  # shape=(5,)

# Matrix-variate
rv  # shape=(4, 3)
x = np.random.randn(5, 6, 4, 3)  # shape=(5, 6, 4, 3)
y = rv.pdf(x)  # shape=(5, 6)
```
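A hedged sketch of how a pdf method could implement this contract for a vector-variate normal by flattening the batch axes (`batched_pdf` is a hypothetical helper, not existing API; the density itself is delegated to SciPy):

```python
import numpy as np
from scipy.stats import multivariate_normal


def batched_pdf(x, mean, cov):
    """Evaluate a d-dimensional normal pdf on inputs of shape
    batch_shape + (d,), returning an array of shape batch_shape."""
    x = np.asarray(x)
    d = np.shape(mean)[0]
    if x.shape[-1:] != (d,):
        raise ValueError(f"last axis of x must have length {d}, got {x.shape}")
    batch_shape = x.shape[:-1]
    # Flatten all batch axes, evaluate, then restore the batch shape.
    values = multivariate_normal(mean=mean, cov=cov).pdf(x.reshape(-1, d))
    return np.reshape(values, batch_shape)


# Vector case with two batch axes, analogous to the matrix-variate example:
y = batched_pdf(np.random.randn(5, 6, 3), np.zeros(3), np.eye(3))
assert y.shape == (5, 6)
```

The matrix-variate case would flatten the trailing event axes the same way before delegating to the density.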
To Do
- [ ] Fix broadcasting bug as described above
- [ ] Adjust docstring to include the information about input shape
- [ ] Write doctests for all cases above to illustrate the behaviour