aesara icon indicating copy to clipboard operation
aesara copied to clipboard

Fix issues caused by inconsistent `TensorVariable.__iter__`

Open brandonwillard opened this issue 3 years ago • 1 comments

The class _tensor_py_operators—from which TensorVariables inherit—automatically provides an __iter__ method and this tricks Pandas (or any other library) into thinking that all TensorVariables are iterable (e.g. via isinstance(..., Iterable) and the like) when not all of them really are.

If you look at the implementation of __iter__, you'll see that it's ready to raise an exception when the TensorVariable isn't a vector (see theano.tensor.basic.get_vector_length). This is a misuse of typing, because it's advertising an interface that it doesn't necessarily provide, and determining when it can provide that interface isn't even all that difficult (e.g. at the very least, it requires TensorVariable.type.ndim == 1).

In other words, this design needs to be fixed. One simple approach is to create a VectorVariable subclass of TensorVariable with the __iter__ interface, then a factory could be used to instantiate those when type.ndim == 1. I imagine that most of the work behind that change would involve finding and fixing any overly strict type checks (e.g. type(x) == TensorVariable).

Originally posted by @brandonwillard in https://github.com/pymc-devs/Theano-PyMC/issue_comments/705658478

brandonwillard avatar Oct 08 '20 15:10 brandonwillard

We could also remove the __iter__ interface from TensorVariable and make Applys and/or Ops create instances of a special TensorVariable subclass supporting __iter__ when static shape information is available.

brandonwillard avatar May 15 '22 18:05 brandonwillard