Make exceptions less verbose by default
PyTensor exceptions are very verbose by default. Users often struggle to even find the actual error message.
This PR makes exceptions more minimal, with a hint to set the relevant flag, exception_verbosity, to medium (the old low) or high for more details.
The type error in the following snippet:
import pytensor
import pytensor.tensor as pt
x = pt.matrix("x")
y = pt.specify_shape(x, (2, 3))
fn = pytensor.function([x], y, mode="FAST_RUN")
fn([1])
Now looks like:
Traceback (most recent call last):
File "/home/ricardo/miniforge3/envs/pytensor-dev/lib/python3.12/site-packages/IPython/core/interactiveshell.py", line 3577, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-2-6c9f1f34f2ad>", line 8, in <module>
fn([1])
File "/home/ricardo/Documents/pytensor/pytensor/compile/function/types.py", line 960, in __call__
arg_container.storage[0] = arg_container.type.filter(
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ricardo/Documents/pytensor/pytensor/tensor/type.py", line 254, in filter
raise TypeError(
TypeError: Wrong number of dimensions: expected 2, got 1 with shape (1,).
Invalid argument to PyTensor function at index 0.
Whereas with the old default it looked like:
Traceback (most recent call last):
File "/home/ricardo/miniforge3/envs/pytensor/lib/python3.12/site-packages/IPython/core/interactiveshell.py", line 3577, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-3-6c9f1f34f2ad>", line 8, in <module>
fn([1])
File "/home/ricardo/Documents/pytensor/pytensor/compile/function/types.py", line 944, in __call__
arg_container.storage[0] = arg_container.type.filter(
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ricardo/Documents/pytensor/pytensor/tensor/type.py", line 242, in filter
raise TypeError(
TypeError: Bad input argument to pytensor function with name "<ipython-input-3-6c9f1f34f2ad>:6" at index 0 (0-based).
Backtrace when that variable is created:
File "/app/extra/plugins/python-ce/helpers/pydev/_pydev_bundle/pydev_ipython_console_011.py", line 438, in add_exec
res = self.ipython.run_cell(line, store_history=True)
File "/home/ricardo/miniforge3/envs/pytensor/lib/python3.12/site-packages/IPython/core/interactiveshell.py", line 3075, in run_cell
result = self._run_cell(
File "/home/ricardo/miniforge3/envs/pytensor/lib/python3.12/site-packages/IPython/core/interactiveshell.py", line 3130, in _run_cell
result = runner(coro)
File "/home/ricardo/miniforge3/envs/pytensor/lib/python3.12/site-packages/IPython/core/async_helpers.py", line 128, in _pseudo_sync_runner
coro.send(None)
File "/home/ricardo/miniforge3/envs/pytensor/lib/python3.12/site-packages/IPython/core/interactiveshell.py", line 3334, in run_cell_async
has_raised = await self.run_ast_nodes(code_ast.body, cell_name,
File "/home/ricardo/miniforge3/envs/pytensor/lib/python3.12/site-packages/IPython/core/interactiveshell.py", line 3517, in run_ast_nodes
if await self.run_code(code, result, async_=asy):
File "/home/ricardo/miniforge3/envs/pytensor/lib/python3.12/site-packages/IPython/core/interactiveshell.py", line 3577, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-3-6c9f1f34f2ad>", line 4, in <module>
x = pt.matrix("x")
Wrong number of dimensions: expected 2, got 1 with shape (1,).
A runtime error:
fn([[1]])
Now looks like this:
Traceback (most recent call last):
File "/home/ricardo/miniforge3/envs/pytensor/lib/python3.12/site-packages/IPython/core/interactiveshell.py", line 3577, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-8-df63f13088e4>", line 8, in <module>
fn([[1]])
File "/home/ricardo/Documents/pytensor/pytensor/compile/function/types.py", line 1040, in __call__
raise_with_op(
File "/home/ricardo/Documents/pytensor/pytensor/link/utils.py", line 320, in raise_with_op
raise exc_value.with_traceback(exc_trace)
File "/home/ricardo/Documents/pytensor/pytensor/compile/function/types.py", line 1030, in __call__
outputs = vm() if output_subset is None else vm(output_subset=output_subset)
^^^^
AssertionError: SpecifyShape: dim 0 of input has shape 1, expected 2.
HINT: Set PyTensor `config.exception_verbosity` to `medium` or `high` for more information about the source of the error.
Whereas before it looked like:
Traceback (most recent call last):
File "/home/ricardo/Documents/pytensor/pytensor/compile/function/types.py", line 1037, in __call__
outputs = vm() if output_subset is None else vm(output_subset=output_subset)
^^^^
AssertionError: SpecifyShape: dim 0 of input has shape 1, expected 2.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/ricardo/miniforge3/envs/pytensor/lib/python3.12/site-packages/IPython/core/interactiveshell.py", line 3577, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-2-df63f13088e4>", line 8, in <module>
fn([[1]])
File "/home/ricardo/Documents/pytensor/pytensor/compile/function/types.py", line 1047, in __call__
raise_with_op(
File "/home/ricardo/Documents/pytensor/pytensor/link/utils.py", line 526, in raise_with_op
raise exc_value.with_traceback(exc_trace)
File "/home/ricardo/Documents/pytensor/pytensor/compile/function/types.py", line 1037, in __call__
outputs = vm() if output_subset is None else vm(output_subset=output_subset)
^^^^
AssertionError: SpecifyShape: dim 0 of input has shape 1, expected 2.
Apply node that caused the error: SpecifyShape(x, 2, 3)
Toposort index: 0
Inputs types: [TensorType(float64, shape=(None, None)), TensorType(int8, shape=()), TensorType(int8, shape=())]
Inputs shapes: [(1, 1), (), ()]
Inputs strides: [(8, 8), (), ()]
Inputs values: [array([[1.]]), array(2, dtype=int8), array(3, dtype=int8)]
Outputs clients: [[DeepCopyOp(SpecifyShape.0)]]
Backtrace when the node is created (use PyTensor flag traceback__limit=N to make it longer):
File "/app/extra/plugins/python-ce/helpers/pydev/_pydev_bundle/pydev_ipython_console_011.py", line 438, in add_exec
res = self.ipython.run_cell(line, store_history=True)
File "/home/ricardo/miniforge3/envs/pytensor/lib/python3.12/site-packages/IPython/core/interactiveshell.py", line 3075, in run_cell
result = self._run_cell(
File "/home/ricardo/miniforge3/envs/pytensor/lib/python3.12/site-packages/IPython/core/interactiveshell.py", line 3130, in _run_cell
result = runner(coro)
File "/home/ricardo/miniforge3/envs/pytensor/lib/python3.12/site-packages/IPython/core/async_helpers.py", line 128, in _pseudo_sync_runner
coro.send(None)
File "/home/ricardo/miniforge3/envs/pytensor/lib/python3.12/site-packages/IPython/core/interactiveshell.py", line 3334, in run_cell_async
has_raised = await self.run_ast_nodes(code_ast.body, cell_name,
File "/home/ricardo/miniforge3/envs/pytensor/lib/python3.12/site-packages/IPython/core/interactiveshell.py", line 3517, in run_ast_nodes
if await self.run_code(code, result, async_=asy):
File "/home/ricardo/miniforge3/envs/pytensor/lib/python3.12/site-packages/IPython/core/interactiveshell.py", line 3577, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-2-df63f13088e4>", line 5, in <module>
y = pt.specify_shape(x, (2, 3))
HINT: Use the PyTensor flag `exception_verbosity=high` for a debug print-out and storage map footprint of this Apply node.
📚 Documentation preview 📚: https://pytensor--1330.org.readthedocs.build/en/1330/
The add_note feature was only added in Python 3.11, so I would wait.
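For context, here is a minimal sketch of what attaching the hint through `add_note` could look like, with a fallback for interpreters older than 3.11. The helper name `attach_hint` is hypothetical and this is not PyTensor's actual implementation; only the hint text is taken from the output shown above.

```python
import sys
import traceback

HINT = (
    "HINT: Set PyTensor `config.exception_verbosity` to `medium` or `high` "
    "for more information about the source of the error."
)


def attach_hint(exc, hint=HINT):
    """Attach `hint` to `exc` without changing the exception type (hypothetical helper)."""
    if sys.version_info >= (3, 11):
        # Notes are stored in exc.__notes__ and printed after the message
        # by the default traceback machinery.
        exc.add_note(hint)
    else:
        # Fallback for older interpreters: append the hint to the first argument.
        first = str(exc.args[0]) if exc.args else ""
        exc.args = (f"{first}\n{hint}",) + tuple(exc.args[1:])
    return exc


try:
    raise AssertionError("SpecifyShape: dim 0 of input has shape 1, expected 2.")
except AssertionError as err:
    attach_hint(err)
    # Render only the final "ExceptionType: message" part of the traceback.
    rendered = "".join(traceback.format_exception_only(type(err), err))
    print(rendered)
```

On either branch the rendered text keeps the original message first and the hint after it, which is the ordering this PR is after.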
Failing jax test addressed in #1646
How do errors in numba mode look after this PR?
One thing that's still frustrating is that the traceback is not useful at all. Adding the information in the error message about which input is causing the error is great, but seeing the line arg_container.storage[0] = arg_container.type.filter( tells the user nothing. I guess there's nothing that can be done there?
I approved but I actually think the runtime error is a step backwards. You cut away the part of the traceback that points to the actionable python code, this one:
Backtrace when the node is created (use PyTensor flag traceback__limit=N to make it longer):
File "/app/extra/plugins/python-ce/helpers/pydev/_pydev_bundle/pydev_ipython_console_011.py", line 438, in add_exec
res = self.ipython.run_cell(line, store_history=True)
File "/home/ricardo/miniforge3/envs/pytensor/lib/python3.12/site-packages/IPython/core/interactiveshell.py", line 3075, in run_cell
result = self._run_cell(
File "/home/ricardo/miniforge3/envs/pytensor/lib/python3.12/site-packages/IPython/core/interactiveshell.py", line 3130, in _run_cell
result = runner(coro)
File "/home/ricardo/miniforge3/envs/pytensor/lib/python3.12/site-packages/IPython/core/async_helpers.py", line 128, in _pseudo_sync_runner
coro.send(None)
File "/home/ricardo/miniforge3/envs/pytensor/lib/python3.12/site-packages/IPython/core/interactiveshell.py", line 3334, in run_cell_async
has_raised = await self.run_ast_nodes(code_ast.body, cell_name,
File "/home/ricardo/miniforge3/envs/pytensor/lib/python3.12/site-packages/IPython/core/interactiveshell.py", line 3517, in run_ast_nodes
if await self.run_code(code, result, async_=asy):
File "/home/ricardo/miniforge3/envs/pytensor/lib/python3.12/site-packages/IPython/core/interactiveshell.py", line 3577, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-2-df63f13088e4>", line 5, in <module>
y = pt.specify_shape(x, (2, 3))
> I approved but I actually think the runtime error is a step backwards. You cut away the part of the traceback that points to the actionable python code, this one:
I disagree.
First, 100% of PyTensor users are not generating their own functions; PyMC and other frameworks are. You define an RV and then get an obscure error because some join input created in Model.logp_dlogp_function has a weird value, in an operation you never called (one of those in the density).
Second, this example is short on purpose. In practice you have a traceback that is 100 lines long, and users just give up before finding the actual error raised (SpecifyShape failed).
Third, the info on how to get more details is right there: change the config flag. If you need the details you can get them (most times you don't: either the cause is immediately obvious or you would never be able to act on it). Personally, I would always prefer the two-step workflow: I already know the error, now let's find where it's coming from.
The old approach flipped the order and relevance of the information in the way it was presented.
> How do errors in numba mode look after this PR?
Numba compile errors are completely unreadable, but that's pretty much out of our control. Runtime errors like the SpecifyShape one should look better; I can get an example output.
> One thing that's still frustrating is that the traceback is not useful at all.
TypeError: Wrong number of dimensions: expected 2, got 1 with shape (1,).
It's pretty obvious? The function that raises it is obscure, but you should read tracebacks from the bottom up, so you start close to the useful info. This PR tries to make the useful info easier to locate by removing the "helpful" cruft that was being appended.
We shouldn't hide the natural Python traceback though. That would be a big anti-pattern.
Here is a motivating example: https://discourse.pymc.io/t/debug-mode-in-pytensor/14348/12?u=ricardov94
Note that the user-defined line (the one they're responsible for) shows up not at the end but 4-5 stack frames earlier. Before it there is a lot of text, all appended after the actual error.
In an ideal world you would show the line the user wrote and the final error (in that order).
But we can't know which line is "the line the user wrote".
That example is still more direct than what you would get in logp evals, as I mentioned above, but it's already terrible enough that the user had no clue where to start.
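To illustrate why "the line the user wrote" is hard to pin down, here is a rough sketch of the obvious heuristic: take the innermost traceback frame whose file is not part of the standard library or site-packages. The helper `last_user_frame` is hypothetical and not something PyTensor provides. `json.loads` stands in for a library call that raises deep inside library code.

```python
import json
import sysconfig
import traceback
from pathlib import Path


def last_user_frame(exc, library_dirs=None):
    """Return the innermost frame located outside `library_dirs`, or None.

    Hypothetical heuristic: "user code" is approximated as "not installed
    under the standard library or site-packages".
    """
    if library_dirs is None:
        paths = sysconfig.get_paths()
        library_dirs = {paths[k] for k in ("stdlib", "platstdlib", "purelib", "platlib")}
    roots = [Path(d).resolve() for d in library_dirs]

    # extract_tb lists frames outermost first, so walk them in reverse.
    for frame in reversed(traceback.extract_tb(exc.__traceback__)):
        filename = Path(frame.filename).resolve()
        if not any(filename.is_relative_to(root) for root in roots):
            return frame
    return None


def user_code():
    # The error is raised inside the json package, several frames below this call.
    return json.loads("{")


try:
    user_code()
except ValueError as err:
    frame = last_user_frame(err)
    print(f"{type(err).__name__} reached from {frame.name}, line {frame.lineno}")
```

The heuristic only works when the user's own call appears in the stack. When a framework such as PyMC builds and calls the function, the nearest non-library frame can be unrelated to the operation that failed, which is the limitation described above.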
> I disagree.
Ok this is all convincing. Having the flag is helpful. I recognize the point that pytensor is meant to be a low-level language for developers, and an end user often doesn't even know she's using it, so the tracebacks can't always find "the line".
That said, on this:
> TypeError: Wrong number of dimensions: expected 2, got 1 with shape (1,).
> It's pretty obvious?
No, it's not obvious at all. But if we have the additional flags to go deeper, that's great. The more information/hints we can give about where the actual error is in the computational graph, the better. I think at high levels of error verbosity, it would be great to show the dprint of the graph with a <----- or ^^^^^ marker showing which Op is causing the error.
But that's 100% out of scope for this PR, which I agree is a step forward from our current awful tracebacks.
Also, on this:
> We shouldn't hide the natural Python traceback though. That would be a big anti-pattern.
The problem is that we're not necessarily even in python code when the error is raised. Every runtime error will be triggered on a call to vm(), which tells the user nothing -- especially the hypothetical user who doesn't even know he's running pytensor at all.
The error is about the input to the function, so it's always Python land. It expected a matrix but received a vector. I don't know how we can be any clearer.
TypeError: Wrong number of dimensions: expected 2, got 1 with shape (1,).
Invalid argument to PyTensor function at index 0.
I'll add the name if available.