Make exceptions less verbose by default

Open ricardoV94 opened this issue 8 months ago • 11 comments

PyTensor exceptions are very verbose by default. Users often struggle to even find the actual error message.

This PR makes exceptions more minimal by default, with a hint to set the relevant flag, exception_verbosity, to medium (the old low) or high for more details.

The type error in the following snippet:

import pytensor
import pytensor.tensor as pt

x = pt.matrix("x")
y = pt.specify_shape(x, (2, 3))
fn = pytensor.function([x], y, mode="FAST_RUN")

fn([1])

Now looks like:

Traceback (most recent call last):
  File "/home/ricardo/miniforge3/envs/pytensor-dev/lib/python3.12/site-packages/IPython/core/interactiveshell.py", line 3577, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-2-6c9f1f34f2ad>", line 8, in <module>
    fn([1])
  File "/home/ricardo/Documents/pytensor/pytensor/compile/function/types.py", line 960, in __call__
    arg_container.storage[0] = arg_container.type.filter(
                               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ricardo/Documents/pytensor/pytensor/tensor/type.py", line 254, in filter
    raise TypeError(
TypeError: Wrong number of dimensions: expected 2, got 1 with shape (1,).
Invalid argument to PyTensor function at index 0.

Whereas with the old default it looked like:

Traceback (most recent call last):
  File "/home/ricardo/miniforge3/envs/pytensor/lib/python3.12/site-packages/IPython/core/interactiveshell.py", line 3577, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-3-6c9f1f34f2ad>", line 8, in <module>
    fn([1])
  File "/home/ricardo/Documents/pytensor/pytensor/compile/function/types.py", line 944, in __call__
    arg_container.storage[0] = arg_container.type.filter(
                               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ricardo/Documents/pytensor/pytensor/tensor/type.py", line 242, in filter
    raise TypeError(
TypeError: Bad input argument to pytensor function with name "<ipython-input-3-6c9f1f34f2ad>:6" at index 0 (0-based).  

Backtrace when that variable is created:
  File "/app/extra/plugins/python-ce/helpers/pydev/_pydev_bundle/pydev_ipython_console_011.py", line 438, in add_exec
    res = self.ipython.run_cell(line, store_history=True)
  File "/home/ricardo/miniforge3/envs/pytensor/lib/python3.12/site-packages/IPython/core/interactiveshell.py", line 3075, in run_cell
    result = self._run_cell(
  File "/home/ricardo/miniforge3/envs/pytensor/lib/python3.12/site-packages/IPython/core/interactiveshell.py", line 3130, in _run_cell
    result = runner(coro)
  File "/home/ricardo/miniforge3/envs/pytensor/lib/python3.12/site-packages/IPython/core/async_helpers.py", line 128, in _pseudo_sync_runner
    coro.send(None)
  File "/home/ricardo/miniforge3/envs/pytensor/lib/python3.12/site-packages/IPython/core/interactiveshell.py", line 3334, in run_cell_async
    has_raised = await self.run_ast_nodes(code_ast.body, cell_name,
  File "/home/ricardo/miniforge3/envs/pytensor/lib/python3.12/site-packages/IPython/core/interactiveshell.py", line 3517, in run_ast_nodes
    if await self.run_code(code, result, async_=asy):
  File "/home/ricardo/miniforge3/envs/pytensor/lib/python3.12/site-packages/IPython/core/interactiveshell.py", line 3577, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-3-6c9f1f34f2ad>", line 4, in <module>
    x = pt.matrix("x")
Wrong number of dimensions: expected 2, got 1 with shape (1,).

A runtime error:

fn([[1]])

Now looks like this

Traceback (most recent call last):
  File "/home/ricardo/miniforge3/envs/pytensor/lib/python3.12/site-packages/IPython/core/interactiveshell.py", line 3577, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-8-df63f13088e4>", line 8, in <module>
    fn([[1]])
  File "/home/ricardo/Documents/pytensor/pytensor/compile/function/types.py", line 1040, in __call__
    raise_with_op(
  File "/home/ricardo/Documents/pytensor/pytensor/link/utils.py", line 320, in raise_with_op
    raise exc_value.with_traceback(exc_trace)
  File "/home/ricardo/Documents/pytensor/pytensor/compile/function/types.py", line 1030, in __call__
    outputs = vm() if output_subset is None else vm(output_subset=output_subset)
              ^^^^
AssertionError: SpecifyShape: dim 0 of input has shape 1, expected 2.

HINT: Set PyTensor `config.exception_verbosity` to `medium` or `high` for more information about the source of the error.

Whereas before it looked like

Traceback (most recent call last):
  File "/home/ricardo/Documents/pytensor/pytensor/compile/function/types.py", line 1037, in __call__
    outputs = vm() if output_subset is None else vm(output_subset=output_subset)
              ^^^^
AssertionError: SpecifyShape: dim 0 of input has shape 1, expected 2.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/ricardo/miniforge3/envs/pytensor/lib/python3.12/site-packages/IPython/core/interactiveshell.py", line 3577, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-2-df63f13088e4>", line 8, in <module>
    fn([[1]])
  File "/home/ricardo/Documents/pytensor/pytensor/compile/function/types.py", line 1047, in __call__
    raise_with_op(
  File "/home/ricardo/Documents/pytensor/pytensor/link/utils.py", line 526, in raise_with_op
    raise exc_value.with_traceback(exc_trace)
  File "/home/ricardo/Documents/pytensor/pytensor/compile/function/types.py", line 1037, in __call__
    outputs = vm() if output_subset is None else vm(output_subset=output_subset)
              ^^^^
AssertionError: SpecifyShape: dim 0 of input has shape 1, expected 2.
Apply node that caused the error: SpecifyShape(x, 2, 3)
Toposort index: 0
Inputs types: [TensorType(float64, shape=(None, None)), TensorType(int8, shape=()), TensorType(int8, shape=())]
Inputs shapes: [(1, 1), (), ()]
Inputs strides: [(8, 8), (), ()]
Inputs values: [array([[1.]]), array(2, dtype=int8), array(3, dtype=int8)]
Outputs clients: [[DeepCopyOp(SpecifyShape.0)]]

Backtrace when the node is created (use PyTensor flag traceback__limit=N to make it longer):
  File "/app/extra/plugins/python-ce/helpers/pydev/_pydev_bundle/pydev_ipython_console_011.py", line 438, in add_exec
    res = self.ipython.run_cell(line, store_history=True)
  File "/home/ricardo/miniforge3/envs/pytensor/lib/python3.12/site-packages/IPython/core/interactiveshell.py", line 3075, in run_cell
    result = self._run_cell(
  File "/home/ricardo/miniforge3/envs/pytensor/lib/python3.12/site-packages/IPython/core/interactiveshell.py", line 3130, in _run_cell
    result = runner(coro)
  File "/home/ricardo/miniforge3/envs/pytensor/lib/python3.12/site-packages/IPython/core/async_helpers.py", line 128, in _pseudo_sync_runner
    coro.send(None)
  File "/home/ricardo/miniforge3/envs/pytensor/lib/python3.12/site-packages/IPython/core/interactiveshell.py", line 3334, in run_cell_async
    has_raised = await self.run_ast_nodes(code_ast.body, cell_name,
  File "/home/ricardo/miniforge3/envs/pytensor/lib/python3.12/site-packages/IPython/core/interactiveshell.py", line 3517, in run_ast_nodes
    if await self.run_code(code, result, async_=asy):
  File "/home/ricardo/miniforge3/envs/pytensor/lib/python3.12/site-packages/IPython/core/interactiveshell.py", line 3577, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-2-df63f13088e4>", line 5, in <module>
    y = pt.specify_shape(x, (2, 3))

HINT: Use the PyTensor flag `exception_verbosity=high` for a debug print-out and storage map footprint of this Apply node.

📚 Documentation preview 📚: https://pytensor--1330.org.readthedocs.build/en/1330/

Mar 30 '25 20:03 ricardoV94

The add_note feature was only added in Python 3.11, so I would wait.
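
For readers unfamiliar with it, `BaseException.add_note` attaches extra lines that Python prints after the error message. The sketch below shows the idea with a fallback for older interpreters. `attach_hint` is a hypothetical helper written for illustration, not something this PR adds, and the fallback (extending the message) is only one possible choice.

```python
import traceback


def attach_hint(exc, hint):
    """Attach `hint` to `exc` so that it prints after the error message."""
    if hasattr(exc, "add_note"):
        # Python 3.11+: the note is printed after the message, which stays untouched.
        exc.add_note(hint)
    else:
        # Older interpreters: fall back to extending the message itself.
        first = str(exc.args[0]) if exc.args else ""
        exc.args = (f"{first}\n{hint}",) + tuple(exc.args[1:])
    return exc


err = attach_hint(
    AssertionError("SpecifyShape: dim 0 of input has shape 1, expected 2."),
    "HINT: Set PyTensor `config.exception_verbosity` to `medium` or `high` "
    "for more information about the source of the error.",
)
# Render only the exception line(s), the way the end of a traceback would show them.
print("".join(traceback.format_exception_only(type(err), err)))
```

With a note, the original message and `args` are left alone, which is the appeal over rebuilding the exception with a longer string.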

Mar 31 '25 09:03 ricardoV94

Failing jax test addressed in #1646

Oct 09 '25 12:10 ricardoV94

How do errors in numba mode look after this PR?

Oct 10 '25 13:10 jessegrabowski

One thing that's still frustrating is that the traceback is not useful at all. Adding the information in the error message about which input is causing the error is great, but seeing the line arg_container.storage[0] = arg_container.type.filter( tells the user nothing. I guess there's nothing that can be done there?

Oct 10 '25 13:10 jessegrabowski

I approved but I actually think the runtime error is a step backwards. You cut away the part of the traceback that points to the actionable python code, this one:

Backtrace when the node is created (use PyTensor flag traceback__limit=N to make it longer):
  File "/app/extra/plugins/python-ce/helpers/pydev/_pydev_bundle/pydev_ipython_console_011.py", line 438, in add_exec
    res = self.ipython.run_cell(line, store_history=True)
  File "/home/ricardo/miniforge3/envs/pytensor/lib/python3.12/site-packages/IPython/core/interactiveshell.py", line 3075, in run_cell
    result = self._run_cell(
  File "/home/ricardo/miniforge3/envs/pytensor/lib/python3.12/site-packages/IPython/core/interactiveshell.py", line 3130, in _run_cell
    result = runner(coro)
  File "/home/ricardo/miniforge3/envs/pytensor/lib/python3.12/site-packages/IPython/core/async_helpers.py", line 128, in _pseudo_sync_runner
    coro.send(None)
  File "/home/ricardo/miniforge3/envs/pytensor/lib/python3.12/site-packages/IPython/core/interactiveshell.py", line 3334, in run_cell_async
    has_raised = await self.run_ast_nodes(code_ast.body, cell_name,
  File "/home/ricardo/miniforge3/envs/pytensor/lib/python3.12/site-packages/IPython/core/interactiveshell.py", line 3517, in run_ast_nodes
    if await self.run_code(code, result, async_=asy):
  File "/home/ricardo/miniforge3/envs/pytensor/lib/python3.12/site-packages/IPython/core/interactiveshell.py", line 3577, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-2-df63f13088e4>", line 5, in <module>
    y = pt.specify_shape(x, (2, 3))

Oct 10 '25 13:10 jessegrabowski

> I approved but I actually think the runtime error is a step backwards. You cut away the part of the traceback that points to the actionable python code, this one:

I disagree.

First, 100% of PyTensor users are not generating their own functions; PyMC and other frameworks are. You define an RV and then you get an obscure error because some join input created in Model.logp_dlogp_function has a weird value in an operation you never called (the ones in the density).

Second, this example is short on purpose. In practice you have a traceback that is 100 lines long. Users just give up before finding the actual error raised (SpecifyShape failed).

Third, the info on how to get more details is right there: change the config flag. If you need to (which most times you don't; it's either immediately obvious or you would never be able to act on it), you can. Myself, I would always prefer the two-step workflow: I already know the error, now let's find where it's coming from.

The order / relevance of info is flipped in the old approach, the way it was presented.
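
The two-step workflow can be pictured with a small sketch: at the lowest level only the error and a pointer to the flag are shown, and the extra context appears only once the level is raised. `render_error` and its `verbosity` argument are hypothetical names used for illustration; this is not how PyTensor builds its messages, and what `medium` and `high` each include is not reproduced here.

```python
HINT = (
    "HINT: Set PyTensor `config.exception_verbosity` to `medium` or `high` "
    "for more information about the source of the error."
)


def render_error(message, details, verbosity="low"):
    """Build the text shown for an error (hypothetical helper)."""
    if verbosity == "low":
        # Step one: only the error itself, plus a pointer to step two.
        return f"{message}\n\n{HINT}"
    # Step two: the reader opted in, so include the extra context.
    return "\n".join([message, *details])


error_text = "SpecifyShape: dim 0 of input has shape 1, expected 2."
node_details = [
    "Apply node that caused the error: SpecifyShape(x, 2, 3)",
    "Toposort index: 0",
]
print(render_error(error_text, node_details))
print(render_error(error_text, node_details, verbosity="high"))
```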

Oct 11 '25 09:10 ricardoV94

> How do errors in numba mode look after this PR?

Numba compile errors are completely unreadable, but that's pretty much out of our control. Runtime errors like the SpecifyShape one should look better; I can get an example output.

Oct 11 '25 09:10 ricardoV94

> One thing that's still frustrating is that the traceback is not useful at all.

TypeError: Wrong number of dimensions: expected 2, got 1 with shape (1,).

It's pretty obvious? The function that raises it is obscure, but you should read tracebacks from the bottom up, so you start close to the useful info. This PR tries to make it easier to locate the useful info by removing "helpful" cruft that was being appended.

We shouldn't hide the natural Python traceback though. That would be a big anti-pattern.
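
Reading from the bottom up can also be done programmatically with the standard `traceback` module: the last line is the error itself and the last frame is where it was raised. This is a minimal sketch; `summarize` and `check_ndim` are hypothetical stand-ins, not PyTensor functions.

```python
import traceback


def summarize(exc):
    """Return the final error line and the innermost frame of `exc`."""
    message = "".join(traceback.format_exception_only(type(exc), exc)).strip()
    frames = traceback.extract_tb(exc.__traceback__)
    innermost = frames[-1] if frames else None
    return message, innermost


def check_ndim(shape, expected=2):
    # Stand-in for an input check that rejects arrays of the wrong rank.
    if len(shape) != expected:
        raise TypeError(
            f"Wrong number of dimensions: expected {expected}, "
            f"got {len(shape)} with shape {shape}."
        )


try:
    check_ndim((1,))
except TypeError as err:
    final_line, last_frame = summarize(err)
    print(final_line)
    print(f"raised in {last_frame.name}, line {last_frame.lineno}")
```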

Oct 11 '25 09:10 ricardoV94

Here is a motivating example: https://discourse.pymc.io/t/debug-mode-in-pytensor/14348/12?u=ricardov94

Note that the user-defined line (the one they're responsible for) shows up not at the end but 4-5 stack frames earlier. Before that there's a lot of text, appended after the actual error.

In an ideal world you would show the line the user wrote and the final error (in that order).

But we can't know which line is "the line the user wrote".

That example is still more direct than what you would get in logp evals, as I mentioned above, but it's already terrible enough that the user had no clue where to start.

Oct 11 '25 10:10 ricardoV94

> I disagree.

Ok this is all convincing. Having the flag is helpful. I recognize the point that pytensor is meant to be a low-level language for developers, and an end user often doesn't even know she's using it, so the tracebacks can't always find "the line".

That said, on this:

> TypeError: Wrong number of dimensions: expected 2, got 1 with shape (1,).
>
> It's pretty obvious?

No, it's not obvious at all. But if we have the additional flags to go deeper, that's great. The more information/hints we can give about where the actual error is in the computational graph, the better. I think at high levels of error verbosity, it would be great to show the dprint of the graph with a <----- or ^^^^^ marker showing which Op is causing the error.

But that's 100% out of scope for this PR, which I agree is a step forward from our current awful tracebacks.
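
As a rough illustration of the marker idea, the sketch below appends an arrow to the line of a printed graph that mentions the failing Op. `mark_op` is a hypothetical helper, and `graph_text` is a hand-written stand-in rather than real dprint output.

```python
def mark_op(printed_graph, op_name, marker=" <----- error raised here"):
    """Append `marker` to the first line of `printed_graph` that mentions `op_name`."""
    lines = printed_graph.splitlines()
    for i, line in enumerate(lines):
        if op_name in line:
            lines[i] = line + marker
            break  # mark only the first match
    return "\n".join(lines)


# Hand-written stand-in for a printed graph; not actual dprint output.
graph_text = "DeepCopyOp\n  SpecifyShape\n    x\n    2\n    3"
marked = mark_op(graph_text, "SpecifyShape")
print(marked)
```

Matching on the Op name is the simplest possible rule; a real version would presumably need to identify the specific Apply node, since the same Op can appear many times in one graph.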

Also, on this:

> We shouldn't hide the natural Python traceback though. That would be a big anti-pattern.

The problem is that we're not necessarily even in python code when the error is raised. Every runtime error will be triggered on a call to vm(), which tells the user nothing -- especially the hypothetical user who doesn't even know he's running pytensor at all.
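
The tracebacks earlier in the thread end at the `vm()` call because the wrapper re-raises the error with its original traceback (the `raise exc_value.with_traceback(exc_trace)` line). A minimal sketch of that pattern follows. `run_node` and `call_with_context` are hypothetical stand-ins, and the use of `from None` to suppress the "During handling of the above exception" block is an assumption for illustration, not a description of what PyTensor does.

```python
import traceback


def run_node():
    # Stand-in for the compiled computation (the `vm()` call in the tracebacks above).
    raise AssertionError("SpecifyShape: dim 0 of input has shape 1, expected 2.")


def call_with_context(context):
    try:
        return run_node()
    except Exception as exc:
        # Rebuild the error with extra context, keep the original traceback,
        # and suppress the chained "During handling of..." display.
        enriched = type(exc)(f"{exc}\n{context}")
        raise enriched.with_traceback(exc.__traceback__) from None


try:
    call_with_context("Apply node that caused the error: SpecifyShape(x, 2, 3)")
except AssertionError as err:
    caught = err

frame_names = [frame.name for frame in traceback.extract_tb(caught.__traceback__)]
print(frame_names)
```

The wrapper should appear twice in `frame_names`, once for the `raise` line and once for the original call, much like the two `__call__` entries in the tracebacks above, and the innermost frame is still the stand-in for `vm()`.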

Oct 11 '25 20:10 jessegrabowski

The error is about the input to the function, so it's always Python land. It expected a matrix but received a vector. I don't know how we can be any clearer.

TypeError: Wrong number of dimensions: expected 2, got 1 with shape (1,).

Invalid argument to PyTensor function at index 0.

I'll add the name if available

Oct 12 '25 05:10 ricardoV94