rich Register custom __repr__ functions outside of their classes

TLDR: Overriding __repr__ of Python types is not always possible (and when it is, it may change library behavior unexpectedly), so it would be valuable if rich provided a way to set custom formatters. For example, using a custom __repr__ for Numpy arrays is currently impossible and would become possible with this feature.

How would you improve Rich?

Add a function to register custom representation functions with rich, e.g.:

rich.pretty.set_custom_formatter(condition, formatter)

Example usage:

import rich
import rich.pretty

def array_formatter(x):
  if x.size <= 100:
    return repr(x)
  return f'{type(x)}(shape={x.shape}, dtype={x.dtype})'

rich.pretty.set_custom_formatter(
    lambda x: hasattr(x, 'shape') and hasattr(x, 'dtype'),
    array_formatter)

import numpy as np
import jax.numpy as jnp

obj = {'foo': np.zeros((10, 10, 10)), 'bar': jnp.ones((1000,))}
rich.pretty.pprint(obj)
# {
#    'foo': np.ndarray(shape=(10, 10, 10), dtype=float64),
#    'bar': jax.Array(shape=(1000,), dtype=float32),
# }

Or equivalently:

import functools

# `condition` is a predicate like the lambda in the example above.
@functools.partial(rich.pretty.set_custom_formatter, condition)
def formatter(x):
  ...
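To make the proposal concrete, here is a minimal sketch of how such a registry could behave. It is not rich's actual API: `set_custom_formatter`, `to_repr`, `_CUSTOM_FORMATTERS`, and the `FakeArray` stand-in (used so the sketch runs without Numpy) are all hypothetical names chosen for illustration.

```python
import functools
from typing import Any, Callable, List, Tuple

Condition = Callable[[Any], bool]
Formatter = Callable[[Any], str]

# Hypothetical module-level registry of (condition, formatter) pairs.
_CUSTOM_FORMATTERS: List[Tuple[Condition, Formatter]] = []


def set_custom_formatter(condition: Condition, formatter: Formatter) -> Formatter:
    """Register a formatter; returning it lets the call double as a decorator."""
    _CUSTOM_FORMATTERS.append((condition, formatter))
    return formatter


def to_repr(obj: Any) -> str:
    """Use the first matching custom formatter, else fall back to repr()."""
    for condition, formatter in _CUSTOM_FORMATTERS:
        if condition(obj):
            return formatter(obj)
    return repr(obj)


class FakeArray:
    """Stand-in for an array type, exposing only shape and dtype."""

    def __init__(self, shape, dtype):
        self.shape = shape
        self.dtype = dtype


@functools.partial(
    set_custom_formatter,
    lambda x: hasattr(x, "shape") and hasattr(x, "dtype"),
)
def array_formatter(x):
    return f"{type(x).__name__}(shape={x.shape}, dtype={x.dtype})"


summary = to_repr(FakeArray((10, 10, 10), "float64"))
plain = to_repr([1, 2, 3])  # no condition matches, so repr() is used
```

Because the condition is duck-typed on `shape` and `dtype`, one registration would cover Numpy and JAX arrays alike without touching either library's classes.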

What problem does it solve for you?

We're using rich.traceback.install(show_locals=True) in a large code base that makes heavy use of Numpy and JAX. While it's generally very helpful, it frequently prints arrays that take up a whole screen height in the stack trace. As a result, we find ourselves repeatedly commenting the rich traceback hook in and out.

For our own Python classes, we could just implement __repr__ or __rich_repr__. However, for Python objects of external dependencies, we'd have to monkey-patch these, which poses the risk of changing behavior in unexpected ways. More importantly, the methods cannot be overridden for np.ndarray or other types implemented in C unless the library provides a mechanism for that (which Numpy doesn't).
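The difference between pure-Python classes and C-implemented types can be seen with the built-in `dict`, used here as a stand-in so the snippet needs no third-party packages. The issue states that np.ndarray is similarly closed to this kind of patching; this sketch only demonstrates the built-in case.

```python
class PurePython:
    pass


# Assigning __repr__ on a class defined in Python works.
PurePython.__repr__ = lambda self: "patched"
patched_repr = repr(PurePython())

# Assigning __repr__ on a type implemented in C raises TypeError.
try:
    dict.__repr__ = lambda self: "patched"
    builtin_patch_error = None
except TypeError as exc:
    builtin_patch_error = exc
```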

Moreover, monkey-patching a solution in rich.pretty from the outside is difficult, because I believe the function that needs to be changed is to_repr() inside traverse() in rich/pretty.py, and changing a nested function from the outside does not work (because it is redefined each time the outer function runs).
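A toy illustration of why a nested function cannot be swapped out from the outside. `traverse_like` only mimics the nesting described above and is not rich's code; all names here are hypothetical.

```python
def traverse_like(obj):
    # The inner function is created anew on every call to the outer function.
    def to_repr(value):
        return repr(value)

    return to_repr(obj)


def short_repr(value):
    return "<hidden>"


# This only sets an unrelated attribute on the outer function object;
# the local name `to_repr` inside the function body is unaffected.
traverse_like.to_repr = short_repr

result = traverse_like([1, 2, 3])
```

The call still goes through the original inner function, so `result` is the plain repr of the list.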

The suggested feature would enable users to adjust the tracebacks (and pretty printing in general) to their needs, including formatting of Numpy arrays which is currently not possible, and without risk of affecting code behavior.

Jun 20 '24 15:06 danijar

Thank you for your issue. Give us a little time to review it.

PS. You might want to check the FAQ if you haven't done so already.

This is an automated reply, generated by FAQtory

Jun 20 '24 15:06 github-actions[bot]

Do you need any info from these specific locals? If so, then you can stop reading here.

But if not, there's already this mechanism for excluding certain locals from the traceback by name: https://github.com/Textualize/rich/blob/22c2cffd8e88181ad1162ca9098d190ec28c6996/rich/traceback.py#L436-L441 that could be extended with a new show_locals_predicate (or something) argument to traceback.install. (Alternatively, widen the type of show_locals to bool | Callable.)

# ... L440
if not show_locals_predicate(key, value):
    continue

That would feel pretty clean to me. This predicate could be of type

Callable[[str, object], bool]

Which would even allow you to do something like

def predicate(name, ref):
    try:
        return ref.__module__ not in ('numpy', 'jax')
    except AttributeError:
        return True

edit: Ofc it could also be a hide_locals_predicate, which might make more sense, now that I think about it.
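A rough sketch of how such a predicate could be applied to a frame's locals. `filter_locals` and `FakeNdarray` are hypothetical and nothing here is rich's actual code; the predicate checks the top-level package of the value's type, and listing `jaxlib` is an assumption about where concrete JAX array classes might be defined.

```python
from typing import Any, Callable, Dict


def filter_locals(
    frame_locals: Dict[str, Any],
    show_locals_predicate: Callable[[str, object], bool],
) -> Dict[str, Any]:
    """Keep only the locals for which the predicate returns True."""
    return {
        name: value
        for name, value in frame_locals.items()
        if show_locals_predicate(name, value)
    }


def predicate(name: str, ref: object) -> bool:
    # Compare the top-level package of the value's type, so that
    # submodules such as 'numpy.ma' would also be covered.
    top_level = type(ref).__module__.split(".")[0]
    return top_level not in ("numpy", "jax", "jaxlib")


class FakeNdarray:
    """Stand-in that pretends to be defined in the numpy package."""

    __module__ = "numpy"


shown = filter_locals({"x": 1, "arr": FakeNdarray(), "label": "a"}, predicate)
```

A hide_locals_predicate variant would simply invert the condition in the dict comprehension.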

Sep 08 '24 12:09 leogott