pdoc Avoid expansion of typenames into long expressions, e.g., `numpy.typing.ArrayLike`

Packages often include typenames that expand to long `Union[...]` definitions. Examples include `ArrayLike` and `DTypeLike` from `numpy.typing`.

For this sample program

from typing import Any

import numpy as np
import numpy.typing as npt

def f1(a: npt.ArrayLike, dtype: npt.DTypeLike) -> np.ndarray[Any, Any]:
  return np.asarray(a).astype(dtype)

the pdoc output looks awful:

def f1(
	a: Union[numpy._array_like._SupportsArray[numpy.dtype], numpy._nested_sequence._NestedSequence[numpy._array_like._SupportsArray[numpy.dtype]], bool, int, float, complex, str, bytes, numpy._nested_sequence._NestedSequence[Union[bool, int, float, complex, str, bytes]]],
	dtype: Union[numpy.dtype[Any], NoneType, Type[Any], numpy._dtype_like._SupportsDType[numpy.dtype[Any]], str, Tuple[Any, int], Tuple[Any, Union[SupportsIndex, Sequence[SupportsIndex]]], List[Any], numpy._dtype_like._DTypeDict, Tuple[Any, Any]]
) -> numpy.ndarray[typing.Any, typing.Any]:

Is there any way to direct pdoc to show shorter typenames?

We cannot use typing.NewType or typing.TypeVar because the Union[...] is not subclassable.
Assigning the type to a local constant unfortunately does not help:

ArrayLike = npt.ArrayLike

Similarly, declaring a local TypeAlias also does not help:

import typing
ArrayLike: typing.TypeAlias = npt.ArrayLike

Even the ugly approach of deferring the type using a string does not help:

def f2(a: 'npt.ArrayLike', dtype: 'npt.DTypeLike') -> np.ndarray[Any, Any]: ...

Can you think of any other workaround or solution? Ideally, any local typename constant could remain unexpanded. (TypeAlias is not ideal because it is only available from Python 3.10 onward.) Perhaps pdoc could look for metadata within typing.Annotated?
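For reference, the loss of the alias name can be reproduced without numpy. This is a minimal sketch using only the standard library; the `ArrayLike` below is a stand-in of my own, not numpy's definition.

```python
import inspect
from typing import Sequence, Union

# Stand-in for a long third-party alias (NOT numpy's real definition).
ArrayLike = Union[int, float, complex, Sequence[int], Sequence[float]]


def f1(a: ArrayLike) -> None:
    """Toy function annotated with the local alias."""


# The annotation object stored on the function *is* the Union itself,
# so nothing remembers that it was spelled "ArrayLike" in the source.
signature_text = str(inspect.signature(f1))
print(signature_text)
```

Any tool that formats the evaluated annotation object, rather than the source text, sees only the expanded form.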

Jul 30 '22 05:07 hhoppe

Thanks for the very convincing example. I agree we should do better here, although it's not quite trivial as we heavily rely on dynamic instrumentation. Maybe it's worth prototyping some code that extracts the verbatim annotation from the AST to see how viable that is.
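As a rough illustration of what extracting the verbatim annotation could look like, here is a standalone sketch using only the standard `ast` module. It is not pdoc code, and `verbatim_annotations` is a hypothetical helper name.

```python
import ast

SOURCE = '''
def f1(a: npt.ArrayLike, dtype: npt.DTypeLike | None = None) -> np.ndarray:
    ...
'''


def verbatim_annotations(source: str, func_name: str) -> dict:
    """Map parameter names (and "return") to the annotation text as written."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef) and node.name == func_name:
            result = {}
            args = node.args
            # *args / **kwargs are ignored to keep the sketch short.
            for arg in args.posonlyargs + args.args + args.kwonlyargs:
                if arg.annotation is not None:
                    result[arg.arg] = ast.get_source_segment(source, arg.annotation)
            if node.returns is not None:
                result["return"] = ast.get_source_segment(source, node.returns)
            return result
    return {}


annotations = verbatim_annotations(SOURCE, "f1")
print(annotations)
```

The source is only parsed, never executed, so `npt` and `np` do not need to be importable. Turning the recovered text back into linked names is the hard part and is not attempted here.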

Jul 30 '22 09:07 mhils

Thanks for the quick response! I just did some research and found that others had the same issue with Sphinx / autodoc:

https://stackoverflow.com/a/67483317 mentions postponed evaluation of annotations. (I am already running Python 3.10 and also just tried including `from __future__ import annotations`, but by itself this does not help; does it require some change in pdoc?) I like the Sphinx approach of a type_aliases dictionary.

Jul 30 '22 14:07 hhoppe

Basing this on postponed evaluations is an interesting trick - that way we don't need to do any AST shenanigans. It still requires some substantial changes to how we render function signatures, but maybe we can make that work. The key part is that we currently evaluate all type hints here, before we format the signature. In the first step we need to retain the original string somewhere and then we need to figure out how we can use that as the link text. Not trivial. :)
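To make the "retain the original string" step concrete, here is a small sketch with a stand-in alias (not numpy's definition). Quoted annotations are used because they are stored as strings, the same way `from __future__ import annotations` stores every annotation.

```python
import typing
from typing import Sequence, Union

# Stand-in alias for illustration only (not numpy's definition).
ArrayLike = Union[int, float, Sequence[float]]


# Quoting the annotation stores it as a plain string on the function.
def f2(a: "ArrayLike") -> "None":
    """Toy function with string (postponed) annotations."""


raw = f2.__annotations__               # annotation text as written
evaluated = typing.get_type_hints(f2)  # strings resolved to objects

print(raw["a"])
print(evaluated["a"])
```

The short name survives in `raw` but is gone from `evaluated`, so a renderer would have to carry both along if it wants the short text with links to the resolved types.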

Jul 30 '22 21:07 mhils

Upon further inspection, here are the options we have:

1a) Extend `inspect.Signature` to include annotation text (using postponed annotations), then apply some heuristics.

This approach promises the best results, but upon prototyping I noticed tons of edge cases and pitfalls. For example, we also want to consider a type annotation like npt.ArrayLike | None, but handling that properly means we need to implement manual parsing and reassembly for arbitrary type annotations. I'm afraid this goes beyond the time I have available for this project.

(draft: https://github.com/mitmproxy/pdoc/compare/main...mhils:pdoc:better-annotations-experiment)

1b) Use heuristics that don't depend on an understanding of the type annotation.

Similar to 1a) we could just check if the rendered type annotation exceeds a certain number of characters and then fall back to whatever the literal text is. This will of course always be a tradeoff.

2) Hardcode popular edge cases

A bit less ambitious, we can just hardcode a few common cases such as numpy.typing.ArrayLike. I've prototyped this in https://github.com/mitmproxy/pdoc/compare/main...mhils:pdoc:better-annotations-experiment-2 and the implementation is super straightforward. The downside is that it won't work out of the box for your own custom codebase, but we can easily cover popular libraries such as numpy. If your own code has those massive annotations maybe you deserve it after all. 😛
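For comparison, the length-based fallback from 1b) fits in a few lines. This is only a sketch: `choose_annotation_text` and the `max_len` threshold of 80 are made up for illustration and are not part of either prototype.

```python
def choose_annotation_text(rendered: str, literal: str, max_len: int = 80) -> str:
    """Prefer the rendered annotation unless it is too long.

    `rendered` is the fully expanded annotation text, `literal` is the text
    as written in the source (empty if unknown).
    """
    if len(rendered) > max_len and literal:
        return literal
    return rendered


short = choose_annotation_text("int", "MyInt")
long_ = choose_annotation_text("Union[" + ", ".join(["int"] * 40) + "]", "npt.ArrayLike")
print(short)
print(long_)
```

The tradeoff is visible even here: a long annotation that is not an alias at all would also be replaced by its literal text and lose its links.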

@hhoppe, any thoughts?

Aug 02 '22 16:08 mhils

The code in (1a) is intricate and I don't have the context to understand it well; I can see that it would become complicated to parse and reassemble the type strings.

The code in (2) is very nice. (I hadn't seen the use of a lambda as a replacement; neat!) It's great to use formatannotation(DTypeLike) so as to be robust to changes in the third-party libraries. Would it be feasible to allow a command-line parameter (JSON Dict[str, str]?) to extend or override the replacements dictionary? It would be nice to adjust many things, e.g., whether one prefers typing.Any or plain Any.
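My understanding of the idea, as a standalone sketch rather than the prototype's actual code: the dictionary key is whatever `inspect.formatannotation` produces for the alias on the running installation, and a lambda looks up the replacement for each regex match. The `ArrayLike` here is a stand-in, not numpy's definition.

```python
import inspect
import re
from typing import Sequence, Union

# Stand-in for a long third-party alias (NOT numpy's real definition).
ArrayLike = Union[int, float, complex, Sequence[int], Sequence[float]]

# Key: the expanded text as this Python version renders it.
# Value: the short name to show instead.
replacements = {inspect.formatannotation(ArrayLike): "ArrayLike"}

# One alternation over all escaped keys, longest first so that a longer
# expansion wins over a shorter one that happens to be its prefix.
pattern = re.compile(
    "|".join(re.escape(k) for k in sorted(replacements, key=len, reverse=True))
)


def simplify(text: str) -> str:
    """Replace every known long expansion in `text` with its short name."""
    return pattern.sub(lambda m: replacements[m.group(0)], text)


def f1(a: ArrayLike, scale: float = 1.0) -> None:
    """Toy function annotated with the alias."""


simplified = simplify(str(inspect.signature(f1)))
print(simplified)
```

Because the key is computed rather than hardcoded, it keeps matching if the library changes the members of the Union.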

Aug 02 '22 17:08 hhoppe

> Would it be feasible to allow a command-line parameter (json Dict[str, str]?) to extend or override the replacements dictionary?

I'd generally like to keep the CLI surface as small as possible so that pdoc remains simple. I'm currently leaning towards this not crossing the bar, but I'll ponder on it for a bit.

In either case, a make.py like this will remain possible and supported:

from pdoc import pdoc, doc_types

doc_types.simplify_annotation.replacements["A"] = "B"
doc_types.simplify_annotation.recompile()

pdoc(...)

Aug 02 '22 18:08 mhils

The make.py approach sounds wonderful. It's also a nice place to specify many settings (like logo, favicon, etc.) rather than having an unwieldy command line.

Aug 02 '22 18:08 hhoppe

pdoc.render.configure is your friend then! 😃

Aug 02 '22 18:08 mhils

We may want simplify_annotation = _AnnotationReplacer() without the .__call__ so that we can later access the class instance. (I think the __call__ member will get called automatically in simplify_annotation().)

Aug 02 '22 22:08 hhoppe

Is there any reason to not do the desired behavior when a user is on a newer Python and is fine with using TypeAlias? The fact that the alias has that annotation should be inspectable dynamically.

Jun 26 '23 20:06 jgarvin

Are there any pending developments or interim proposed solutions for this item? It would be nice if there were a simple annotations flag of some sort which only returns the name (without path) of the annotation.

Oct 15 '23 09:10 songololo

I haven't tried with the type keyword, but using a TypeAliasType from typing-extensions in Python 3.11 will result in its name being used without resolution to the underlying type. This is a little heavier and not always as simple as using a TypeAlias, but I've still got it working for me. Unfortunately, TypeAliasType types don't seem to show up in the docs, making the substitution from the types shown in the docs to the actual types invisible. (Maybe wrong about that last bit, will confirm.)

Jan 31 '24 15:01 seankhl

A combination of from __future__ import annotations and the type statement should indeed do the trick. It's probably possible to extend this here:

https://github.com/mitmproxy/pdoc/blob/891605516a25c0f66dc184ce60754e76dbbe239e/pdoc/_compat.py#L25-L29

to support typing-extensions (try import for <3.12, fall back to current implementation if import does not work). Contributions are welcome!
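Roughly, the import fallback could look like the sketch below. It does not reproduce the linked lines of `_compat.py`; the placeholder class, the `HAVE_TYPE_ALIAS_TYPE` flag and the `alias_name` helper are assumptions made for illustration.

```python
try:
    from typing import TypeAliasType  # available in Python 3.12+
    HAVE_TYPE_ALIAS_TYPE = True
except ImportError:
    try:
        # Backport; only usable if typing-extensions is installed.
        from typing_extensions import TypeAliasType
        HAVE_TYPE_ALIAS_TYPE = True
    except ImportError:
        class TypeAliasType:
            """Placeholder so isinstance() checks work and are always False."""

        HAVE_TYPE_ALIAS_TYPE = False


def alias_name(annotation):
    """Return the alias name for a TypeAliasType instance, else None."""
    if isinstance(annotation, TypeAliasType):
        return annotation.__name__
    return None


print(alias_name(int))
if HAVE_TYPE_ALIAS_TYPE:
    print(alias_name(TypeAliasType("ArrayLike", int)))
```

The placeholder keeps later `isinstance` checks unconditional, so the rest of the code does not need to know which import succeeded.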

Jan 31 '24 20:01 mhils
