Lukasz Stafiniak
That would match CUDA terminology and would be less confusing.
Implementing #210 will produce such repetitions.
That's as informative as including it in the inline-declared tensors.
The big blocker is support for half-precision and precision conversions in ocaml-gccjit. A smaller issue is the need for API support for taking a bigarray pointer without using `Ctypes.bigarray_start`, because `ctypes`...
A new system for deciding which arrays should be shared across virtual devices.
In particular, this will make it explicitly controllable and more visible. Currently, the classification is implicitly implemented in the CUDA backend.
https://github.com/NousResearch/DisTrO
Currently, run-time log-level hiding only saves on the conversion to PrintBox.
### `ppx_minidebug.2.0.2`

Debug logs for selected functions and let-bindings

Formatted logs of let-bound values, function arguments and results; `if` and `match` branches taken. Optionally, as collapsible HTML trees with highlights....
The HTML backend falls short, as seen at https://github.com/c-cube/printbox/tree/main/src/printbox-ext-plot