Results 42 issues of Bryce Allen

The zero-length sarray implementation has an `operator[]` which cannot return: its implementation is a gtGpuAssert false, which throws an exception on host and aborts on device. The compiler...
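As a rough host-only illustration of an index operator that can never return, here is a minimal sketch. The names `small_array` and `fail_zero_length_access` are hypothetical and are not gtensor's `sarray` or `gtGpuAssert`; the sketch only shows the general C++ technique of routing the failure through a `[[noreturn]]` helper, so that a function with a non-void return type legitimately needs no `return` statement.

```cpp
#include <cstddef>
#include <stdexcept>

// Hypothetical stand-in for a fixed-size array; NOT gtensor's actual sarray.
template <typename T, std::size_t N>
struct small_array {
  T data[N];
  T& operator[](std::size_t i) { return data[i]; }
  constexpr std::size_t size() const { return N; }
};

// Hypothetical helper. [[noreturn]] tells the compiler that control never
// comes back from this call, so callers need no return statement after it.
[[noreturn]] inline void fail_zero_length_access() {
  throw std::out_of_range("operator[] called on zero-length array");
}

// Zero-length specialization: there is no element to hand out a reference to,
// so every call to operator[] fails (host-side behavior only in this sketch).
template <typename T>
struct small_array<T, 0> {
  T& operator[](std::size_t) { fail_zero_length_access(); }
  constexpr std::size_t size() const { return 0; }
};
```

In this sketch, indexing a `small_array<int, 0>` throws `std::out_of_range`, while the general template indexes normally. Device-side abort behavior is not modeled here.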

With the new default backend (not using thrust), it's not easy to print a single value or small range of values out of a device gtensor object. Possible APIs: -...

enhancement
good first issue

Why is `is_device_accessible` for example in `gt::backend::clib`, rather than just `gt::backend`?

nvcc builds the c_test_fortran.cxx example just fine, but hipcc from ROCm 5.3.0 does not find the header, which is actually provided by gfortran. This may require hacking include paths to work around.

GENE supports 64-bit and 32-bit real values (128/64 complex). The current gt Fortran lib just uses compiler defaults. Should gtensor have an option for this, or just support...

Existing uses of `gpublas_set_stream` in GENE pass nullptr to reset to default stream behavior. The stream refactor broke this for the SYCL backend, where that has to be special cased to...

The fact that std::complex works at all in device code is specific to Intel's SYCL implementation, and not part of standard SYCL. Standardizing it is in progress, with a header...

The C layer for streams in gt-blas was originally designed to be a substitute for existing Fortran code calling cublas. As the API evolved to be more C++ and more...

Currently gtensor uses an `int`-based `shape_type`, while low-level indexing such as the `gtensor_storage` index operator uses `gt::size_type`, which is an alias for `std::size_t` (typically unsigned long). Furthermore, the `calc_size(shape)` helper returns...
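To make the width mismatch concrete, here is a small sketch of computing an element count from `int` extents using a `std::size_t` accumulator. The helper `calc_size_wide` is hypothetical and is not gtensor's `calc_size`; it only illustrates that a product of `int` extents can exceed the range of `int` while still fitting in `std::size_t` (assumed to be 64 bits wide here).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, NOT gtensor's calc_size: multiply int extents in a
// std::size_t accumulator so the product is not limited to the range of int.
inline std::size_t calc_size_wide(const std::vector<int>& shape) {
  std::size_t n = 1;
  for (int extent : shape) {
    // A negative extent would wrap to a huge value when cast to unsigned.
    assert(extent >= 0);
    n *= static_cast<std::size_t>(extent);
  }
  return n;
}
```

For example, a shape of 65536 by 65536 has 2^32 elements, which does not fit in a 32-bit `int` but is representable in a 64-bit `std::size_t`.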

We already support one multi-d reduction, `sum_axis_to`, which is implemented in a way that requires lots of segments to perform well (there is one thread per output array element, and...
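The "one thread per output element" pattern can be sketched on the host as follows. This is not gtensor's `sum_axis_to` implementation; the function name `sum_axis0`, the 2-d shape, and the row-major layout are all assumptions made for illustration. Each iteration of the outer loop stands in for one thread, which serially sums everything that maps to its output element.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side sketch: reduce a row-major (rows x cols) array over
// axis 0, producing one output per column.
inline std::vector<double> sum_axis0(const std::vector<double>& in,
                                     std::size_t rows, std::size_t cols) {
  assert(in.size() == rows * cols);
  std::vector<double> out(cols, 0.0);
  // Each value of j plays the role of one thread: it owns out[j] and walks
  // the whole reduced axis by itself.
  for (std::size_t j = 0; j < cols; ++j) {
    double acc = 0.0;
    for (std::size_t i = 0; i < rows; ++i) {
      acc += in[i * cols + j];
    }
    out[j] = acc;
  }
  return out;
}
```

With this structure the available parallelism equals the number of output elements, so a reduction down to only a few outputs leaves most of the device idle while each worker does a long serial sum.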