Bryce Allen issues

Results: 42 issues by Bryce Allen

missing return statement warning

The zero-length sarray implementation has an `operator[]` which cannot return: its implementation is a gtGpuAssert false, which throws an exception on host and aborts on device. The compiler...
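One common way to keep a never-returning function from triggering a missing-return warning can be sketched as follows. This is a host-only illustration, not gtensor's code: `fail_index` and `empty_array` are hypothetical names, and whether a given device compiler honors `[[noreturn]]` for this diagnostic is an assumption that would need checking.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical helper (not part of gtensor). Marking it [[noreturn]] tells
// the compiler that control never comes back from this call.
[[noreturn]] inline void fail_index(const char* msg)
{
  throw std::out_of_range(msg);
}

// Hypothetical stand-in for a zero-length fixed-size array type.
template <typename T>
struct empty_array
{
  static constexpr std::size_t size() { return 0; }

  // There is no element to reference, so every call fails. Because the
  // only statement calls a [[noreturn]] function, no dummy return value
  // is written after it.
  T& operator[](std::size_t)
  {
    fail_index("index into zero-length array");
  }
};
```

An alternative is to keep the assert and add an unreachable dummy return, which avoids relying on attribute support at the cost of dead code.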

device debug print helper

With the new default backend (not using thrust), it's not easy to print a single value or small range of values out of a device gtensor object. Possible APIs: -...

enhancement

good first issue

clib namespace is confusing

Why is `is_device_accessible`, for example, in `gt::backend::clib` rather than just `gt::backend`?

fortran: rocm does not find ISO_Fortran_binding.h header

nvcc builds the c_test_fortran.cxx example just fine, but hipcc from ROCm 5.3.0 does not find the header, which is actually provided by gfortran. This may require hacking include paths to work around.

fortran: add cmake option for complex/real sizes

GENE supports 64-bit and 32-bit real values (128/64-bit complex). The current gt Fortran lib just uses compiler defaults. Should gtensor have an option for this, or just support...

cgtblas: sycl backend does not handle nullptr case

Existing use of `gpublas_set_stream` in GENE passes nullptr to reset to default stream behavior. The stream refactor broke this for the SYCL backend, where that has to be special cased to...

sycl: use complex extension

The fact that std::complex works at all in device code is specific to Intel's SYCL implementation, and not part of standard SYCL. Standardizing it is in progress, with a header...

improve C API for streams

The C layer for streams in gt-blas was originally designed to be a substitute for existing Fortran code calling cublas. As the API evolved to be more C++ and more...

consistent size and index types

Currently gtensor uses an `int`-based `shape_type`, while low-level indexing such as the `gtensor_storage` index operator uses `gt::size_type`, which is an alias for `std::size_t` (typically unsigned long). Furthermore, the `calc_size(shape)` helper returns...
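To illustrate why mixing `int` extents with `std::size_t` indices can surprise, here is a minimal standalone sketch. The names `to_index` and `product_as_size_t` are hypothetical and do not reflect gtensor's actual helpers or their return types.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical conversion from a signed extent or offset to an unsigned
// index. A negative input does not fail; it wraps to a very large value.
inline std::size_t to_index(int i)
{
  return static_cast<std::size_t>(i);
}

// Hypothetical size helper: multiplies int extents in std::size_t, so the
// running product is accumulated in the wider unsigned type rather than
// in int.
inline std::size_t product_as_size_t(const std::vector<int>& shape)
{
  std::size_t n = 1;
  for (int extent : shape) {
    n *= static_cast<std::size_t>(extent);
  }
  return n;
}
```

With these definitions, `to_index(-1)` equals the largest `std::size_t` value, which is the kind of silent wraparound a single consistent index type would rule out.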

add multi-d reductions

We already support one multi-d reduction, `sum_axis_to`, which is implemented in a way that requires lots of segments to perform well (there is one thread per output array element, and...
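The "one thread per output element" pattern can be sketched serially on the host. This is not the `sum_axis_to` implementation; `sum_rows` is a hypothetical function in which each outer-loop iteration plays the role of one thread, to show that the available parallelism equals the number of output elements.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sum a row-major (nrows x ncols) matrix over its column axis, producing
// one value per row. Hypothetical illustration only.
inline std::vector<double> sum_rows(const std::vector<double>& in,
                                    std::size_t nrows, std::size_t ncols)
{
  assert(in.size() == nrows * ncols);
  std::vector<double> out(nrows, 0.0);
  // Each iteration of the outer loop stands in for one thread: it owns a
  // single output element and walks the whole reduced axis by itself.
  for (std::size_t r = 0; r < nrows; ++r) {
    double acc = 0.0;
    for (std::size_t c = 0; c < ncols; ++c) {
      acc += in[r * ncols + c];
    }
    out[r] = acc;
  }
  return out;
}
```

With few rows and many columns, only a handful of "threads" would do all the work, which is one reason a scheme like this needs many output elements to perform well.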