Gao, Xiang

Results 25 issues of Gao, Xiang

This file: https://github.com/zergtant/pytorch-handbook/blob/master/chapter1/1_tensor_tutorial.ipynb

```C++
#include
#include
#include

__global__ void lowerbound(float inp_val) {
  constexpr int size = 6;
  float a[size] = {0.1, 0.2, 0.4, 0.6, 0.8, 1.};
  auto result = thrust::lower_bound(
      thrust::device, a, a...
```

type: bug: functional
P1: should have
helps: pytorch
backend: CUDA
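
For comparison with the kernel snippet above, here is a minimal host-side sketch of what a lower-bound search is expected to return for the same six values. It assumes only the standard definition of lower bound (the first position whose element is not less than the query) and uses Python's `bisect_left`, which follows that definition. The helper name `lower_bound_index` and the query values are hypothetical, and Python floats are double precision, whereas the kernel uses `float`.

```python
from bisect import bisect_left

# Same sorted values as the array `a` in the kernel snippet.
a = [0.1, 0.2, 0.4, 0.6, 0.8, 1.0]


def lower_bound_index(sorted_values, value):
    """Return the index of the first element that is not less than `value`."""
    return bisect_left(sorted_values, value)


# Hypothetical query values chosen to cover: below the range, an exact hit,
# a value between two elements, and a value above the range.
for inp_val in (0.05, 0.4, 0.5, 2.0):
    print(inp_val, lower_bound_index(a, inp_val))
```

A device result that disagrees with these indices for the same inputs would be worth a closer look.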

I am trying `cub::BlockRadixSort` with PyTorch. It performs well, but I find it hard to use. For example, if I want to sort 1023 elements, then I...

type: enhancement
P2: nice to have
helps: pytorch
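
The excerpt above is cut off, so the exact difficulty is not stated. Assuming it relates to the block-wide sort operating on a fixed tile of `BLOCK_THREADS * ITEMS_PER_THREAD` keys, one common workaround is to pad the input with a sentinel that sorts last and to drop the padding afterwards. The sketch below only illustrates that idea on the host: `sort_with_fixed_tile` is a hypothetical helper, the tile shape (128 threads, 8 items each) is an assumed configuration, and the built-in `sort` stands in for the block-wide sort.

```python
import math


def sort_with_fixed_tile(keys, block_threads=128, items_per_thread=8):
    """Sort `keys` through a fixed-size tile by padding with +inf sentinels."""
    tile = block_threads * items_per_thread  # keys handled by one block
    if len(keys) > tile:
        raise ValueError("input does not fit in a single tile")
    # Pad up to the tile size; +inf sorts after every finite key.
    padded = list(keys) + [math.inf] * (tile - len(keys))
    padded.sort()  # stand-in for the block-wide sort
    # The sentinels end up at the back, so the first len(keys) entries are the result.
    return padded[:len(keys)]


# 1023 keys in scrambled order (37 is coprime with 1023, so this is a permutation).
keys = [(i * 37) % 1023 for i in range(1023)]
result = sort_with_fixed_tile(keys)
```

The sentinel has to compare greater than or equal to every real key, which is why this sketch would not work unchanged for inputs containing NaN.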

Currently, `cub::DeviceSegmentedRadixSort` launches `num_segments` blocks, and each block works on one segment. This approach performs poorly when the number of segments is small: https://github.com/pytorch/pytorch/issues/63456. For a small number...

type: enhancement
area: performance
P3: backlog
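
To make the operation concrete, here is a small host-side sketch of what a segmented sort computes, plus one commonly used alternative that gets the same result from a single global sort keyed on `(segment_id, key)`. The excerpt above is truncated, so the second function is not claimed to be what the issue proposes. Both function names are hypothetical, and the sketch assumes `offsets[0] == 0` and `offsets[-1] == len(keys)`.

```python
def segmented_sort(keys, offsets):
    """Sort each segment keys[offsets[i]:offsets[i + 1]] independently."""
    out = list(keys)
    for begin, end in zip(offsets[:-1], offsets[1:]):
        out[begin:end] = sorted(out[begin:end])
    return out


def segmented_sort_via_global_sort(keys, offsets):
    """Same result from one global sort over (segment_id, key) pairs."""
    segment_ids = []
    for seg, (begin, end) in enumerate(zip(offsets[:-1], offsets[1:])):
        segment_ids.extend([seg] * (end - begin))
    # Sorting by segment id first keeps segments in place; keys order within each.
    pairs = sorted(zip(segment_ids, keys))
    return [key for _, key in pairs]


keys = [5, 1, 4, 9, 2, 8, 3]
offsets = [0, 3, 3, 7]  # three segments; the middle one is empty
print(segmented_sort(keys, offsets))
print(segmented_sort_via_global_sort(keys, offsets))
```

The first form maps naturally onto one worker per segment; the second keeps all workers busy regardless of how few segments there are, at the cost of sorting a wider key.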

Currently, `cub::DeviceRadixSort` only supports operating on pointers:

```C++
template <typename KeyT, typename ValueT>
static CUB_RUNTIME_FUNCTION cudaError_t SortPairs(
    void *d_temp_storage, size_t &temp_storage_bytes,
    const KeyT *d_keys_in, KeyT *d_keys_out,
    const ValueT *d_values_in, ValueT *d_values_out,
    int num_items, int...
```

type: enhancement
P2: nice to have
helps: pytorch

Fixes https://github.com/NVIDIA/cccl/issues/868

P3: backlog
helps: pytorch

Because writing something like

```python
atomic_energies.sum(dim='atoms')
```

is much more readable than

```python
atomic_energies.sum(1)
```
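
As a rough illustration of the idea, the sketch below wraps a 2D list in a tiny class that accepts either a dimension name or an integer index. This is not the TorchANI or PyTorch implementation. The class `NamedTensor2D`, the dimension name `molecules`, and the sample values are all hypothetical; only the `sum(dim='atoms')` and `sum(1)` call shapes come from the text above.

```python
class NamedTensor2D:
    """Minimal 2D container whose dimensions can be addressed by name."""

    def __init__(self, rows, names):
        if len(names) != 2:
            raise ValueError("exactly two dimension names are required")
        self.rows = [list(row) for row in rows]
        self.names = tuple(names)

    def _axis(self, dim):
        # Accept either a dimension name or a plain integer index.
        return self.names.index(dim) if isinstance(dim, str) else dim

    def sum(self, dim):
        axis = self._axis(dim)
        if axis == 0:
            return [sum(column) for column in zip(*self.rows)]
        if axis == 1:
            return [sum(row) for row in self.rows]
        raise ValueError(f"invalid dimension: {dim!r}")


# Two molecules with three atoms each (made-up energies).
atomic_energies = NamedTensor2D(
    [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]],
    names=("molecules", "atoms"),
)
print(atomic_energies.sum(dim='atoms'))
print(atomic_energies.sum(1))
```

Both calls reduce over the same axis; the named form simply records the intent at the call site.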

- [x] Add 3D structures from NIST https://github.com/aiqm/torchani/pull/146
- [ ] Add more off-equilibrium structures, reactions
- [x] Test structure optimization https://github.com/aiqm/torchani/pull/153