Danial Javady

Results 16 issues of Danial Javady

**Describe the bug** It takes around 6-7 minutes before `./build.sh libcudf tests` will begin building targets. There is a lot of preparation work that is done. This can be unproductive...

bug
4 - Waiting on Author
Needs Triage

On my 3080 **BEFORE** line: `int n = npq_offset / (p_ * q_);` translates to [before_first_line_sass.txt](https://github.com/NVIDIA/cutlass/files/14826990/before_first_line_sass.txt) line: `int residual = npq_offset % (p_ * q_);` translates to [before_second_line_sass.txt](https://github.com/NVIDIA/cutlass/files/14826999/before_second_line_sass.txt) (I'll omit...

inactive-30d
inactive-90d

https://github.com/NVIDIA/cutlass/blob/main/include/cutlass/epilogue/threadblock/predicated_tile_iterator_strided_dgrad.h#L315-L318 This piece of code can be replaced by using fast divmod. The same can be applied to the store function below.
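For readers unfamiliar with the term, "fast divmod" replaces a hardware integer divide by a runtime-constant divisor with a precomputed multiply and shift. The block below is a minimal standalone sketch of that general idea, not the code CUTLASS ships: the name `FastDivmodSketch` is hypothetical, and it assumes both the dividend and the divisor are below 2^31 (non-negative `int` range).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper for illustration only; not the CUTLASS API.
// Precomputes a multiplier and shift so that n / divisor becomes
// one 64-bit multiply and one right shift.
struct FastDivmodSketch {
  std::uint32_t divisor;
  std::uint64_t multiplier;  // ceil(2^shift / divisor)
  std::uint32_t shift;       // 31 + ceil(log2(divisor))

  explicit FastDivmodSketch(std::uint32_t d)
      : divisor(d), multiplier(0), shift(0) {
    // Assumption: divisor fits in the non-negative int range.
    assert(d >= 1 && d < (1u << 31));
    std::uint32_t log2_ceil = 0;
    while ((1u << log2_ceil) < d) {
      ++log2_ceil;
    }
    shift = 31 + log2_ceil;
    multiplier = ((std::uint64_t{1} << shift) + d - 1) / d;
  }

  // Computes quotient = n / divisor and remainder = n % divisor
  // without a divide instruction. Assumes n < 2^31.
  void divmod(std::uint32_t n, std::uint32_t& quotient,
              std::uint32_t& remainder) const {
    assert(n < (1u << 31));
    quotient = static_cast<std::uint32_t>((n * multiplier) >> shift);
    remainder = n - quotient * divisor;
  }
};
```

In a setting like the one described, the object would be built once on the host for a divisor such as `p_ * q_`, and each thread would then call `divmod` instead of issuing separate `/` and `%` operations.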

question
inactive-30d

Fixes #121965 This PR hopes to add support for complex numbers in the scatter/gather related kernels. For brevity, I will only include `complex` for now as `complex`, for example, will be...

triaged
module: mkldnn
open source
Merged
Reverted
ciflow/trunk
release notes: python_frontend
topic: improvements

Fixes #117122 This PR adds the logic so that, in the case of rank-deficient matrices, it can fall back to an SVD backend for batched mode. A big thank you...

triaged
module: mkldnn
open source
release notes: sparse

Many APIs currently used in the dnn module were removed in cuDNN 9 after being deprecated in cuDNN 8. This PR updates said code accordingly...

category: gpu/cuda (contrib)
category: dnn

While trying to understand Thrust's `complex`, I noticed a bunch of useless, outdated macros that can be removed.

Fixes #111824 Currently, if the user specifies NHWC format for their group normalization, PyTorch will default to NCHW tensors and convert. This conversion...

open source

Summary: 1) Only the `insert` and `contains` functions are added for now 2) The data structure is placed in a temporary `experimental` namespace to avoid having to change more areas of the code

Needs Review
topic: dynamic_map
type: improvement

Attempting to build CCCL without sccache installed will lead to `Notice: sccache is not available. Skipping...`. I personally run into trouble with devcontainers, which is how I discovered this...