Lifann

Results 12 issues of Lifann

## Checklist before submitting - [ x ] Did you read the [contributor guide](https://github.com/horovod/horovod/blob/master/CONTRIBUTING.md)? - [ ] Did you update the docs? - [ ] Did you write any tests...

wontfix

**Environment:** 1. Framework: (TensorFlow, Keras, PyTorch, MXNet) TensorFlow 2. Framework version: tensorflow-2.11 3. Horovod version: horovod-2.28.1 4. MPI version: openmpi-4.1.2a1-1.54103.x86_64 5. CUDA version: cuda-11.2 6. NCCL version: nccl-2.18 7. Python...

bug

### Background In recommender system training, the user/item/history features can be very large in production. Treated as a multi-level cache, HPS can store large sparse parameters well, with...

question

…iables with different properties # Description Brief Description of the PR: When multiple variables share the same name with different properties in graph mode, reusing `Variable` in Python makes only the...

Draft code to apply HKV to TFRA.

# Description Brief Description of the PR: Since dynamic embeddings can be very large relative to the memory limit, saving and loading with the traditional TensorFlow checkpoint mechanism will use a lot of...

opt(insert-and-evict): thrust prefix_sum introduces `cudaMalloc` and `cudaFree` calls, which force a device sync. Replace it with the cub API. The output of the unit test case `insert-and-evict` is as follows: [ut_output.txt](https://github.com/NVIDIA-Merlin/HierarchicalKV/files/13648655/ut_output.txt)

## Motivation In recent years, the volume of model parameters has been growing larger and larger. And state-of-the-art models keep becoming more complex...

enhancement

Here are the costs in microseconds of `dump_kernel` and `dump_kernel_v2` with both pinned host and device output on a 2^24 capacity table with half of the contents exported. The table...