chongxiaoc
Call `pyarrow.jemalloc_set_decay_ms(0)`. Signed-off-by: Chongxiao Cao
The current implementation of `make_petastorm_dataset` for TensorFlow doesn't support multiple iterations: https://github.com/uber/petastorm/blob/7f37e8dde6ff1b13f055d22a6289e2de8bb5d473/petastorm/tf_utils.py#L370 The recommended workaround is to set the reader's `num_epochs` > 1 to support multiple iterations. This will cause possible duplication and drop...
I think `BatchedDataLoader` handles the case where files are larger than memory, so it streams rows from disk into memory and shuffles the data along the way. However, if in-memory...
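The streaming-plus-shuffling behavior described in this item can be illustrated with a generic shuffle buffer. This is only a minimal sketch of the general technique, not Petastorm's actual `BatchedDataLoader` code; the function name `shuffle_stream` and its `capacity` and `seed` parameters are hypothetical.

```python
import random
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def shuffle_stream(rows: Iterable[T], capacity: int, seed: int = 0) -> Iterator[T]:
    """Yield rows in a randomized order while holding at most `capacity` rows.

    Hypothetical helper for illustration; not part of Petastorm.
    """
    if capacity < 1:
        raise ValueError("capacity must be at least 1")
    rng = random.Random(seed)
    buffer: List[T] = []
    for row in rows:
        buffer.append(row)
        if len(buffer) >= capacity:
            # Swap a randomly chosen row to the end and emit it, so the
            # buffer never grows beyond `capacity` rows.
            idx = rng.randrange(len(buffer))
            buffer[idx], buffer[-1] = buffer[-1], buffer[idx]
            yield buffer.pop()
    # The source is exhausted: shuffle and drain whatever is left.
    rng.shuffle(buffer)
    yield from buffer
```

A buffer like this only ever holds `capacity` rows, which suits data larger than memory. When the whole dataset already fits in memory, a single full shuffle of the rows is simpler and mixes them more thoroughly.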
The existing dataloader implementation is sequential; in the common use case, this limits training speed when the data shuffling step is slow. This happens when the user defines a fairly large `shuffling queue capacity`,...
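One common way to keep a slow loading or shuffling step from stalling training is to run it in a background thread and hand batches over through a bounded queue. The sketch below shows that general pattern using only the standard library; it is not the implementation proposed in this item, and the name `prefetch` and the `max_prefetch` parameter are hypothetical.

```python
import queue
import threading
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def prefetch(batches: Iterable[T], max_prefetch: int = 2) -> Iterator[T]:
    """Produce batches in a background thread while the caller consumes them.

    Hypothetical helper for illustration; not part of any library.
    """
    buf: queue.Queue = queue.Queue(maxsize=max_prefetch)

    def producer() -> None:
        try:
            for batch in batches:
                buf.put(("item", batch))  # blocks while the queue is full
            buf.put(("done", None))
        except Exception as exc:
            # Forward producer errors so the consumer sees them.
            buf.put(("error", exc))

    # Daemon thread: it will not keep the process alive if the consumer
    # stops iterating early.
    threading.Thread(target=producer, daemon=True).start()

    while True:
        kind, value = buf.get()
        if kind == "done":
            return
        if kind == "error":
            raise value
        yield value
```

With this pattern the training loop iterates over `prefetch(loader)` instead of `loader`, so the next batch is prepared while the current one is being processed. The bounded queue caps how many prepared batches are held in memory at once.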
I'm using Python 3.7 and CUDA 11.3. I tried `fbgemm-gpu` and `fbgemm-gpu-nightly` from pip; both failed on import: ``` [root@/ml-code/data/michelangelo/examples/torchrec_example/test #]pip3 show fbgemm-gpu Name: fbgemm-gpu Version: 0.1.2 Summary: UNKNOWN Home-page:...
**Describe the bug** ``` RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument index in...
The driver node and rank 0 use the same path to save and load weights in `ModelCheckpointCallback`. It is possible that the driver node and rank 0 are not on the same machine, or...