Stas Bekman

Results 664 comments of Stas Bekman

As I'm not part of the DeepSpeed team my vote won't count, but your benchmarks are super-impressive and I'd say definitely go for it. I will let @tjruwase chime...

I was just pointed here by @mariosasko, meanwhile I found a workaround using `encode_example` like so:

```
from datasets import load_from_disk, Dataset
DATASET_PATH = "/hf/m4-master/data/cm4/cm4-10000-v0.1"
ds1 = load_from_disk(DATASET_PATH)
ds2 =...
```

Hmm, interesting. If I create the dataset on the fly:

```
from datasets import load_from_disk, Dataset
DATASET_PATH = "/hf/m4-master/data/cm4/cm4-10000-v0.1"
ds1 = load_from_disk(DATASET_PATH)
ds2 = Dataset.from_dict(mapping={k: [v]*2 for k, v in...
```

That would be very useful, thank you, @benfred! Additionally, controlling how many generations of ancestry to descend might be useful as well. E.g. currently I have a need to...

btw, this whole `sudo`-requirement seems to be a 5.x linux kernel thing. On one HPC I had no problem attaching w/o `sudo`, but discovered it was 4.x kernel!

Thank you for the ptrace setting insight, @Jongy! Now I understand why the `sudo` was needed and that it had nothing to do with the kernel version! The problem with...
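For readers hitting the same `sudo` question, here is a minimal sketch of how one might inspect the ptrace setting from Python. It assumes the setting in question is the Yama `kernel.yama.ptrace_scope` sysctl (the comment above is truncated and does not name it), and the helper names `read_ptrace_scope` and `attach_needs_privileges` are made up for this illustration.

```python
from pathlib import Path
from typing import Optional

# Assumption: the "ptrace setting" is the Yama LSM sysctl exposed at this path.
PTRACE_SCOPE_PATH = Path("/proc/sys/kernel/yama/ptrace_scope")

# Meanings as described in the kernel's Yama documentation.
SCOPE_MEANINGS = {
    0: "classic: a process may attach to other processes running under the same uid",
    1: "restricted: only descendants (or explicitly allowed tracers) can be attached to",
    2: "admin-only: attaching requires CAP_SYS_PTRACE",
    3: "disabled: no attaching at all until reboot",
}


def read_ptrace_scope(path: Path = PTRACE_SCOPE_PATH) -> Optional[int]:
    """Return the current ptrace_scope value, or None if it can't be read
    (e.g. Yama is not enabled or this isn't Linux)."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def attach_needs_privileges(scope: Optional[int]) -> bool:
    """Hypothetical helper: would attaching to an already running,
    non-child process of the same user need extra privileges?"""
    return scope is not None and scope >= 1


if __name__ == "__main__":
    scope = read_ptrace_scope()
    print("ptrace_scope:", scope, "-", SCOPE_MEANINGS.get(scope, "unknown / not available"))
    print("attach likely needs sudo:", attach_needs_privileges(scope))
```

A value of `1` or higher would explain why attaching a profiler to an existing process fails without `sudo` on one machine but works on another, independent of the kernel version.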

You redacted the type from the warning - was it `nn.Parameter`? If so it has been fixed here: https://github.com/microsoft/DeepSpeed/pull/2642 The fix will work for any `torch.Tensor` subclass. If it's another...

Fantastic. I'm glad you presented a concrete case - actually maybe a new Issue would be better, since mine is abstract and mentions several unrelated issues in one so...

It's very helpful to see the code - thank you, @yakazimir. This definitely has nothing to do with PL and it's a pure DS end-user model issue. OK, so...

Your code snippet is perfect, @yakazimir. I just can't see that custom object, but I assume that its tensors didn't have `requires_grad=False`. If they did then there is no problem....