Oleg Sinavski
I’m getting the same issue - images are being processed down to 0 bytes. It seems to depend on the exact size and the format (jpg/png). Removing the `--ignoreCache` flag helped...
Hello, I had the same problem and found that in my case I was generating token ids out of the bounds of the tokenizer. The solution is to remove those: ``` y...
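The original snippet is truncated, but the fix described above can be sketched roughly like this (hypothetical names - `clamp_out_of_vocab`, `vocab_size`, and `unk_id` are my own, not from the thread):

```python
def clamp_out_of_vocab(ids, vocab_size, unk_id=0):
    # Replace any token id outside [0, vocab_size) with a known-safe id
    # (e.g. the tokenizer's <unk>/<pad> id), so the embedding lookup and
    # loss never index past the vocabulary.
    return [unk_id if (i < 0 or i >= vocab_size) else i for i in ids]

# Example: id 32005 is out of bounds for a 32000-entry vocab.
print(clamp_out_of_vocab([1, 5, 32005, 7], vocab_size=32000))  # [1, 5, 0, 7]
```

Depending on your setup, you may prefer to drop such ids entirely rather than clamp them; either way, the key is to sanitize before the ids reach the model.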
Hello, I'm debugging the same issue. Since I'm working on VLMs, I found that the inclusion of the vision part (e.g. a `timm` model) leads to drastically slower convergence, but...
Hello, I observe the same except way worse - it diverges when I bump up the number of GPUs. I'm training a BLIP-2 model from scratch in bf16 precision, stage 2. You...
The problem still persists:
PyCharm 2016.1.4, Build #PY-145.1504, built on May 25, 2016
JRE: 1.8.0_76-release-b198 x86_64
JVM: OpenJDK 64-Bit Server VM by JetBrains s.r.o
Yes, I do actually have `HFModel.from_pretrained()`! Could you please elaborate on why it's unavoidable? From the perspective of a Lightning user, it does look like a regression in the behavior of 2.2:...
Ok, thank you for the explanation! Totally makes sense! (although I personally prefer `requires_grad=False` for frozen bits). The last question - does it mean that with stage 2, it...
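For context, the "frozen bits" pattern mentioned above is usually just setting `requires_grad=False` on the parameters of the submodule you want to keep fixed. A minimal sketch (the two-layer model here is only an illustration, not the model from the thread):

```python
import torch
from torch import nn

def freeze(module: nn.Module) -> None:
    # Exclude this submodule's parameters from gradient computation,
    # so the optimizer never updates them.
    for p in module.parameters():
        p.requires_grad = False

model = nn.Sequential(nn.Linear(4, 4), nn.Linear(4, 2))
freeze(model[0])  # freeze the first layer only

trainable = [n for n, p in model.named_parameters() if p.requires_grad]
print(trainable)  # ['1.weight', '1.bias']
```

Note that with sharded strategies like DeepSpeed, how frozen parameters interact with partitioning is a separate question, which is what the stage-2 follow-up above is asking about.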
The workaround is actually `strategy=pl.strategies.DeepSpeedStrategy(..., logging_batch_size_per_gpu=batch_size)`. Seems like this was broken in this commit: https://github.com/Lightning-AI/pytorch-lightning/commit/3518f9e09284099ddf623fe5ba9025a78b32397f I actually kinda agree that it's better to have a hard crash than a warning. But maybe...
Thanks, that is indeed in the docs, so it's not a bug, unfortunately. @martindurant What do you mean it's not supported by pandas? After parquet is read (with arrow by default),...
Sounds good! So what do you think about hard-crashing instead of silently ignoring data in this case? I don't think ignoring a specific column would fly in production systems.