Zach Mueller

Results 368 comments of Zach Mueller

Nope, you do not. That is also extremely valid (and why the non-yaml option exists, for situations where we need to wrap/call it separately and a yaml makes it complicated)

Hi all, we finally narrowed down the two sources of leakage in the implementation that we could improve. #2089 will fix this, reducing your memory by a _significant amount_. For...

@maxidl can you share your modified code? Curious what those exceptions are that exist for "no good reason"

Thanks @maxidl, as an approach here's what the team has decided we will do: 1. I'll put a PR in today that lets you *explicitly disable* the blocking behavior, and...

What kind of gpu setup are you using?

@DragonDRLI can you try specifying "gpu_ids" as "all" in your config? Check `vim ~/.cache/huggingface/accelerate/default_config.yaml` and do:
```
gpu_ids: all
```
(Notice no quotes)

@DragonDRLI can you try perhaps upgrading your torch version? (Doubtful, but having some issues recreating this). E.g.: `pip install light-the-torch; ltt install torch torchvision -U`

As Sylvain says, it's your dataset that's the issue. I would recommend ensuring that there are enough samples for at least 1 full batch across all your GPUs (so if...
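
For anyone wanting to sanity-check this before launching, here is a minimal sketch of the "at least 1 full batch across all GPUs" condition. It assumes each process draws its own batch of `per_device_batch_size` samples; the function name and parameters are hypothetical and not part of Accelerate.

```python
def has_full_global_batch(
    num_samples: int, per_device_batch_size: int, num_processes: int
) -> bool:
    """Return True if the dataset can fill one batch on every process.

    Hypothetical helper for illustration only; not an Accelerate API.
    Assumes each process consumes `per_device_batch_size` samples per step.
    """
    if per_device_batch_size <= 0 or num_processes <= 0:
        raise ValueError("batch size and process count must be positive")
    # One full step needs a batch for every process at the same time.
    return num_samples >= per_device_batch_size * num_processes
```

For example, with 8 GPUs and a per-device batch size of 8, the dataset would need at least 64 samples to pass this check.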

@efsotr during my tests I'm able to have it all work properly, however you'll need to specify a new port in your config to launch on, which may stem your...