Ben Shoham Ofir
@JunXue-tech Did you solve the problem? If so, could you please share how?
@znsoftm How did you solve it?
@fancyerii Right. But how can I use ZeRO-3 and accelerate at the same time? My model is large (70B), so I need to split its weights across multiple GPUs.
> > accelerate > > use accelerate config. you can take my configs below as a reference. I have two nodes with 16 total gpus. > > master node: >...
@fancyerii Without device_map and device? Did you run it with accelerate launch?
> Seconding this for both 120B and 20B, since their instruction-following is good but coding is usually subpar

Hi, GPT-OSS shows interesting performance on an alternative function-calling benchmark (Tau-Bench), so...
@muellerzr
```
Traceback (most recent call last):
  File "/home/benshoho/projects/others/temp/hf_qa_zero_shot_pipeline.py", line 153, in <module>
    outputs = model(**inputs)
  File "/home/benshoho/.conda/envs/accelerate_venv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/benshoho/.conda/envs/accelerate_venv/lib/python3.9/site-packages/accelerate/hooks.py", line 166, in new_forward...
```
@SunMarc Hi, I tried both and had the same error. Full logs for `python demo.py`:
```
ssh://benshoho@:22/home/benshoho/.conda/envs/accelerate_venv/bin/python -u /home/benshoho/projects/others/temp/demo.py
Loading checkpoint shards: 100%|██████████████████| 3/3 [00:22= -sizes[i] && index < sizes[i]...
```
@muellerzr The output of `model.hf_device_map` is:
```
{'model.embed_tokens': 0, 'model.layers.0': 0, 'model.layers.1': 0, 'model.layers.2': 0, 'model.layers.3': 0, 'model.layers.4': 0, 'model.layers.5': 0, 'model.layers.6': 0, 'model.layers.7': 0, 'model.layers.8': 0, 'model.layers.9': 0, 'model.layers.10':...
```
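A long device map like the one above can be easier to read when grouped by device. The following is only a minimal sketch: `group_by_device` is a hypothetical helper, not part of accelerate or transformers, and the sample dictionary reuses the first few entries shown above plus two assumed entries on a second GPU, since the original output is truncated.

```python
from collections import defaultdict


def group_by_device(device_map):
    """Return {device: [module names]} for a mapping of module name -> device."""
    grouped = defaultdict(list)
    for module_name, device in device_map.items():
        grouped[device].append(module_name)
    return dict(grouped)


# Illustrative map. The first three entries match the output above;
# the last two are assumptions added only to show a second device.
sample_map = {
    "model.embed_tokens": 0,
    "model.layers.0": 0,
    "model.layers.1": 0,
    "model.layers.31": 1,  # hypothetical entry
    "lm_head": 1,          # hypothetical entry
}

for device, modules in group_by_device(sample_map).items():
    print(f"device {device}: {len(modules)} modules ({modules[0]} .. {modules[-1]})")
```

With a real model, the same helper could be called on `model.hf_device_map` to check which layers sit on which GPU.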
@SunMarc Interesting. With only one GPU this code works for me too, but not with multiple GPUs. Can you please share your environment details (`pip freeze` or something similar)? Thank you!
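For comparing environments, a short report of the relevant package versions is often enough. This is a sketch under assumptions: the package list (`torch`, `transformers`, `accelerate`) is a guess at what matters for this issue, and `collect_versions` and `environment_report` are hypothetical helper names. It uses only the standard library.

```python
import platform
from importlib import metadata


def collect_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def environment_report(packages=("torch", "transformers", "accelerate")):
    """Build a plain-text summary of the interpreter, platform, and package versions."""
    lines = [
        f"python: {platform.python_version()}",
        f"platform: {platform.platform()}",
    ]
    for name, version in collect_versions(packages).items():
        lines.append(f"{name}: {version or 'not installed'}")
    return "\n".join(lines)


print(environment_report())
```

The output can be pasted into a comment alongside, or instead of, a full `pip freeze`.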