Mayank Mishra
But I am not sure why this is happening only with SwiGLU
`from transformers.utils import WEIGHTS_NAME, WEIGHTS_INDEX_NAME, cached_path, hf_bucket_url` fails with `ImportError: cannot import name 'cached_path' from 'transformers.utils'` on transformers 4.23.1 :(

```python
path = snapshot_download(
    repo_id=model_name,
    allow_patterns=["*"],
    local_files_only=is_offline_mode(),
    cache_dir=os.getenv("TRANSFORMERS_CACHE", None),
)
```
...
I have tested this, @mrwyattii, and it works fine. One thing to note: earlier I had to pass the path as TRANSFORMERS_CACHE/models-bigscience-bloom, and now it is just TRANSFORMERS_CACHE....
I would say that after a few versions we can drop support for older transformers, maybe? I don't really think it's needed, since there are only a handful of...
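Until that support is dropped, one hedged way to keep both code paths behind a single switch is a version gate. A minimal sketch, assuming a 4.22 cutoff for when `cached_path` disappeared (that exact version is an illustrative assumption, not something confirmed in this thread):

```python
def pick_download_api(version: str) -> str:
    """Choose the weights-download API for a given transformers version.

    The (4, 22) cutoff is an assumption for illustration; the point is
    that the legacy cached_path branch lives in exactly one place, so
    dropping old-transformers support later is a one-line deletion.
    """
    major, minor = (int(part) for part in version.split(".")[:2])
    if (major, minor) >= (4, 22):
        # newer releases: use huggingface_hub.snapshot_download
        return "snapshot_download"
    # older releases: the legacy cached_path / hf_bucket_url pair
    return "cached_path"
```

For example, `pick_download_api("4.23.1")` selects the `snapshot_download` path used in the snippet above.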
Can we merge this?
I converted my server to Flask and ran it with Gunicorn with 1 worker. However, this serializes all requests.
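A minimal sketch of that setup; the `/generate` route and payload shape are hypothetical stand-ins, not the actual server:

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/generate", methods=["POST"])
def generate():
    payload = request.get_json()
    # stand-in for the real model call; with a single worker
    # (gunicorn -w 1 app:app), requests to this handler are processed
    # one at a time, which is what serializes all traffic
    return jsonify({"echo": payload})
```

Running `gunicorn -w 1 app:app` gives exactly one process handling requests, so concurrent clients queue behind each other.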
This is not possible. But you might want to take a look at QLoRA paper: https://github.com/artidoro/qlora
Hey, ds-inference is also doing world_size streams. However, accelerate is only doing 1 stream, since we are just using the naive pipeline parallelism capability from accelerate. A more efficient approach for...
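As a toy, framework-free illustration of the naive approach (the stage functions below are stand-ins for model shards, not real kernels): a request flows through the stages strictly in order, so only one "device" is busy at a time, i.e. a single stream of work.

```python
def naive_pipeline(stages, x):
    # naive pipeline parallelism: each stage runs only after the
    # previous one finishes, so there is exactly one stream of work
    # and the other "devices" sit idle
    for stage in stages:
        x = stage(x)
    return x

# three toy "shards" standing in for model partitions
stages = [lambda v: v + 1, lambda v: v * 2, lambda v: v - 3]
```

Overlapping multiple in-flight requests across the stages (one stream per rank, as ds-inference does) is what keeps all devices busy.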
Try running in bf16 instead of fp32. Also, you can look at ONNX/TensorRT.
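The main win from bf16 is halving weight memory and bandwidth. A back-of-envelope sketch; the 176B parameter count is just BLOOM (mentioned earlier in this thread) used as an example:

```python
def weights_gib(n_params: float, bytes_per_param: int) -> float:
    # raw weight storage only; activations and KV cache are extra
    return n_params * bytes_per_param / 1024**3

n_params = 176e9                   # e.g. BLOOM-176B
fp32 = weights_gib(n_params, 4)    # ~656 GiB at 4 bytes/param
bf16 = weights_gib(n_params, 2)    # ~328 GiB, half the footprint
```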
8x 40G A100s should be enough for PEFT training of FLAN. Can you tell me which backend you are using? Are you not using DeepSpeed?
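The reason 8x 40G is plausible for PEFT is that LoRA-style adapters train only a tiny fraction of the parameters, so gradient and optimizer state stay small while the base model sits frozen. A rough count per adapted weight matrix; the hidden size and rank below are illustrative assumptions, not FLAN's actual config:

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    # LoRA factors a d_in x d_out update into A (d_in x r) and B (r x d_out)
    return rank * (d_in + d_out)

full = 4096 * 4096                               # dense weight: ~16.8M params
adapter = lora_trainable_params(4096, 4096, 8)   # 65,536 params at rank 8
# adapter / full ≈ 0.4%: only this slice needs gradients and Adam state
```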